Automating Personal Finance Monitoring, Part 1

Published in PHP, Web Scraping on May 14, 2021

Anyone who's known me a while knows that I'm not really happy with the status quo when it comes to accessibility of personal financial information.

I've often eschewed services like Mint in the past: while their interfaces may be helpful when it comes to examining trends or summarizing your overall financial position, I don't find them so useful for day-to-day management of finances or working toward financial goals.

There are many things that I like about Ledger, but its accessibility to those not comfortable with a shell (e.g. my wife) is limited, and it requires a substantial amount of data entry to be optimally useful.

This data already exists within the financial institutions that I patronize; I just need a way of getting it out. Services like Plaid are good in concept, but I haven't found many solutions in that space, and the few I have seen have experienced relatively little adoption -- meaning that the number of financial institutions they work with is limited -- or generally target businesses rather than individual consumers.

In theory, I could log into each of my accounts every day, download CSV dumps or copy and paste recent transactions from each one, compile them into a table, and reconcile them. However, that would be excessively time-consuming, tedious, error-prone, and pretty dull work. I wanted a better solution. So, I set about writing one.

Approach

The idea of automating the manual process I described above seemed a relatively solid way to approach the problem.

My cursory examination of the sites with which I'd be interacting suggested that using a low-level HTTP client like curl wasn't going to be sufficient for my use case: basic interactions with these sites involved a lot of complex client-side code and HTTP requests that I didn't particularly want to pull apart and try to reproduce reliably.

The alternative, automating an actual browser, amounts to doing what I or your average user of these sites would normally do manually, and thus seemed like a better way to go.

Stack

I started by installing symfony/panther. If you have my book, you can find out more about this at the end of chapter 14. I supplemented this with symfony/css-selector to allow me to filter DOM nodes using CSS selectors, which tend to be more concise than XPath expressions; see chapter 13 and the start of chapter 14 for more details.

I also followed the recommendation of installing dbrekelmans/bdi to handle downloading and installing chromedriver and set it up as a Composer post-install-cmd script.

/* composer.json */
{
    "require": {
        "dbrekelmans/bdi": "^0.3.0",
        "symfony/css-selector": "^5.2",
        "symfony/panther": "^1.0"
    },
    "scripts": {
        "post-install-cmd": "bdi driver:chromedriver drivers"
    }
}

Finally, I started writing code to log into one of the sites I would be extracting data from... and hit my first blocker.
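
Stripped down, that login code looked something like the sketch below. The URL, button text, and field names are hypothetical stand-ins (every institution's login page differs), and credentials come from environment variables:

/* login.php (sketch; URL, button text, and field names are made up) */

use Symfony\Component\Panther\Client;

require __DIR__.'/vendor/autoload.php';

// Start a Chrome instance managed by chromedriver
$client = Client::createChromeClient();

// Load the login page and fill in credentials from the environment
$crawler = $client->request('GET', 'https://bank.example.com/login');
$form = $crawler->selectButton('Log In')->form([
    'username' => $_SERVER['BANK_USERNAME'],
    'password' => $_SERVER['BANK_PASSWORD'],
]);
$client->submit($form);

// Wait for a post-login element to confirm the session is established
$client->waitFor('.account-summary');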

Impersonation

The behavior of the target sites changed when I tried to use an automated browser rather than a manual one: login requests would fail.

I used Chrome Developer Tools to download a request sequence as a HAR file. Combing through it, I noticed a reference to webdriver in one of the request payloads.

Additional digging turned up a related requirement in the WebDriver standard and a work-around for chromedriver (bless Stack Overflow), which I incorporated into my project.

/* composer.json */
{
    "scripts": {
        "post-install-cmd": "./scripts/post-install-cmd.sh"
    }
}
#!/bin/bash
# ./scripts/post-install-cmd.sh

# Download and install chromedriver as before
./vendor/bin/bdi driver:chromedriver drivers

# Rename the "cdc_" marker that chromedriver embeds in the JavaScript it
# injects, which sites check for to detect automated browsers
sed -i 's/cdc_/dog_/g' drivers/chromedriver

This allowed me to get past the web application firewalls of most sites. However, there were a few for which this wasn't sufficient. More research turned up another measure (again, bless Stack Overflow). This one was simpler: it involved passing a CLI flag to Chrome when launching it via chromedriver.

$client = \Symfony\Component\Panther\Client::createChromeClient(
  null, // use the default chromedriver binary
  // Keep Blink from exposing the navigator.webdriver automation signal
  ['--disable-blink-features=AutomationControlled']
);

Apparently there are numerous such ways of fingerprinting an automated browser and, depending on the site, it can be difficult or next to impossible to find the right ones. Guess I just got lucky. Well, mostly...

Authentication

Once I got past the login form, I had to contend with other authentication measures.

One common measure was security questions. These are typically few in number and configured ahead of time through account settings. It was easy enough to detect them, cross-reference the presented question against a pre-programmed set of questions and corresponding answers, and submit the matching answer.
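
In rough terms, that handling looked like the sketch below; the selector, button text, field name, and answers are all hypothetical, and $client and $crawler carry over from the login sketch earlier:

// Pre-programmed answers keyed by the question text shown on the page
$answers = [
    'What was the name of your first pet?' => 'Fido',
    'In what city were you born?'          => 'Springfield',
];

// Detect a security question prompt
if ($crawler->filter('.security-question')->count() > 0) {
    $question = trim($crawler->filter('.security-question')->text());

    if (!isset($answers[$question])) {
        throw new RuntimeException(sprintf('No answer on file for "%s"', $question));
    }

    // Submit the matching answer
    $form = $crawler->selectButton('Continue')->form(['answer' => $answers[$question]]);
    $crawler = $client->submit($form);
}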

The next measure was more complicated: two-factor authentication, or 2FA. Some sites offer e-mail as an option for 2FA, but many only offer SMS, so I had to figure out how to handle both cases.

For e-mail, I wanted messages used for 2FA to be as short-lived as possible: my script would log in to the relevant web site and initiate 2FA, then check an e-mail account, locate the message, extract the 2FA code from it, and delete the message.

At first, I tried using Panther to access Gmail via its web UI. Unfortunately, Google's fingerprinting is fairly aggressive, and I wasn't able to figure out a quick way to bypass their login form.

Fortunately, I found a different way: a new Gmail account with "Less secure app access" (air quotes intentional) turned on that I would access using the imap extension together with the ddeboer/imap library. The latter made it possible to access messages with an object-oriented API and very little code.
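
Here's a rough sketch of that flow; the Gmail credentials come from environment variables, and the sender domain and six-digit code format are assumptions that would vary per institution:

/* check-2fa-email.php (sketch; sender domain and code format are assumed) */

use Ddeboer\Imap\Server;

require __DIR__.'/vendor/autoload.php';

// Connect to the dedicated Gmail account over IMAP
$server = new Server('imap.gmail.com');
$connection = $server->authenticate($_SERVER['GMAIL_USERNAME'], $_SERVER['GMAIL_PASSWORD']);
$mailbox = $connection->getMailbox('INBOX');

$code = null;
foreach ($mailbox->getMessages() as $message) {
    // Skip anything not sent by the institution
    if (false === strpos($message->getFrom()->getAddress(), 'bank.example.com')) {
        continue;
    }

    // Assume a six-digit code somewhere in the message body
    if (preg_match('/\b(\d{6})\b/', $message->getBodyText(), $matches)) {
        $code = $matches[1];
        $message->delete(); // keep 2FA messages short-lived
        break;
    }
}

// Permanently remove any deleted messages
$connection->expunge();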

This took care of the e-mail case, but I didn't know of an easy way to have my script automatically check text messages. It's possible I could've used something like Twilio for this, but that seemed like overkill. I also didn't want to lose the ability to receive 2FA text messages on my phone when I was trying to log into my account manually.

2FA messages typically come from a dedicated number or have common identifying text in the content. The solution I ended up going with was using an app similar to this one to automatically forward SMS messages meeting specific criteria to my new Gmail account, which my script would then access just as it did for institutions using e-mail for 2FA.

Alerts

I can get away with only checking some accounts once per day. However, many financial institutions offer e-mail alerts for specific conditions, such as withdrawals, deposits, or transfers above a certain amount. This is useful for knowing when accounts need to be checked for changes.

In cases where multiple e-mail addresses were supported, I set up alerts using my new e-mail account. In cases where only a single or primary e-mail address was supported for alerts, I set up filters and forwarding to send specific alerts to the new account.

At that point, I was able to write a script that checks the new e-mail account every few minutes, identifies any new messages triggered by alerts along with the institutions that sent them, and checks those institutions for changes to balances or recent activity.
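
A sketch of that dispatch is below, reusing $mailbox and $connection from the earlier snippet; the sender addresses and checker functions are hypothetical placeholders for per-institution scraping code:

// Map alert senders to institution-specific checks; run every few minutes,
// e.g. via cron
$checkers = [
    'alerts@bank.example.com'   => 'checkBankExample',
    'notify@broker.example.com' => 'checkBrokerExample',
];

foreach ($mailbox->getMessages() as $message) {
    $sender = $message->getFrom()->getAddress();

    if (isset($checkers[$sender])) {
        // Log into the institution and pull balances and recent activity
        $checkers[$sender]();
        $message->delete();
    }
}

$connection->expunge();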

Legwork

With the more challenging aspects behind me, all that remained was writing code to navigate through web interfaces; bypass any prompts, modals, or special offers; and extract and parse current account balances and dates, descriptions, amounts, and (where available) running balances from recent activity postings.
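
To give a flavor of that parsing, here's a sketch for one hypothetical institution's recent-activity table; the selectors, date format, and amount format are made up and vary from site to site, with $crawler holding a logged-in session:

// Turn recent activity rows into structured records
$transactions = $crawler->filter('table.recent-activity tbody tr')->each(
    static function ($row): array {
        return [
            'date'        => new DateTimeImmutable($row->filter('td.date')->text()),
            'description' => trim($row->filter('td.description')->text()),
            'amount'      => (float) str_replace(['$', ','], '', $row->filter('td.amount')->text()),
        ];
    }
);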

I still need to write code to store this data and reconcile it, but that's a subject for another post. Hope you enjoyed hearing about my experiences with this project. Thanks for reading!