July 5, 1998

Today's issue deals with online information, from banking to more general issues.

Online Banking and Data

I've been doing my accounting online for years. I use a number of different banking programs and services, but mainly to transfer data between my system and the service providers. MS Money and Quicken are good for transferring payment data and "downloading" information from the providers. I also use web-based access, as with American Express, to capture the data for later processing.

I bring all of this into my own database, which has evolved since I first started keeping my checkbook online in 1974. I currently use Microsoft Access for this database.

I manage to do this, and the increasing availability of data online makes managing the information easier than when I had to rekey it each month. But the trends are not in the right direction. While there has been a movement towards protocols for better data transfer between the banks and the user's system, the web offers the banks the opportunity to shift their focus to the user's eyeballs rather than the programs running on the PC.

For example, BankBoston has come out with an update to their weak Homelink program, one that has a glitzy user interface. But the old program was a simple character-oriented program that I could use to simply download the transaction data. The new program brings back the old days of time-sharing systems by giving me a terminal. Now, I must go through a myriad of keystrokes to bring the data into my system. Of course, that's not their intent. They expect me to sit there banking online just like I did with terminals 30 years ago. And just as slowly. As an aside, they also expect me to dial into their "Intranet", an action which breaks the connection to my own IP network and to the Internet itself. They promise a more Internet-friendly version. We'll see.

Another downhill example is American Express. They offered a service on AOL which provided much more data. It wasn't really their intent, since the actual download software was clunky and reduced the data to a subset of what Quicken used. But the text version was parseable and contained the same information as the printed statement, including the city pairs for airline tickets, currency information for international payments, etc. But the new web-based service has removed all this important information. This makes it difficult to manage travel expenses and other complex information.

Another problem with online banking is that it generally forgets the concept of a statement. I'm supposed to compare the printed statement with the information in Money or Quicken and reconcile it myself. This is, of course, pure stupidity and a total waste of time. With American Express and CompuServe's Visa, I do get statement identity which makes it easier to reconcile.

But I still don't have transaction identifiers. One must instead infer the identity of transactions from clues. The most important clue is the amount. It is unlikely I'll have two transactions for $27.96 in a short period of time. Unfortunately, regular payments and those who round off to amounts like $20.00 make this more difficult. It would be so helpful to have identifiers associated with each transaction so I can compare the information from various sources, including transfers between my own accounts.
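To make the guesswork concrete, here is a minimal sketch of matching by amount within a short date window. The function name, the (date, cents) tuple layout, and the five-day window are all assumptions for illustration, not how Money or Quicken actually work.

```python
from datetime import date, timedelta

def match_by_amount(bank, ledger, window_days=5):
    """Pair bank entries with ledger entries using the amount as the main clue.

    Both lists hold (date, amount_in_cents) tuples (a hypothetical layout).
    A bank entry is matched only when exactly one unused ledger entry has the
    same amount within the date window; everything else is left unresolved.
    """
    window = timedelta(days=window_days)
    matches, unresolved = [], []
    used = set()  # indexes of ledger entries already paired
    for b_date, b_amount in bank:
        candidates = [
            i for i, (l_date, l_amount) in enumerate(ledger)
            if i not in used
            and l_amount == b_amount
            and abs(l_date - b_date) <= window
        ]
        if len(candidates) == 1:
            used.add(candidates[0])
            matches.append(((b_date, b_amount), ledger[candidates[0]]))
        else:
            # Zero candidates, or several (the $20.00 problem): a person must decide.
            unresolved.append((b_date, b_amount))
    return matches, unresolved
```

An odd amount like $27.96 pairs up cleanly, while two $20.00 withdrawals in the same week land in the unresolved list. A real transaction identifier would reduce the whole function to a dictionary lookup.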


Which brings us to the more general topic of data. One problem is that I want to be able to use my computer as an intermediary between me and my information source. I want the two to negotiate directly and then I can have my local program present the information to me.

There is some hope if the financial exchange protocols are adopted so that I can have my local program act on my behalf.

There is also hope for XML to serve as a general tagging language to encourage this trend. The vCard and vCalendar formats also offer hope for providing data.

But all too often the information on the web is provided in text or even scanned image formats. For example, the MBTA (Boston's Public Transportation System) provides maps and text schedules rather than data.

This is also painful for event data -- one can't connect one's schedules to the data sources.

Data Synchronization

There's also the closely related problem of synchronizing data from various sources. I also maintain my main address book in Access but need to coordinate with Outlook, my Pilot, my laptop, other machines at home, PlanetAll, my phones and other devices.

But there is no standard way to determine if two entries represent the same data, let alone to synchronize changes. Lotus Notes comes closest. It assigns a unique ID to a data element (at least, relative to a containing data table). Any change adds a UID to the element's change history. If two instances are found, their histories can be compared. If one is a successor to the other, then it dominates. Otherwise there is a synchronization error and the two versions are maintained until someone can manually reconcile them and delete a branch, or the branch can simply be spawned as a separate element. Of course, a deletion is really just another event so that it can be propagated.
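The scheme just described can be sketched in a few lines. This is only an illustration of the idea under my own assumptions, not Lotus Notes' actual implementation: the Record class, its field names, and the use of random UUIDs as change IDs are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Record:
    uid: str                                      # identity of the element; never changes
    data: dict
    history: list = field(default_factory=list)   # one change ID per edit, oldest first

def new_record(data):
    """Create an element with a fresh UID and a one-entry change history."""
    return Record(uuid.uuid4().hex, dict(data), [uuid.uuid4().hex])

def edit(record, **changes):
    """Return a new version whose history extends the old one by one change ID."""
    return Record(record.uid, {**record.data, **changes},
                  record.history + [uuid.uuid4().hex])

def delete(record):
    """A deletion is just another change event, so it propagates like any edit."""
    return edit(record, deleted=True)

def is_successor(a, b):
    """True if a's history starts with all of b's history (a derives from b)."""
    return a.history[:len(b.history)] == b.history

def reconcile(a, b):
    """Return (winner, None), or (None, (a, b)) when the histories have branched."""
    if a.uid != b.uid:
        raise ValueError("different UIDs: these are not the same element")
    if is_successor(a, b):
        return a, None
    if is_successor(b, a):
        return b, None
    return None, (a, b)  # keep both until someone reconciles them by hand
```

If my laptop edits a record and my Pilot still holds the original, reconcile picks the laptop's version without looking at the contents. If both edited it independently, neither history is a prefix of the other and both versions are kept as a conflict.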

This is a very simple but powerful idea. It can be applied at a number of levels so that one can maintain fields or whole records. It can also provide a definition of same -- the UID is the same.

Thus, one needn't resort to complex algorithms to determine if two John Smiths are the same -- just rely on the UID. This doesn't mean that one gives up on such matching, but the problem of determining identity is separate from the actual operations of reconciliation.

The use of UIDs allows elements to be related; thus one doesn't need to create a single complex glob, as Outlook does, in order to store addresses and names. One can keep the elements separate so that, for example, household members can share an address or one person can have multiple addresses.
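As a rough sketch of what keeping the elements separate might look like, people and addresses below are distinct records that refer to each other only by UID. The class names, fields, and helper functions are hypothetical, chosen just to show the shape of the data.

```python
from dataclasses import dataclass, field

@dataclass
class Address:
    uid: str
    street: str
    city: str

@dataclass
class Person:
    uid: str
    name: str
    address_uids: list = field(default_factory=list)  # references, not copies

def addresses_for(person, addresses):
    """Look up a person's addresses in a dict keyed by address UID."""
    return [addresses[uid] for uid in person.address_uids]

def residents_of(address_uid, people):
    """Everyone who refers to the given address, e.g. members of a household."""
    return [p for p in people if address_uid in p.address_uids]
```

Because a shared address exists once, correcting its street corrects it for every household member, and a person with both a home and an office simply carries two UIDs.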

The use of unique IDs and change histories is very simple, perhaps too simple. There is a tendency to use complex mechanisms when simple solutions can function much better.

This is also starting to get to other uses of unique IDs as intermediaries for other information, such as access control, but that's a topic for another day.


You might have noticed that I put the word "Downloading" in quotes. That's because it is a notion left over from the old days of timesharing. Instead, the model should be shared and synchronized data. I shouldn't really have to download my data from the bank. Instead, we should share ownership and management. In reality, especially in banking, we don't really share ownership but synchronization is still a much better model than downloading.