The Complete History of Everything

So let’s say you’ve got a database. It holds your recipes or bird-watching notes or juggling instructions or client’s bank details or the solubility of different types of gravel.

You’ve put this in a database because you’re obviously fascinated in whatever this thing is, and you need to look it up in less than a thousandth of a second at a moment’s notice, with what most people would consider autistic levels of precision.

Wouldn’t it be great to view the history of all the changes to that database ? That’d be at least twice as exciting as the raw data itself. 2.5x as exciting. Maybe even 3x.

Well now[1] you can.

Most databases allow you to set triggers so that operations on a database[2] will execute code on the server.

I’m a Java guy, so I’d rather not implement application logic in PL/SQL, T/SQL, or whatever they call it in whatever vendor’s database system you use. I consider things that ‘magically’ happen as a side-effect of something previously obvious as anathema to being able to manage a functioning system, which is essentially all I do all day. So anything that takes an UPDATE and magically converts every surname containing an apostrophe to something that doesn’t contain an apostrophe[3] can go burn in hellfire, but if you’re keeping history tables up-to-date, then that’s just fine with me.

A history table has pretty much the same structure as the table you’re keeping a history of, with at least three extra columns:

  • Trigger ID
  • Trigger timestamp
  • Action

The trigger ID is populated by a sequence generator and will be used as the primary key of the table. The trigger timestamp is the time an action was performed, and the action is a CHAR(1) containing what type of action was performed (‘D’=DELETE, ‘U’=UPDATE, and ‘I’=INSERT). You’re not going to audit SELECTs on the table because you’ll run out of disk space, unless you’re the government.

So let’s say you’ve got a table of active users (table A):

tblUser

ID Name Enabled Administrator
1 King Tut Y Y
2 Prince Admonish Y N
3 Princess Chide Y N
4 High Priest Kevin Y N

and you make a few edits over the course of a fascinating day of system administration, so that it looks like this (table B):

tblUser

ID Name Enabled Administrator
1 King Tut Y Y
2 Prince Admonish N N
3 Princess Chide Y N
5 Trevor the stonemason Y N

Your history table would then list which actions were performed to get you from table A to table B, and might look like this:

tblUserHistory

Trigger
ID
Trigger
Time
Trigger
Action
ID Name Enabled Administrator
101 2020-06-06 11:26:00 D 4 High Priest Kevin Y N
102 2020-06-06 13:34:00 U 2 Prince Admonish N N
103 2020-06-06 13:45:00 I 5 Trevor the stonemason Y N

Which gives you sufficient information to unwind these operations back to various points in time, which will be handy when the King of the Hittites or some pale-faced teenager decides to wipe the thing and replace your data with ‘HAXXORS RULEZZ’ or something.

If you want to add a couple more columns, you could even allow your web application to have an undo function, which will provide your responsive cloud-hosted enterprise management rules engine with the same level of functionality that MS Paint had in 1996.

Anyway. Here’s some code that you can point at a database, and it’ll create these history tables for you, and the triggers to populate it[4]. Bam. Instant data warehousing. Keep it running for a week, see which numbers change most frequently, and you can say you’re using Big Data and dynamically responding to customer expectations or something.

You’ll note if you run this that there’s actually two columns created for each column in the source row; one that contains the actual data and a bitfield which is set to 1 if the value has changed. You could probably put in delta encoding for CLOBs or something else if you really felt like it.

Run it by doing this on a command-line.

java -jar historytable-cli-with-dependencies.jar
--jdbc jdbc:mysql://localhost/datatype-dev
--username root --password abc123

And then change a few values in some arbitrary tables and watch history being created.

The source

So I’ve finally got with the program and started using git, so you can now find the source on github here:

There’s should be a jar there which includes all the sub-dependencies.

[1] And by now, I mean any time from about 20 years ago or so. 25 if you include the time this has been sitting in my draft folder in WordPress.
[2] SELECT, UPDATE, INSERT, DELETE for people who use SQL, and hand-waving bullshit for people who use other RDBMS systems
[3] So that you can keep your SQL-injection vectors ticking over
[4] Linking this to a table of UserActions, and maintaining these tables and triggers as the database evolves I will leave as an Exercise to the Reader.

Add a Comment

Your email address will not be published. Required fields are marked *