Just The Facts

I’ve spent the last few days covering dimension tables, but I should briefly mention that there are three types of fact tables as well. The first is the kind most will be familiar with, a transactional fact.

These are straightforward, every time an event occurs in your transaction system, a copy of that event is made in the fact table. Someone sells a product, or updates an inventory record, or some similar event occurs it’s reflected in your transaction fact table. Predominately your warehouse will be made up of these types of tables.

Often times your users will want to be quickly able to reference data that’s aggregated over time; sales to date, sales per week, and so forth. To make getting to these totals fast, the second and third types of fact tables are known as snapshot tables. In addition to speeding the return of data to the user, it also reduces the load on the server as totals are aggregated at one time instead of each time the user runs a report.

The first of the snapshots (which is also the second type of fact table) is called a periodic snapshot fact table. At some defined point in time, a job will run that will aggregate the transactional fact data and place it into the periodic snapshot fact table. Perhaps every Saturday night at midnight a job runs which tallies up the previous weeks sales into a sales for the week fact table. Another classic example is the month ending totals after accounting closes out the business month.

The other type of snapshot, and the third type of fact table, is an accumulating snapshot. An accumulating snapshot is constantly being updated, instead of waiting for some point in time. Perhaps you have a trigger on the transaction sales table. Whenever a sale is made, that trigger updates the sales to date table.

Accumulating snapshots have the advantage of always being up to date and accurate. If you have a lot of them, or the transaction tables that they are based on have frequent updates, they can become a major drain on your system resources. For that reason it’s best to limit, to use them with transaction tables that are seldom updated. Let’s take two examples to help illustrate.

In an average car dealership, I’d guess that they might sell four cars a day. Since the transaction volume is so low, four a day, an accumulating snapshot would be entirely reasonable to track sales to date.

On the other hand, let’s look at a large site like woot.com, that deals with hundreds if not thousands of sales in a very short time span. In their case, an accumulating snapshot would be a major drain on system resources. Instead they would be much better served by a periodic snapshot.

A periodic snapshot, by the way, isn’t limited to a once a week or once a month situation, even though that’s how they are primarily used. You could update your periodic snapshots once a day, once an hour, or even once every 10 minutes if your server could support it. The “periodic” simply means that it runs on a regular schedule rather than constantly.

In addition to your transactional fact data, you should review your reporting needs to see if either of the two types of snapshots, periodic or accumulating, could be of use in your warehouse to speed data reporting and reduce the load on your servers.

Advertisement

One thought on “Just The Facts

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s