In an earlier post Time After Time: Temporal Modeling for Sql Audits and Historic Views I discussed the concept of Valid-Time tables. This is a powerful technique that can be used for many different design goals. To further clarify the idea, here is a very basic example demonstrating the use of an Valid-Time audit history table to track all modifications to the Orders table from the famous Microsoft Sql Server Northwind sample database.
Note that as of SQL Server 2016 this technique is natively supported as ANSI SQL 2011 Temporal Tables [microsoft.com].
The simplest design of the Valid Time table for auditing Orders will contain all the fields in the base Orders table plus two extra fields. One datetime field is required to indicate when the other information in the record started being valid and another datetime field is required to indicate when this information stopped being valid. For this example the new table audit_Orders and these two extra fields audit_StartDateTime and audit_EndDateTime. All other fields in the audit table should match the base table in name, size and datatype. Identity fields in Microsoft SQL Server must also lose their autoincrement property and be converted back to their base data type. In our example this means the OrderID must be converted to a regular 4 byte integer field with no identity set. Although all the fields in the Orders table can be audited, Memo (text) or binary fields are not fully supported in Microsoft SQL Server triggers and must be excluded. Regular varchar fields in older versions of Microsoft SQL Server fields can hold up to 8000 characters and with the new varchar(max) datatype this is not a significant restriction for most transactional processing.
Out of the box, the Northwind Orders table has this structure:
Field Name | Size | Type |
---|---|---|
OrderID | 4 | Identity |
CustomerID | 5 | WChar |
EmployeeID | 4 | Integer |
OrderDate | 16 | DBTimeStamp |
RequiredDate | 16 | DBTimeStamp |
ShippedDate | 16 | DBTimeStamp |
ShipVia | 4 | Integer |
Freight | 8 | Currency |
ShipName | 40 | VarWChar |
ShipAddress | 60 | VarWChar |
ShipCity | 15 | VarWChar |
ShipRegion | 15 | VarWChar |
ShipPostalCode | 10 | VarWChar |
ShipCountry | 15 | VarWChar |
For the purposes of this example, in addition to the required Start and End fields let’s add fields for the user, application, and hostname associated with each of these two events. Along with these let’s also add a new arbitrary Identity field to aid in ad hoc retrieval. Here then is the structure of the table audit_Orders:
Field Name | Size | Type |
---|---|---|
OrderID | 4 | Integer |
CustomerID | 5 | WChar |
EmployeeID | 4 | Integer |
OrderDate | 16 | DBTimeStamp |
RequiredDate | 16 | DBTimeStamp |
ShippedDate | 16 | DBTimeStamp |
ShipVia | 4 | Integer |
Freight | 8 | Currency |
ShipName | 40 | VarWChar |
ShipAddress | 60 | VarWChar |
ShipCity | 15 | VarWChar |
ShipRegion | 15 | VarWChar |
ShipPostalCode | 10 | VarWChar |
ShipCountry | 15 | VarWChar |
audit_RecordID | 4 | Identity |
audit_StartDateTime | 16 | DBTimeStamp |
audit_EndDateTime | 16 | DBTimeStamp |
audit_StartUserName | 255 | VarChar |
audit_StartAppName | 255 | VarChar |
audit_StartHostName | 255 | VarChar |
audit_EndUserName | 255 | VarChar |
audit_EndAppName | 255 | VarChar |
audit_EndHostName | 255 | VarChar |
Now that OrderID is no longer an Identity field, the natural primary key for this Valid-Time table would be the original primary key (the 4 byte integer OrderID) plus either the audit_DateTimeStart or the audit_DateTimeEnd field. Although logically this key presents no conceptual problems, in implementation there is a potential conflict if updates occur simultaneously as represented by the underlying data type being used. SQL Server resolves Datetime fields to a resolution of about 3.33 milliseconds. Although this issue is unlikely to present itself in practice, this condition should be considered at least for its theoretical implications. For our auditing table purposes enforcing any sort of unique constraint on the table is not imperative, so audit_RecordID will be set as the primary key.
We already have data in our Orders table, so lets dump it into the new audit_Orders table indicating the starting point of the audit. This will be a complete point in time snapshot of the data as it existed before we started auditing it. All we need is the following INSERT INTO statement:
INSERT INTO
audit_Orders
(
OrderID,
CustomerID,
EmployeeID,
OrderDate,
RequiredDate,
ShippedDate,
ShipVia,
Freight,
ShipName,
ShipAddress,
ShipCity,
ShipRegion,
ShipPostalCode,
ShipCountry,
audit_startdatetime,
audit_enddatetime,
audit_startusername,
audit_startappname,
audit_starthostname
)
SELECT
OrderID,
CustomerID,
EmployeeID,
OrderDate,
RequiredDate,
ShippedDate,
ShipVia,
Freight,
ShipName,
ShipAddress,
ShipCity,
ShipRegion,
ShipPostalCode,
ShipCountry,
getdate() AS audit_startdatetime,
'9/9/9999' AS audit_enddatetime,
user_name() AS audit_startusername,
app_name() AS audit_startappname,
host_name() AS audit_starthostname
FROM Orders
This SQL statement is pretty self-explanatory except perhaps for assigning audit_enddatetime the value of ‘9/9/9999’. Obviously this is an arbitrary date to choose for marking the data as “currently valid”. We could code these end dates as Null or unknown and perhaps satisfy some deeper design aesthetic. However in practice using Null in the design would make constructing optimizable queries on the table more difficult. Since SQL Server datetime fields themselves run after December 31, 9999 we will ignore our own reinforcement of this Y10K problem in favor of enabling audit queries that are easy to construct and optimize.