Friday, July 24, 2009

Audit Trail Logging: Transparently Tracking Data Changes in Persistence Layer

Audit is a process that can apply many fields of an enterprise including information systems. Whether it is internal (Audit Department) or external (Outsourced), audit is important to control people, infrastructure and third party relations. Many business processes are carried out by transaction processing systems with a database backend. To track and spot the processes, we need a change history for every transaction. In enterprise applications, this is possible with data change logging. Especially, ERP applications have a built-in audit trail logging feature.

Trail logging has many benefits:
1- It helps audit personnel to detect fraud. It enables us to detect unauthorized or unapproved data manipulation. It only helps because fraud can be possible by legal and normal records if nobody sufficiently controls the business transactions.

2- It provides data change non-repudiation. That means when a user changes a record and claim he/she didn’t change the record, then its validity is precisely determined via inspecting audit trailer logs. Sometimes users may blame the applications for transaction leaks, this is a very nice tool for application developers to justify the right side.

3- Users also use this facility. When a user notices that his/her inserted records are changed, by looking audit trails he can understand who changed the records.

4- Developers also use it to determine abnormal application behavior. An application may inadvertently alter some records and this can be easily debugged if an audit trail is present.

There are 2 types of audit trail logging technique:

Audit Trailing on Table
You add some meta-columns to your tables for data change information like UserInserted, UserInsertDate, UserUpdated, UserUpdateDate.

Easy to record trail data
Audit data size is small
Easy to fetch trail data with SQL statements

Only one update trail data can be stored
Can’t store changed data
Data and row meta-data(trail data) is merged

Audit Trailing on External Table
An external table may be used to keep change logs. Every successful transaction should be logged to this table. This table may be taken to another database or split to multiple tables to ease management.

Insert and every update can be logged
Changed data can be logged
Separated form the transactional data
By adding a transaction id to logs (Log table has a transaction id column), it is possible to see every change in a transaction (This is a shining feature when debugging transaction boundaries).

Hard to fetch trail data with SQL
Trail data is bigger since PK columns and table names are also stored within every row.

We used second technique within our ORM Framework (This feature is still missing or hard to use in many other ORM frameworks). We implemented it as a transparent service plugged to persistence layer so that developers don’t have to write any log code or SQL. By the nature of a service, it can be closed and opened when necessary without system restart. It works asynchronously so that application doesn’t wait logging process. It is independent from any DBMS vendor. We named it as “Database Logging Service”.

Architecture of database logging service is as following:


Integrated Auditing of ERP Systems

Auditing Information Systems: A Comprehensive Reference Guide

Auditing Information Systems


autrepack said...

About solution #2:
- should not it be blocking by default or, even better, offer the option to be blocking, in order to avoid operations being performed without audit information being logged?
- if you have a lot of rows and updates (1million or more), would not it be worth to save some space by adding another table storing table/columns/PK names? An alternative solution to that can be to remove a part of the content of the audit table(s) by moving it to another db or dump it to offline files.

Ibrahim Levent said...

If you mean synchronous process by saying "blocking", we have 2 features for disadvantages of asynchronous process:
1- Our audit log API reads "to-be logged cache" when audit log is tried to read. Sometimes, log is read just after the save operation and asynchronous process may not be finished at that time.
2- Asynchronous logging runs if and only if transaction succeeds.

For bulk update, we generally use batch SQL in applications, and when doing so only one row audit log is saved. Otherwise performance of applications degrades a lot. Yeah, splitting audit table and moving it to another database may be considered for better space management.

Haroon idrees said...

Good article
I have one question regarding erps.some erp provides a feature that changes will incorporate after manager/supervisor approval on entities.if we update any record it will incorporation with functional transaction/report after approval authority.Can we get handle this case with audit trail.

Elhoim said...

About #1, what i mean is to offer the option of being synchronous in order to be wrapped in a transaction, for cases where it is needed to have both the log and the change committed or neither.

About #2, what i meant is that for the case of "audit trail on external table", if there are millions of audit lines, would not it be worth (space-wise) to use another table storing the table(s) name(s) for the audit rows instead of having it stored with each row.
Or is it just slowing everyting to add that level of indirection?

Ibrahim Levent said...

"Change commit after manager approval" could be implemented with audit detailed trails(if audit trail contains column changes). One another usage may be cancelling the current process, user could rollback last transaction.

Unknown said...

I like to use pub-sub messaging for this...

Related post