Capturing who saw what in SQL Server

It’s important sometimes to know and have evidence of who-saw-what on specific SQL Server instances and databases. This can be a requirement you have to provide (e.g. to comply with HIPAA regulations) and there are several ways to ensure the capture of that information

In order to capture who saw what in SQL Server, it is important to know what was executed on the server. Even though ApexSQL Log (a transaction log reader) and ApexSQL Trigger (trigger-based auditing tool) provide partial information, in order to get the full details on who saw what, ApexSQL Audit is the perfect tool for the job.

First let’s take a look how to manually capture this information. There are several ways to do that – one is through the use of specially developed stored procedures and functions. A database can be created so that all the access is done through stored procedures (allowing EXEC statements only, while prohibiting all queries and DML operations). The stored procedure returns the results from a query while at the same time logging the access in a specialized table along with additional information of interest (e.g. time of execution) if applicable

Similarly this can be done for DML operations using stored procedures, but these are more elegantly handled via triggers, which can be automatically created by a product like ApexSQL Trigger

Anyway, these methods are fragile as they require disciplined programming ensuring that all code execution paths are covered and more importantly, they don’t provide a solution for ad hoc queries executed by trusted parties

SQL Server itself also offers a way to collect and monitor a wide range of events via SQL Server traces. Events related to the execution of queries, changes made to permissions, failed login attempts and other similar data are among them. SQL Server Profiler is an application that monitors all such events, but this application on its own is not an out-of-the-box solution for auditing. There are complex obstacles in the way of using it in an easy manner: everything has to be done manually, figuring out and capturing events, exporting and presenting the data, etc. and it still doesn’t solve the problem of trusted parties

How we capture who-saw-what data

ApexSQL Audit was developed to automate capturing, among other things, of who-saw-what data, and make compliance with auditing regulations easy. Tool is based on leveraging the SQL Server traces, the same technology used by SQL Server Profile, but, on top of it, the fault tolerance for a wide variety of scenarios is provided, as is additional processing of captured data, parsing for referenced objects, automatic transfers and loading into centralized repository, tamper-evident data storage and so on

ApexSQL Audit captures the following operations (there are more than 230 operations recognized by our parser) divided into several categories:

  • DDL – CREATE, ALTER and DROP statements for all object types as well as rename and other similar operations
  • DML – INSERT, INSERT BULK, SELECT INTO, WRITETEXT, UPDATE, MERGE, UPDATETEXT, DELETE and TRUNCATE TABLE
  • Before – after – full audit trail including data as it existed before and after a change
  • Query – SELECT, READTEXT and output parameters from EXEC
  • Execute – calls to extended stored procedures, stored procedures, CLR procedures and functions
  • Error – all operations that are issued as errors by SQL Server
  • Warning – all operations that are issued as warnings by SQL Server
  • Security – all operations that affect security of a SQL Server or one of its databases

The central ApexSQL Audit instance configuration dialog uses these categories to filter specific operations, and these can then be filtered even further on the object level for tables, views and other objects. Even more filtering, of course, is provided within the ApexSQL Audit’s reporting system

During the parsing process, we distinguish the difference between operations modifying or accessing objects. For example, the following statement will be captured, parsed and logged both in the DML and Query categories:

INSERT INTO Table1
VALUES(x1, x2); SELECT (f1, f2) from Table2

The INSERT modifies Table1 while the SELECT accesses Table2. So this information is saved separately for later use in the reporting system, explicitly showing which object was modified and which one was accessed

The captured data may originate from the server hosting the central ApexSQL Audit instance or from other servers monitored by distributed ApexSQL Audit instances. In case of remote servers, the captured data has to be transferred automatically and safely to the central instance which then automatically loads it into ApexSQL Audit’s Central Repository Database. Transfers are initiated by the central instance contacting distributed instances and requesting the packages with captured data. If they exist, they are transferred via network and stored to a local folder of the central instance and from there read and data in them stored into the central repository database by the central instance itself

The transfer operation is repeated in an endless loop – transfer requests are constant while packages with captured data exist on the distributed instance; otherwise, if there are no available packages, the requests are resent every 3 seconds

To avoid dependency on the security level of the connection between the central and distributed instances, we use hash checks to validate the success of the transfer. If anything goes wrong, for example during the transfer or while the data is being saved, a new request will be issued for the same remote data package. The remote package file on the distributed instance is not deleted until it is successfully transferred which the central instance has to explicitly confirm

In case of multiple distributed instances, multiple requests are sent at the same time accordingly, optimizing performance by enabling faster transfer – slow transfer performance of one distributed instance won’t affect other transfers. Also, remote packages are compressed during the transfer to additionally improve performance. Currently, the maximum number of simultaneous connections is 16

To summarize, ApexSQL Audit relies on SQL Server traces. It configures traces based on user configuration set through GUI or uses the default configuration which is a set of common auditing requirements with optimal usage of the Central Repository Database. By default, who-saw-what queries (SELECT statements) are not captured because this category of statement is usually orders of magnitude more frequent than any other and its capturing would cause huge amounts of data being stored in the central repository database by default, whether it is needed or not. Of course, this is easily configurable through GUI

By default, ApexSQL Audit captures data modification statements, schema modification statements, executions of stored procedures, functions and dynamic statements, security changes and login activities

Finally, to show this in an example, the following SQL statement:

SELECT
       CustomerID
       , InvoiceDate
  FROM INVOICES WHERE Amount > 200;

will be captured and shown in the Complete audit history report (among others) of the ApexSQL Audit reporting system as:

Complete audit history report

June 11, 2013