Continuous auditing of SQL Server database using the transaction log

Challenge

Imagine if you could decrypt and pump all the wealth of transactional information and details contained in the SQL Server transaction log, in human readable format, and then store this as data directly in SQL Server tables, ensuring it wasn’t lost when the log was truncated, and such that data could be queried, reported on etc. using T-SQL.

Ideally, this would be set up to run unattended to continuously to create a seamless audit trail with no gaps in the data or duplicated records.

Solution

In this article, we’ll show you how to set up a typical auditing job via ApexSQL Log, a 3rd party tool for reading SQL Server transaction logs, build repository tables for storing the audited data, then show how multiple executions will provide a continuous auditing trail with no gaps or duplicate data via a reporting proc. Finally, we’ll automate the process to run nightly by create a .bat file to run it unattended and scheduling it.

Description

To create the continuous auditing job in ApexSQL Log, the following steps should be executed:

  1. Start ApexSQL Log and click on the New button to start the fresh session

  2. In the Database connection dialog, select SQL Server and database and provide connection details and click Next to advance to the next step

    Note: Before providing sources to audit in the Data sources step of the wizard, it is important to keep in mind that it is often the case that the full, differential or transaction log backups are being taken on pre scheduled basis, and that unscheduled backups can be created at any point in time in the future. Also, while some backups are being taken into already existing files, the backups that will be created into the completely new files must be taken into consideration. For this purpose, it is important to keep in mind backup naming policies and to use this knowledge to setup backup-naming patterns in ApexSQL Log – this will ensure that there is no need to edit the batch file every time new backup file is created during the backup job.

  3. In the Data sources step of the wizard, click on the Add pattern and use wildcards to specify pattern for the backup files, as well as the location of folder backups are stored in. With this, ApexSQL Log will automatically recognize all files that match the pattern and include them in the auditing process automatically.

  4. Click Next to advance to the Select output step of the wizard and choose the Export results option

  5. In the Filter setup step of the wizard, on the Time range tab, tick the Continuous auditing radio button and choose/specify tracking file that will be used to hold the information on last LSN. I’ll discuss this feature more later in the article

  6. Before proceeding to the final step of the job creation, use various filters in the Filter setup step of the wizard to fine-tune the auditing results for additional value. When done, click Next to advance
  7. In the Export step of the wizard, choose the Export to database output type and select the repository database to store the auditing results. You can create a dedicated database to store the auditing data, use the database you are auditing, or use an existing repository, depending on what suits you best

  8. Finally, to complete the job creation, click on the Save button and choose the name for the job. Additionally, the job can be immediately executed by clicking on the Finish button in the final step of the session wizard.

Repository database

Now that we’ve created a continuous export to database job, let’s quickly look into the database itself. Virtually any database can be used as a repository for auditing, as was previously mentioned, one can use specific database, database on the remote server, or even the database that is being audited.

To prepare the database to accept the audited data, two specific tables must be created in the database – APEXSQL_LOG_OPERATION and APEXSQL_LOG_OPERATION_DETAIL. Below are the scripts which will allow manual creation of both databases that will embrace and store all audited data.

Note: Even though both tables can be created manually, if these specific tables do not exist in the database chosen as repository in the job setup when the job is started, they will be automatically created during the job execution and accommodate the auditing data.

The APEXSQL_LOG_OPERATION table holds the information on the operation itself. This resembled the GUI overview from grid and the information on the operation type, schema, begin/end time etc.

Here is a table script and structure for APEXSQL_LOG_OPERATION table:

CREATE TABLE [APEXSQL_LOG_OPERATION] (
	[LSN] [CHAR](22) NOT NULL,
	[OPERATION_TYPE] [VARCHAR](128) NULL,
	[OBJECT_NAME] [NVARCHAR](128) NULL,
	[USER_NAME] [NVARCHAR](128) NULL,
	[TRANSACTION_ID] [CHAR](13) NULL,
	[TRANSACTION_BEGIN] [DATETIME] NULL,
	[TRANSACTION_END] [DATETIME] NULL,
	[TRANSACTION_DESCRIPTION] [NVARCHAR](500) NULL,
	[ROW_RECONSTRUCTED] [BIT] NULL,
	[PAGE_ID] [CHAR](13) NULL,
	[SLOT_ID] [INT] NULL,
	[ID_KEY] [NVARCHAR](1000) NULL,
	[SPID] SMALLINT NULL,
	[SERVER] [NVARCHAR](128) NULL,
	[DATABASE] [NVARCHAR](128) NULL,
	[TRANSACTION_STATE] [VARCHAR](10) NULL,
	[SCHEMA_NAME] [NVARCHAR](128) NULL,
	[ROW_PARTIALLY_RECONSTRUCTED] [BIT] NULL,
	[ROW_ORDINAL] [SMALLINT] NOT NULL,
	[PARENT_SCHEMA_NAME] [NVARCHAR](128) NULL,
	[PARENT_OBJECT_NAME] [NVARCHAR](128) NULL,
	[PREVIOUS_LSN] [CHAR](22) NULL,
	[DURATION] [FLOAT] NULL,
	[LOGON_TIME] [DATETIME] NULL,
	CONSTRAINT [PK_APEXSQL_LOG_OPERATION] PRIMARY KEY CLUSTERED (
		[LSN]
		[ROW_ORDINAL]
		) ON [PRIMARY]
	) ON [PRIMARY] 
END

The APEXSQL_LOG_OPERATION_DETAIL table holds information on the operation details, which resembles the Operation details from GUI – the field, type and Before-after values.

Here is a table script and structure for APEXSQL_LOG_OPERATION_DETAIL table:

CREATE TABLE [APEXSQL_LOG_OPERATION_DETAIL] (
	[LSN] [CHAR](22) NOT NULL,
	[LINE_NO] [INT] NOT NULL,
	[COLUMN_NAME] [NVARCHAR](128) NULL,
	[COLUMN_TYPE] [NVARCHAR](128) NULL,
	[OLD_VALUE] [NTEXT] NULL,
	[NEW_VALUE] [NTEXT] NULL,
	CONSTRAINT [PK_APEXSQL_LOG_OPERATION_DETAIL] PRIMARY KEY CLUSTERED (
		[LSN],
		[LINE_NO]
		) ON [PRIMARY]
	) ON [PRIMARY] 
END

Extracting the audited data

The simplest way to extract the data from each table individually is to run a simple SELECT query against those particular tables

To query data from APEXSQL_LOG_OPERATION_DETAIL table, run the following query against your repository database:

SELECT [LSN]
	,[LINE_NO]
	,[COLUMN_NAME]
	,[COLUMN_TYPE]
	,[OLD_VALUE]
	,[NEW_VALUE]
FROM [dbo].[APEXSQL_LOG_OPERATION_DETAIL]

To query data from APEXSQL_LOG_OPERATION table, run the following query against your repository database:

SELECT [LSN]
	,[OPERATION_TYPE]
	,[OBJECT_NAME]
	,[USER_NAME]
	,[TRANSACTION_ID]
	,[TRANSACTION_BEGIN]
	,[TRANSACTION_END]
	,[TRANSACTION_DESCRIPTION]
	,[ROW_RECONSTRUCTED]
	,[PAGE_ID]
	,[SLOT_ID]
	,[ID_KEY]
	,[SPID]
	,[SERVER]
	,[DATABASE]
	,[TRANSACTION_STATE]
	,[SCHEMA_NAME]
	,[ROW_PARTIALLY_RECONSTRUCTED]
	,[ROW_ORDINAL]
	,[PARENT_SCHEMA_NAME]
	,[PARENT_OBJECT_NAME]
	,[PREVIOUS_LSN]
	,[DURATION]
	,[LOGON_TIME]
FROM [dbo].[APEXSQL_LOG_OPERATION]

The more complex way to extract the data which offers better insight into the audited data queries both tables together by joining up the LSN values in the table. Here is the query to run to get the paired details.

SELECT A.[LSN]
	,A.[OPERATION_TYPE]
	,A.[OBJECT_NAME]
	,A.[USER_NAME]
	,A.[TRANSACTION_ID]
	,A.[TRANSACTION_BEGIN]
	,A.[TRANSACTION_END]
	,A.[TRANSACTION_DESCRIPTION]
	,A.[ROW_RECONSTRUCTED]
	,A.[PAGE_ID]
	,A.[SLOT_ID]
	,A.[ID_KEY]
	,A.[SPID]
	,A.[SERVER]
	,A.[DATABASE]
	,A.[TRANSACTION_STATE]
	,A.[SCHEMA_NAME]
	,A.[ROW_PARTIALLY_RECONSTRUCTED]
	,A.[ROW_ORDINAL]
	,A.[PARENT_SCHEMA_NAME]
	,A.[PARENT_OBJECT_NAME]
	,A.[PREVIOUS_LSN]
	,A.[DURATION]
	,A.[LOGON_TIME]
	,B.[COLUMN_NAME]
	,B.[COLUMN_TYPE]
	,B.[OLD_VALUE]
	,B.[NEW_VALUE]
FROM [dbo].[APEXSQL_LOG_OPERATION] A
JOIN [dbo].[APEXSQL_LOG_OPERATION_DETAIL] B ON A.LSN = B.LSN

To simply demonstrate “the continuous” aspect of auditing, let’s take a look at the simple example:

  1. We have a newly created database with no structure or data.
    Let’s execute a simple script to create one table and insert some data

    CREATE TABLE [dbo].[Table001] (
    	[id] VARCHAR(11) CONSTRAINT [Pid] PRIMARY KEY
    	,[Name] VARCHAR(40) NOT NULL
    	,
    	)
    GO
    
    INSERT INTO [dbo].[Table001] (
    	[id]
    	,[Name]
    	)
    VALUES (
    	'100'
    	,'George'
    	)
    
    INSERT INTO [dbo].[Table001] (
    	[id]
    	,[Name]
    	)
    VALUES (
    	'101'
    	,'Nesha'
    	)
    
    INSERT INTO [dbo].[Table001] (
    	[id]
    	,[Name]
    	)
    VALUES (
    	'102'
    	,'Mark'
    	)
    
    INSERT INTO [dbo].[Table001] (
    	[id]
    	,[Name]
    	)
    VALUES (
    	'103'
    	,'Jack'
    	)
    
    INSERT INTO [dbo].[Table001] (
    	[id]
    	,[Name]
    	)
    VALUES (
    	'104'
    	,'Bear'
    	)
    GO
    
  2. Now, let’s create auditing tables and execute the continuous auditing job as shown above. We’ll get the following results when we inspect our tables

  3. Now, let’s make some additional changes to our test table by executing another simple script to update some of the previously inserted rows

    UPDATE [dbo].[Table001]
    SET [Name] = 'Changed_Name'
    WHERE [id] > 102
    GO
    
  4. Now let’s perform the second execution of the continuous auditing job so that these updates are read from the transaction log file and added to our auditing tables in the repository database
  5. If we query our repository database with the above described SELECT, we can confirm that the two updates we’ve just performed with our script from the step 3 have been added to the repository database and confirm that only they have been included in the second auditing process since no data has been duplicated in the central repository database

Continuous auditing

Unlike a simple task of reading the transaction log and exporting data, in this article, we want to go one step further, and run the same job nightly to create a continuous audit trail of data with no gaps or duplication

When setting up continual auditing of database operations, it is important to ensure that all operations e.g. INSERT, UPDATE, are captured (no gaps, seams), as well as to eliminate any chance of auditing the same operations more than once (no duplication). To achieve accurate and seamless auditing of a SQL Server database, ApexSQL Log can be used to continuously read transaction log files and to pump auditing results directly to the central repository, or any other database chosen by the user.

Continuous auditing is achieved in a simple way. When the current transaction log auditing job is finished, ApexSQL Log can receive an instruction to remember the last LSN value in the tracking file. Once the next auditing job is initiated, it will simply resume from the last LSN value going forward. This ensures that not even a single operation is excluded from the auditing results nor are any operations duplicate, and that seamless continuous auditing from the transaction log can be fully achieved.

That is achieved by ticking the continuous auditing radio button in the Time range filter setup referenced in Step #5 in this article

Aside from being the core feature for continuous auditing of data directly to a database, continuous auditing option is supported for all ApexSQL Log exporting formats, including before-after auditing and reporting. In addition to the exporting jobs, continuous auditing can also be used when auditing results to grid, which can be particularly useful in cases when user needs to investigate a lot of data, to ensure that no back-tracking or duplicate investigations occur – for those cases, we recommend users to edit tracking files so they can continue where they previously ended without the need to examine the full auditing trail – if you need to stop investigation without completing it, simply remember (copy) the LSN of last operation you’ve examined to the tracking file and replace the last LSN value.

Tracking file

A tracking file which holds information for continuous auditing is updated whenever it is utilized in the auditing process, regardless if it was used in a batch file, in an auditing job or when auditing to GUI. So, if the user needs to perform some unscheduled auditing, or to check auditing results in grid in GUI, it is recommended to take copies of the initial tracking file to use for this purpose to avoid unintentional overwriting of the tracking file. A tracking file enables different users to run continuous auditing jobs and removes the requirement for information on when/where the previous auditing job has finished. Additionally, if multiple auditing jobs are run against the database, multiple tracking files can be used. The ‘last LSN’ value in the tracking file can be edited (via notepad or similar tool) and changed to any existing LSN value. With this, user can easy setup an auditing job to start from the specific LSN value. Also, it is useful to keep in mind that it is possible to use tracking file remotely, and from different workstations, as long as it is not accessed by multiple machines at once.

Job Automation

Now that we know how to setup and run the continuous auditing jobs, let’s take a look on how to automate the job

To completely automate a continuous auditing job, the simplest yet powerful solution is to create a batch file which can be quickly executed on demand, as well as fully automated by running the job on pre-determined schedule via windows schedule or similar tool. Detailed step by step guide on how to automate the batch file execution using the windows scheduler can be found in this article.

While the batch file with instructions/commands for ApexSQL Log can be manually created by combining the most appropriate ApexSQL Log CLI switches, the batch file can be automatically created by ApexSQL Log, while the user can setup all details using GUI, which greatly simplifies batch file creation, especially for files which heavily use various filters to fine-tune the auditing scope and results.

By following the previously described wizard on how to create a continuous auditing job, the ApexSQL Log automatically creates and displays the batch script in accordance to the settings chosen during the wizard in its very last step of the wizard. The batch script can then be manually edited and then directly saved from this step by clicking on the Save batch script button which concludes the batch script creation and can save a lot of time, especially to those that are not familiar with or fond of CLI. Created batch file can finally be scheduled to run nightly by using windows scheduler, SQL job, or any other similar tool which enables running the file unattended on a predetermined schedule.

When running multiple batch files at the same time, it is important to keep in mind that multiple batch files can be run against different databases on one or more SQL Server instances, but only one batch file can be run vs the single database at the specific time frame – other batch files can be run vs the single database after the already running execution finishes.

Downloads

Please download the script(s) associated with this article on our GitHub repository.

Please contact us for any problems or questions with the scripts.

March 7, 2016