How to enable and use SQL Server Change Data Capture

In the previous article, SQL Server Change Data Capture (CDC) – Introduction, we described the main characteristics of the SQL Server feature for tracking data inserts, deletes and updates – Change Data Capture. We also compared it to another SQL Server auditing feature – SQL Server Change Tracking

In this article, we’ll show how to enable and use the SQL Server Change Data Capture feature

How to set up SQL Server Change Data Capture?

The feature is available only in SQL Server Enterprise and Developer editions, starting with. It can be enabled only using system stored procedures. SQL Server Management Studio provides a wide range of code templates for various feature related actions

To open the templates:

  1. In the SQL Server Management Studio menu, open View
  2. Click Template Explorer
  3. Open SQL Server Templates
  4. Open the Change Data Capture sub-folder. The T-SQL templates for administration, configuration, enumeration and meta data querying are available

    Change Data Capture sub-folder in SQL Server Management Studio

To set up the feature:

  1. Make sure SQL Server Agent is running. If not, right-click it in Object Explorer and click Start
  2. To enable the feature on the database, open the Enable Database for CDC template in the Configuration sub-folder, and replace the database name with the name of the database you want to track
    USE AdventureWorks2012
        GO

    EXEC sys.sp_cdc_enable_db
        GO

    The login used must have SQL Server sysadmin privileges and must be a db_owner of the database. Otherwise, you’ll get the following error

    Msg 22830, Level 16, State 1, Procedure sp_cdc_enable_db_internal, Line 193
    The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’. The error returned was 15517: ‘Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission.’. Use the action and error to determine the cause of the failure and resubmit the request.

    One of the ways to fix it is to change the database owner to ‘sa’ and execute sys.sp_cdc_enable_db again

    EXEC sp_changedbowner 'sa'
        GO

    EXEC sys.sp_cdc_enable_db
        GO

  3. To check whether Change Data Capture is enabled or disabled for a specific database, query the sys.databases view. The is_cdc_enabled column value 0 indicates that the feature is disabled, otherwise it’s 1
    SELECT name, is_cdc_enabled
        FROM sys.databases

    0 indicates that the CDC feature is disabled, otherwise it’s 1

    After the feature is enabled on the database, the cdc schema, cdc user, data capture metadata tables are automatically created

    After the CDC is enabled, the cdc schema, cdc user, data capture metadata tables are automatically created

  4. You have to enable the feature for each table you want to track
    EXEC sys.sp_cdc_enable_table
        @source_schema = N'Person',
        @source_name   = N'Address',
        @role_name     = NULL,
        @supports_net_changes = 1
        GO

    When the feature on the table is successfully enabled, the following messages are shown

    Job ‘cdc.AdventureWorks2012_capture’ started successfully.
    Job ‘cdc.AdventureWorks2012_cleanup’ started successfully.

    There are four parameters available: @captured_column_list, @filegroup_name, @role_name, and @supports_net_changes

    By default, all columns in the table are tracked. If you want to track only the specific ones, use the @captured_column_list parameter. The syntax is

    @captured_column_list = N'AddressLine1, AddressLine2, City'

    The @filegroup_name parameter can be used to change the default location of the change tables, for example

    @filegroup_name = N'SECONDARY'

    “By default, the change table is located in the default filegroup of the database. Database owners who want to control the placement of individual change tables can use the@filegroup_name parameter to specify a particular filegroup for the change table associated with the capture instance. The named filegroup must already exist. Generally, it is recommended that change tables be placed in a filegroup separate from source tables.”[1]

    The template for changing the default filegroup is Enable a Table Specifying Filegroup Option in the Configuration sub-folder

    By default, all members of the sysadmin and db_owner roles have full access the captured records. To limit access to the captured change data, create a new role that provides necessary permissions on the captured information, and use the @role_name parameter to grant the permissions only to the role members

    When the @role_name parameter is set to NULL, only members of sysadmin and db_owner roles have full access to captured information. When set to a specific role, only the members of the role (called a gating role) can access the changed data table. The template for assigning a specified role is Enable a Table Without Using a Gating Role

    @role_name = N'cdc_Admin'

    The @supports_net_changes parameter enables to show multiple changes aggregated as a single one. This parameter can be used only on tables that have a primary key or unique index

    For example, if the row was first inserted (_$operation = 2), and deleted (_$operation = 1) afterwards, the net change will be that nothing has happened. The feature captures both transactions, but the @supports_net_changes parameter enables to see both individual and net changes

    If the row was first inserted, and deleted afterwards, the net change will be that nothing has happened

    The template for setting the @supports_net_changes parameter is Enable a Table for All and Net Changes Queries

    @supports_net_changes = 1

  5. To check whether Change Data Capture is enabled on the table
    SELECT name, is_tracked_by_cdc
        FROM sys.tables
        where name = 'Address'

    If a table is tracked, 1 is returned, 0 otherwise

    When the feature is enabled on the table, a capture table and up to two query functions are automatically created. For the Person.Address table, these are the cdc.Person_Address_CT table, and cdc.fn_cdc_get_all_changes_Person_Address and cdc.fn_cdc_get_net_changes_Person_Address table valued functions. The latter one is created only when the @supports_net_changes parameter is set to 1. These functions are used to query change tables

The change table structure

The first five columns in the change table store specific transaction information – the start and end log sequence number, the operation type (delete, insert, update), the ID of the column updated, and IDs of the columns affected by updates. The rest of the columns are identical as in the source table and store the captured information

Whenever a row is inserted or deleted in the source table, a new row identical to the inserted or deleted one is added to the change table. When a row is updated, 2 rows are inserted. The first one is identical to the row before the update, and the second one to the row after the update

With time, the change tables grow. To maintain their size and keep those from growing uncontrollably, the cdc.<database_name>_cleanup job is used

If the structure of the source table is modified, the structure of the change table is updated accordingly

The Change Data Capture jobs

When the Change Data Capture feature is enabled for the first table in the database, two SQL Server jobs are automatically created – one to capture the changes and another for cleaning up the old captured information

      Two SQL Server jobs are automatically created

The capture job is in charge of capturing data changes and processing them into change tables

“It runs continuously, processing a maximum of 1000 transactions per scan cycle with a wait of 5 seconds between cycles. The cleanup job runs daily at 2 A.M. It retains change table entries for 4320 minutes or 3 days, removing a maximum of 5000 entries with a single delete statement.” [2]

Like other SQL Server jobs, the capture job can be stopped and started. When the job is stopped, the online transaction log is not scanned for the changes, and changes are not added to the capture tables. The change capturing process is not broken, as the changes will be processed once the job is started again. As the feature that can cause a delay in log truncating, the un-scanned transactions will not be overwritten, unless the feature is disabled on the database. However, the capture job should be stopped only when necessary, such as in peak hours when scanning logs can add load, and restarted afterwards

Change Data Capture can be enabled only using code, as SQL Server Management Studio offers no options for the feature. It has to be enabled for each table individually. For each tracked table, a new system table and up to two functions are created, which brings additional load to the database. Although it captures more information about transactions than SQL Server Change Tracking, it doesn’t answer the “who”, “when”, and “how” questions

In the next part, How to analyze and read Change Data Capture (CDC) records, we’ll show how to read the captured information

All articles in this series:

  1. What is SQL Server Change Tracking and how to set it up?
  2. How to read SQL Server Change Tracking results
  3. SQL Server Change Data Capture (CDC) – Introduction
  4. How to enable and use SQL Server Change Data Capture
  5. How to analyze and read Change Data Capture (CDC) records
  6. SQL Server Audit feature – Introduction
  7. How to set up and use SQL Server Audit
  8. How to analyze and read SQL Server Audit information
  9. Auditing SQL Server data changes – the centralized solution

References:

[1] MSDN – Enable and Disable Change Data Capture (SQL Server)
[2] MSDN – About Change Data Capture (SQL Server)

Useful resources:

MSDN – Overview of Change Data Capture
MSDN – Change Data Capture and Other SQL Server Features

November 4, 2013