Hands free, no-coding SQL Server database replication of a reporting database

ApexSQL Log is an amazing tool for reading the transaction log and reading the transaction log can form the basis of a lot of great features, functionality. In this case, we’ll look at how you can use ApexSQL Log for transactional replication to keep a reporting database up to date with your production system.

Reporting server/database

Many times there will be a business requirement for a means to offload work against the production server from reporting requests. Not only can this reduce the load on the production server and potentially improve performance, creating a separate environment, optimized for reporting, can also speed the retrieval of reports and queries, in some cases significantly.

In essence, creating a reporting database mean taking a copy of the production database and moving it to a different server. This will reduce the load on production, as mentioned. But there can be additional benefits including

  1. Moving the reporting server “closer” to reporting clients to reduce network traffic
  2. Adding indexes to optimize select queries to improve performance, without having to worry about degrading the performance of transactional operations
  3. Faster reporting because there are no other transactions or operations taking place, besides reporting, on the new server

Ok, this sounds great, but how do we automatically and reliably sync our reporting and production systems so that we keep our reports timely and accurate to within a certain window of time e.g. 15 minutes?

Transactional replication

This is a great use case for transactional replication, where transactions from the production database can be replicated to the reporting server. As such, this would seem to be a good fit for SQL Server transactional replication. There are a few reasons why you might not want to go that route though …

  • You don’t have the Enterprise edition of SQL Server or SQL Server 2016, where transactional replication is included in other editions. See Editions and supported features of SQL Server 2016
  • You don’t have the time or patience to learn how to implement transactional replication and would like an out-of-the-box solution
  • The servers can’t see or talk to each other directly, so you need to move the transaction store, upload it etc and then download, consume it in a different environment
  • You already own ApexSQL Log and use it for other purposes e.g. Continuous auditing, disaster recovery, forensic auditing, simulating production loads etc

Why not use Change tracking or CDC?

Both of these solutions offer potential alternatives to transactional replication but come with the same edition restrictions but also can require a lot of custom coding, which can quickly increase the time required and cost of your replication solution. Although these are native solutions, they certainly aren’t hands-free or out-of-the-box, so let’s keep looking

Transactional replication with ApexSQL Log

That brings us back to ApexSQL Log. ApexSQL Log is uniquely suited for just this task.

  • It can read the SQL Server transaction log quickly and easily and produce accurate results
  • It has a console application with a full command line interface that allows for extensive, customizable automation
  • And it can easily set break points to determine where the last job ended so there is never duplicate data or gaps in data
  • Most of the configuration can be done with an easy to use GUI and saved as a project
  • The entire implementation, including the reference to the aforementioned project configuration file can be converted to a batch file, with no coding required!

First, the high level view

The concept here is pretty simple. We will set up ApexSQL Log to poll a “Publisher” database. This will be the production database, in our case, that we want to replication transactions from. ApexSQL Log will wake up every 15 minutes, read the transaction log, and write all of the transactions to file as a T-SQL “Redo” script. This script will replay all of the transactions, later, on the “Subscriber” database.

We’ll transport, upload etc the script file as needed, then schedule a simple process to open the file and execute it on the Subscriber database. This can be as simple as a PowerShell script or a batch file.

Variations

With ApexSQL Log we can easily set up sophisticated filters to replicate only the data we need and leave everything else out. For example we can easily filter by operation, table name, user or even add some advanced filters including transaction state and duration, specific table field values and more

We can poll at larger e.g. 1 hour or shorter e.g. 1 minute intervals

If ApexSQL Log can see both servers simultaneously, it can read from the Publisher transaction log and write to the Subscriber directly skipping the step of creating a file, then opening and executing it. For simplicity, this is what we’ll illustrate with our example.

Getting our hands dirty

Ok, let’s build this

First we’ll open ApexSQL Log and connect to the Publisher database, WorldWideExporters.

We’ll configure ApexSQL Log to use the Online log as we’ll make sure the database is in Full recovery mode, ensuring all of the transactions in our 15-minute window will be found there

Next, we will choose to create an Undo/Redo script

Next, let’s set up a filter to determine just what transactions we are going to include and/or exclude. For instance, we can opt to exclude several tables from our job by unchecking them in the ‘Tables’ filter. We can do the same with users or any other filters to achieve high auditing precision

Also, we will only include DML operations and exclude all DDL operations

Now let’s set the options for continuous auditing, automatically including breakpoints so we don’t have any gaps or overlaps that might cause missing or duplicate transactions respectively. ApexSQL Log will always remember the LSN value of the last audited operation and when the time for the next job is due, mentioned LSN value will be used as a starting point ensuring that no duplicate transactions are included in this continuous auditing task as well as no transactions are overlooked or missed.

Once we are done, let’s save our project file as “Replication.axlp” so we can come back to this again if we have any edits, allowing us to quickly update our reporting profile. In fact, we can edit this file and save over the last version to update our configuration settings without even having to update our automation script.

Now that we have everything configured like we want it, let’s use the ‘Batch file’ feature to save a batch (.BAT) file. This file has Windows Shell script that will run ApexSQL Log and then execute it with this saved project without running the GUI or requiring any user actions.

Note: We’ll run this on a Windows client that uses integrated security to connect to both our Publisher and Subscriber databases, so we won’t need to worry about user names or passwords.

Now all we have to do is schedule the batch file with Windows scheduler, SQL Job or some other way. Once scheduled, this job will “wake up” ApexSQL log at a particular interval e.g. 15 minutes and execute it with the instructions included in the batch file and the ApexSQL Log project file and create continuous Redo scripts.

Note that the job to execute the scripts on the “Subscriber” database should also be scheduled via similar/same means in order to finalize our replication job.

See it run

Once I have created my Project file and batch file, for automation, then scheduled it to run every 15 minutes, I’ll make some changes in production.

I’ll run a script to insert 10 rows into [dbo.Employees] table.

Now let’s run a quick query to compare our Publisher and Subscriber databases. Sure enough, you can see that the 10 extra rows exist in the Publisher but not the Subscriber.

Let’s now click our BAT file, because we don’t want to wait 15 minutes, and let it run. Once it closes, let’s re-compare the two databases. We can see that they are now equal as the 10 rows have been replicated to the Subscriber. We can see that both tables now have the same row count of [ 20 ].

Here is what the Redo script looks like in ApexSQL Log built-in editor

Once the script is executed on the “Subscriber” database, data in the “Employees” table will be perfectly replicated

Some enhancements

If we want to dial up the data quality of our reporting database to make sure it is closer to real time, we can change our scheduling interval from 15 minutes to 1 minute.

If we wanted to include DDL operations too, that would make sure we automatically update our Subscriber with any schema changes from the Publisher. So let’s open ApexSQL Log, open our project file, and then add “DDL operations” then hit save.

Additional enhancements

Although our example system is pretty simple and basic – it works, and it should work consistently and reliably. That doesn’t mean we can’t augment it with some cool features including auditing and alerting, to make sure that our reporting database is being replicated properly

Here are some suggestions

  1. Saving T-SQL replay script to file with a date stamp so we have an audit of transactions
  2. Saving date stamped verbose console output to file as additional auditing mechanism
  3. Running the T-SQL script in a separate process, trapping a return code and emailing an alert if there were any errors with the T-SQL script
  4. Trapping an error code from our batch file and sending an email alert if it fails
  5. Adding a call to ApexSQL Diff to automatically compare schemas between Production and Subscriber, immediately after a replication push, and email an alert if any differences were found
  6. To prevent any problems with #e, we can add a step *before* a replication push and compare and synchronize the schema of our Subscriber database with the Publisher immediately before the replication push to be 100% certain the schemas are exact before pushing the last batch of data changes. If we do this, then we can uncheck “DDL operations” from ApexSQL Log and not include them in our replication. This might provide a bit more robust means of keeping our database schemas in sync vs just adding “DDL operations” to our replication push (although in theory, they should accomplish the same thing)

See also

For another article on how to perform this same use cases, using similar technologies see the article: How to setup SQL Server database replication for a reporting server

 

August 7, 2018