How to mask SQL Server data using Dynamic data masking

In the world of today, information can be classified into sensitive and non-sensitive data. The sensitive data is classified as data that contains private, personally identifiable, financial, and security information, whose discovery can be misused and damage the reputation of a person or an organization.

Since sensitive data can cause damage or can be misused, both big corporations and small firms are looking to protect that data. For example, in the banking system, a payment card number stored in the database presents sensitive data that can be misused and that data must be protected from unauthorized use. One aspect of protecting sensitive data is insuring who can see sensitive data according to predefined security levels.

Dynamic data masking feature that Microsoft implemented in SQL Server 2016 version, allows users to mask SQL Server data based on predefined security levels. When dynamic data masking is used, sensitive data will be hidden from non-privileged users, i.e. non-privileged users will be served masked data, while privileged users (users with a higher level of security) will have access to real sensitive data. The good side of this feature is that it only hides sensitive data and does not change the actual data in the database.

In this article, two ways of using the Dynamic data masking will be explained:

  • By T-SQL code in SQL Server Management Studio
  • By ApexSQL Mask

Mask SQL Server data using Dynamic data masking feature in SQL Server

In SQL Server 2016 and newer versions, Microsoft offers some pre-defined types of masks that will use to specify the Dynamic data masking. The Dynamic data masking types are:

  • Default data masking
  • Random data masking
  • Email data masking
  • Custom data masking

To explain and demonstrate each type of dynamic data masking, the table dbo.Customer will be created in the sample AdventureWorks2019 database:

USE [AdventureWorks2019]
GO

CREATE TABLE Customer (
Customer_ID INT IDENTITY(1, 1) PRIMARY KEY
,First_Name NVARCHAR(10) NOT NULL
,Last_Name NVARCHAR(10) NOT NULL
,Date_Of_Birth DATETIME NULL
,Montly_bill INT NULL
,Email NVARCHAR(50) NULL
,Credit_Card_Number NVARCHAR(25) NOT NULL
)

That newly created table will be populated with sample sensitive data, and then will be masked using different types of Dynamic data masking. Table dbo.Customer will be populated with 10 new test records with the help of ApexSQL Generate, test data generator tool:

The Customer table which sensitive data will be masked

When the table is created and populated with test data, the next step is to create a new user with the name “Test”, which will have read-only access over the newly created table:

CREATE USER Test WITHOUT LOGIN;
GRANT SELECT ON Customer to Test;

If the newly created read-only user queries data from the dbo.Customer table:

USE AdventureWorks2019
GO
EXECUTE AS USER = 'Test';
SELECT * FROM Customer;
REVERT;

All sensitive data from the table will be presented to the Test user, which could potentially be a security breach depending on the level of authority. However, by using dynamic data masking in SQL Server, the authorized persons can mask data presented to the non-privileged users and in this way prevent a security break.

With which predefined type of dynamic data masking the sensitive data will be masked depends on the data type of the column to be masked. In the next part, the article will explain previously mentioned predefined types of dynamic data masking in detail.

Mask SQL Server data with Default type

The Default type will mask complete values in the selected column, and how the columns will be masked depends on the data type of the specified column:

  • String data types (char, nchar, varchar, nvarchar, text, ntext) – values in a column with this data type will be masked with XXXX or less X if the values in the column are smaller then 4 character
  • Numeric data types (bigint, bit, decimal, money, int, smallint, tinyint, smallmoney, real, float) – zero value will be used for these columns with numeric data type
  • Date and time data type (date, datetime, datetime2, datetimeoffset, time, smalldatetime) – 01.01.1900 00:00:00.0000000 will be used to mask SQL Server data in these columns
  • Binary data types (binary, image, varbinary) – single bytes of ASCII value 0 will be used for masking values in the columns with these data types

A column with the string data type will be masked for this example. To mask values in the specified column with the Default type of Dynamic data masking, ALTER TABLE…ALTER COLUMN T-SQL statement will be used:

ALTER TABLE dbo.Customer
ALTER COLUMN First_Name NVARCHAR(10) MASKED WITH (FUNCTION = 'default()');

When the Test user tries to retrieve customer data with T-SQL statement:

EXECUTE AS USER = 'Test';
SELECT * FROM Customer;
REVERT;

This time, the First_Name column will be masked with xxxx as shown below:

The First_Name column in Customer table

As the Test user sees more sensitive data in the Customer table, in addition to the previously masked First_Name column, the other three types of Dynamic data masking mentioned previously will be used to mask other sensitive data in the table.

Mask SQL Server data with Random type

The Random type of Dynamic data masking is used on any numeric data type. The original value will be masked with a random value within a specific range.

In the Customer table, values from the Montly_bill column will be masked with the random masking function. Like in the previous type of mask, and in other three, to mask SQL Server data ALTER TABLE…ALTER COLUMN T-SQL statement will be used for masking:

ALTER TABLE dbo.Customer
ALTER COLUMN Montly_bill INT MASKED WITH (FUNCTION = 'random(3,9)');

From the T-SQL statement for Random type of dynamic data masking, it can be noticed that the values from the Montly_bill column are masked with values ranging from 3 to 9. When the Test user fetches data from the Customer table, the table will be as follows:

The Montly_bill column in Customer table

Mask SQL Server data with Email type

To protect email data from a security breach, the dynamic data masking feature offers the Email type for masking sensitive data. The Email function of dynamic data masking shows the first letter of an email address and suffix .com, and the form of the Email method of masking is looked like qXXX@XXXX.com.

When the Email method of masking is in question, ALTER TABLE…ALTER COLUMN T-SQL statement will look like this:

ALTER TABLE dbo.Customer
ALTER COLUMN Email NVARCHAR(50) MASKED WITH (FUNCTION = 'email()')

The Email column from the Customer table will be used for the example. If the Test user tries to query data from the mentioned table, the following result set of the table will be displayed:

The Email column in Customer table

Mask SQL Server data with Custom type

Besides already shown masking methods, Microsoft also offers a Custom type of dynamic data masking. This type is the masked method which reveals the first and last values, but adds a custom padding string between them. This masking function allows choosing how many first and last values will be exposed, and what string will be in the middle.

The Credit_Card_Number column from the Customer table will be masked with this masked function, the first 2 and last 2 letters will be exposed with x-xxx-xx custom string in the middle. T-SQL statement in the Custom masking method is below:

ALTER TABLE dbo.Customer
ALTER COLUMN Credit_Card_Number NVARCHAR(25) MASKED WITH (FUNCTION = 'partial(2,"x-xxx-xx",2)')

The Customer table will get the following appearance if the Test user tries to query the table once again:

The Credit_Card_Number column in Customer table

Unmask SQL Server data with Dynamic data masking

One of the advantages of Dynamic data masking is the fact that this feature only hides sensitive data. If users are granted UNMASK access, they will be able to see that masked data:

GRANT UNMASK TO Test
GO
EXECUTE AS USER = 'Test';
SELECT * FROM Customer;
REVERT;

That action will permit the Test user to see all sensitive data as can be seen below:

The values in the customer table after unmask SQL Server data

Mask SQL Server data in ApexSQL Mask

ApexSQL Mask is a stand-alone and easy to use application with the primary purpose to mask sensitive SQL Server data. The sensitive data is protected with static and dynamic data masks. In this article, the dynamic data masking will be explained and shown how to use it.

When the ApexSQL Mask is connected to the desired database, in this case, the AdventureWorks2019 database, the main window will appear. Here, columns will be chosen following the examples in the previous section:

The main grid of ApexSQL Mask, in which will be selecting columns to mask SQL Server data

When columns are selected, under the Home tab, click on the Manage button:

The Manage button

The Manage masks window will be opened and, by default, the Replace original data tab will be selected. The next step is to click on the Hide original data tab:

The Hide original tab, where will create the new masks to mask SQL Server data

In the Hide original data tab, the first step is to choose users for which the sensitive data will be hidden when applied to the newly created mask. By clicking on the User permission button, a window will be opened as shown below, in which the Test user will be chosen. When a user is selected, click on the OK button:

The User permission window

To create a new mask and to add it to selected columns, under the Hide original data tab, click on the New button. The Create new mask window will be shown:

Create a new mask window, where will be created a new mask to mask SQL Server data

In the Create new mask window, it is necessary to enter the name of the mask, while the description of the mask is optional. When all is set, the next step is to choose the type of dynamic data masking that will be used to mask SQL Server data.

By clicking on the Type drop-down list four types will be shown:

  • Default
  • Random Numeric
  • Custom text
  • Email

For this article, the Default type will be chosen to mask SQL Server data. When all of the above is set, the next step is to click on the Apply button:

The Create new mask window

The Apply to columns window will be opened, where the same column that is used in the previous example is selected, First_Name from the dbo.Customer table. When the column is selected, click on the OK button:

The Apply to columns window

The Create new mask window will appear again and it can be closed by clicking on the OK button. Then, the newly created mask will be shown in the Manage masks window and by clicking on the OK button it will switch the view on the main window of the application. From there, the results of masking can be seen in the Preview window:

The Preview window, swhere the difference between actual and mask SQL Server data will be shown

Conclusion

Dynamic data masking (DDM) is a great way of limiting sensitive data exposure by masking it to non-privileged users as we’ve seen in this article. ApexSQL Mask is a data classification and masking tool that simplifies the work of using complicated scripts with predefined masks, easier access through the graphical user interface, as well as the ability to save masks and to use in other databases.

 

November 2, 2020