Editorials

Use SQL Mocks for SQL Unit Testing

Here’s a technique you may find interesting when working with SQL queries. One of the things I like to do is to create Unit Tests for my SQL procedures or functions. Sometimes this is more difficult to do in SQL because the majority of your work is based on sets. The sets come from your database. Therefore, if anything changes the data you are using for your unit tests, then your tests may fail, not because of your logic, but because the underlying data was modified.

When we create unit tests in object oriented languages we have a lot of help when it comes to creating sample data (MOCK) against which we can exercise our logic. However, there is little help when it comes to SQL databases out of the box.

One method I found useful was to build and maintain a MOCK database. The database migrations could be applied against the MOCK database so that everything is synchronized between the application and the database schema and database logic. The big difference is that the MOCK database data consists of lookup data maintained by database migrations, and then sample data created for specific tests in the instance tables. This is great for unit tests, and does not require exercising data from a production database.

A second method I found that works in some situations is the use of a table value function. Using a table value function, I can create a virtual table to be used when testing something else requiring a set, without having to actually have a table, or tables supporting my needs. My test data is not materialized until the test is executed. Here’s a simple example…

First I’ll create a table called Users and populate it with some of the early presidents of the USA.

CREATE TABLE Users
	(UserID INT NOT NULL
	, FirstName VARCHAR(32) NOT NULL
	, LastName VARCHAR(32) NOT NULL)
GO

INSERT INTO Users (UserId, FirstName, LastName) VALUES
	(1, 'George', 'Washington')
   ,(2, 'Thomas', 'Jefferson')
   ,(3, 'James', 'Madison')
   ,(4, 'James', 'Monroe')
   ,(6, 'John Quincy', 'Adams')
   ,(5, 'John', 'Adams')
GO

Next I’ll create a table function returning a completely different set of individuals, with the same schema.

CREATE FUNCTION dbo.MOCK_GetUserList()
	RETURNS @UserList TABLE 
	(UserID INT NOT NULL
	, FirstName VARCHAR(32) NOT NULL
	, LastName VARCHAR(32) NOT NULL)
AS
BEGIN
	INSERT INTO @UserList (UserId, FirstName, LastName) VALUES
	(1, 'Paul', 'Bunion')
   ,(2, 'Benjamin', 'Franklin')
   ,(3, 'Mortimer', 'Schnerd')

   RETURN
END
GO

Now I implement my stored procedure. It returns data from either table depending on a debug flag sent into the stored procedure. Data from appropriate set of users is first captured. Then the procedure logic is applied. I have two tests where I search the last name for the letter “N” somewhere in the contents. When I run it with the debug flag set I get a different outcome than with it not set, as expected.

CREATE PROCEDURE prsFindUserByLastName (
	@Filter VARCHAR(32) = ''
   ,@Debug CHAR(1) = 'F'
)
AS
	
	SET NOCOUNT ON

	;WITH UserList
	AS
	(
		SELECT	UserId, FirstName, LastName
		FROM	Users
		WHERE	@Debug = 'F'

		UNION ALL 

		SELECT	UserId, FirstName, LastName
		FROM	dbo.MOCK_GetUserList()
		WHERE	@Debug != 'F'
	)

	SELECT	UserId, FirstName, LastName
	FROM	UserList
	WHERE	LastName LIKE '%' + @Filter + '%'

GO

When you execute the procedure with the debug flag the results look like this

EXEC prsFindUserByLastName 'n', 'T'
Mock Data Set
UserId FirstName LastName
1 Paul Bunion
2 Benjamin Franklin
3 Mortimer Schnerd

When you execute the procedure without the debug flag the results look like this

EXEC prsFindUserByLastName 'n', 'F'
Users Table Results
UserId FirstName LastName
1 George Washington
2 Thomas Jefferson
3 James Madison
4 James Monroe

Obviously, this isn’t a good solution when the sets of data may be large. You wouldn’t want to copy a million rows into a temporary table so that your filter is applied only once. In that case, you would most likely find a technique where you could apply the filter prior to returning the data to the calling stored procedure.

A better approach would be to have two different functions returning user data, and have one be attached to a MOCK schema. The real function you wish to use in production would be attached to a different schema. Then you would not need the Debug flag or the UNION all in your stored procedure. You would simply run your query pointing to the function. If the connection running the test is a member of the MOCK schema, it would use the MOCK version. Otherwise, it would use the database version, allowing you to separate what happens though a more traditional, and reliable mechanism.

Tomorrow I’ll demonstrate an implementation using a Schema instead.

There’s some food for thought. Feel free to leave any questions you may have in the comments.

Cheers,

Ben