Editorials

Set Comparison with XML Diff and Patch

One of the things making it difficult to automate testing of SQL Code is that most test frameworks are based on scalar data, not sets. I’ve spent hours rolling my own data table comparison tool based on ADO.Net data tables. It took forever to write and perfect. I had to add a number of features to the library as I ran into more and more edge cases. Here are a few of the things I had to implement.

  • Accept a list of Columns to Ignore in the comparison
  • Handle different column ordinal positions
  • Compare based on data type, not always converting ToString()
  • Filter the set
  • Sort the Set
  • List rows from the baseline not included in test results
  • List rows from test results not included in the baseline
  • List rows with variance
  • Provide summary of pass/fail
  • Provide detail of variance

Today I came across a tool that’s been available for a long time from Microsoft. It is an XML comparison tool that has many of the same features I listed above. Because it is based on XML it can be used with many different kinds of data. Because an ADO.Net data table can be converted easily to XML, this tool may be used out of the box with a lot of power. The tool is called the XMLL Diff and Patch tool.

The XML Diff and Patch tool has been available since 2002. It has the ability to do a diff at different levels you may define. You can tell it to ignore things like attributes or nodes being in a different order. There are a lot more features that you can include or exclude different comparisons. The Patch aspect allows you to even merge the two documents into a new one.

Take a look if you find yourself comparing a lot of sets. It might be able to save you some time.

Cheers,

Ben