Editorials

Join the Union

When working with sets there are a lot of ways to transform them using the SQL language. One of the things we often do with sets is to combine two or more into a single set. Perhaps you have a query based on some value, and another query based on a completely other value. Now you want to get the results of both queries, and only include a single record for those records found in either set. In another scenario, you are sure that there will not be any duplicate records found in both queries. What are your best options in these different scenarios?

In SQL Server you will find two different operations to process two sets. The UNION operation will join the two different sets, and only return one record if it is found in both sets. By contrast, the UNION ALL operation returns every row from both sets, without excluding duplicates.

Because UNION ALL does not check for and remove any duplicate records it is a faster operation. It should be preferred over UNION if you are sure there are no duplicate records. Again, if you don’t care about having duplicate records, then use the UNION ALL operation for performance reasons.

If you must have unique records, and are not sure if a record may be found in both sets, you can still use UNION ALL with DISTINCT or GROUP BY. However, it is simpler to use the UNION operation instead. Unless you are doing other things with your GROUP BY, then UNION is the operation for you.

Sometimes it is better to use OR in a single query rather than having two separate queries and performing a UNION. If you are using different data points to return your data, it may be better to UNION two separate queries. This is especially true if the logic results in using different indexes. Since a single query may only use a single index, separating your query into more than one allows for the utilization of the best index for each scenario. Technically, SQL Server may actually do the same thing under the covers. But you won’t be aware of it unless you review the query plan. If you make the separation, then the engine will be more likely to follow a multiple index implementation.

Join the UNION (I just had to say that) when it helps your queries perform better.

Cheers,

Ben