Editorials

Query Optimization

When it comes to database performance optimization, index management is high on the list if issues you need to address. In fact, I think this is only second to an efficient database design. Sure, there are other factors that make significant impact for a database performance. But, if these two are off, you can throw a lot of money at the problem, and still not get to where you need to be. This is true for any SQL data engine.

Too many indexes reduce performance when manipulating data. Too few indexes slow down queries. So, are there any guidelines you can use when managing indexes. The simplest guideline is to start with indexes supporting relationships. When a parent table joins to a child table, you need to have an index on the foreign key column in the child table. When a child table references its parent, there needs to be an index on the parent table primary key, found in the child table foreign key table. This is probably already in place if you use referential integrity,. Creating a primary key constraint on the parent table, by default, creates an index on the primary key column(s).

So, if you find the data usage patterns always work from the child to the parent, there is not necessarily a need for an index on the foreign key column of the child table. If you join from the parent to the child, then, by all means, create an index on the child table.

If child data is always, and only, selected from the parent table and its relationship, consider making the foreign key a part of the clustered index on the child table. That way, the join from the parent does not find records in a non-clustered index, and then have to look up the data under the clustered index on the child table. This actually optimizes joins on the child table because the records are all sequential according to the parent record.

For example, if you had a sales parent table, and sales detail child table with a foreign key of SalesID from the sales table, consider when you create an index for the child table to make it Clustered on SalesID. This design will group all your details by SalesID, and enhance your query performance. I know this is in contravention of the MS recommendation of having an ID column on every table that is a sequential value assigned by an Identity, and should be the clustered index. You can combine those two techniques. Make a clustered index of SalesID, ID. It won’t be your primary key (probably just ID). But you now have both worlds.

Many ORMs support composite primary and foreign keys. As a result, I have been moving away from the single IDENTITY value as a Primary Key. It doesn’t make things like Entity Framework, or your Data Models, any easier, and requires an additional index to enforce the natural key of the data.

Well, that gives you something to chew on. Perhaps you don’t care for this advice? If not, feel free to leave a comment with your concern, and better yet your solution. Email is always welcome too, at btaylor@sswug.org.

Cheers,

Ben