Editorials

I Need Faster Updates

The default settings in SQL Server are usually quite performant. Just like many other data engines, there are things you can do to optimize the performance specific to your unique requirements. Often, the things you can do are based on physics. As the persistence engine matures, the number of things you can do is decreasing.

Back to the comment about physics: The first thing you need to do with your version of SQL Server is to understand the characteristics of how the data is physically stored in a table. Let me give an example of why this is important. Frequently, data in a table is not immutable. We perform updates on records as things change over time. A purchase order might be a good example, where it may have a state column. As the purchase order state changes, the field is update on the record.

When updating a record in a table there are two different kinds of updates you can make, and the method is determined by the table itself. The key question the SQL Engine must determine is, can the current record in the database be updated in place. After updating the record with the new values, will it fit within the same space already available where the record currently exists? If the answer is no, then the Engine deletes the existing row, and adds a new row with the merged contents. If the answer is yes, the existing data is updated in place. An in place update is faster than a delete/insert technique.

Different versions of SQL Server have different requirements of in place updates. You need to see the document for the version you are using in order to determine if you are getting optimum updates.

A few simple things optimize performance when it comes to updates. First, don’t allow null values, at least on older versions of SQL Server. If you don’t allow variable length data types, in place updates occur more often. However, you waste space on the disk, and have to constantly trim trailing spaces on string values. Finally, don’t do updates for data that has point in time properties. Instead of having a status column, have a purchase order status table.

When the status of a purchase order changes, add a new status record to the purchase order status table. If you need to optimize performance on the status table so you know what is the current status, have an IsCurrent field. When you insert your new record, update the status of the old record to false. Yes, you are still doing an update, but you are doing it in a table assured to be able to update in place. The added benefit, is now you have a status history.

Another option is to put columns that are updated frequently into separate tables, using a one to one relationship. This way you can optimize the tables that will be updated to update in place.

There are diminishing returns as you normalize more fully. At some point, the overhead to bring a complete purchase order back together has a higher cost than that of the update. So, if the data is pretty static, you may have optimized insert, update and delete transactions at the cost of slowing down read requests. Your design should be based on the kinds of transactions occurring most often.

Do you know what requirements your database has for in place updates? Maybe it’s time to find out.

Cheers,

Ben