Editorials

Can we Agree on Keys?

I had forgotten how much polarization there is regarding the use of System Assigned or Natural Keys. It seems to be an issue were we tend to have VERY strong opinions, mostly based on bad experiences we don’t want to re-visit. If we had a bad experience using Natural keys we lean toward system assigned keys. Those with issues caused by system assigned keys have the opposite opinion.

Just a note, I’m impressed with the comments made up to this point. You all have shown great strength of opinion without degrading the thoughts of others. I’m impressed.

Let’s see if I can bring some areas where we can all agree. I have a system, where I can’t reveal the actual schema, so I’ll make up something where we can all be on the same page.

CREATE TABLE Cluster (

ClusterID INT NOT NULL IDENTITY(1,1)

,CONSTRAINT PK_Cluster PRIMARY KEY CLUSTERED (ClusterID)

)


CREATE TABLE ClusterGroup (

ClusterGroupId INT NOT NULL IDENTITY(1,1)

,ClusterID INT NOT NULL

,ClusterGroupNumber INT NOT NULL

,MaxItemsAllowed INT NOT NULL

,CONSTRAINT PK_ClusterGroup PRIMARY KEY CLUSTERED (ClusterGroupId)

,CONSTRAINT FK_ClusterGroup_Cluster

FOREIGN KEY (ClusterId)

REFERENCES Cluster (ClusterId)

)

CREATE TABLE ClusterGroupItem (

ClusterGroupItemId INT NOT NULL IDENTITY(1,1)

,ClusterGroupId INT NOT NULL

,ItemNumber INT NOT NULL

,CONSTRAINT PK_ClusterGroupItem PRIMARY KEY CLUSTERED (ClusterGroupItemId)

,CONSTRAINT FK_ClusterGroupItem_ClusterGroup

FOREIGN KEY (ClusterGroupId)

REFERENCES ClusterGroup (ClusterGroupId)

)

We now have a relational hierarchy of three tables. Clusters contain Cluster Groups. Cluster Groups contain Cluster Group Items.

My Microsoft friends will be happy because I used a sequential clustered primary key. But there are two problems that must be resolved. A cluster may have as many Cluster Groups as needed. However, within a cluster, the may only be a single record with the same ClusterGroupNumber. A Cluster Group may have many Cluster Group Items. However, within a Cluster Group, each ItemNumber in the Cluster Group Items must be unique.

To implement both of those rules I create unique indexes on those tables.

CREATE UNIQUE NON CLUSTERED INDEX AK_ClusterGroup

ON ClusterGroup (ClusterId, ClusterGroupNumber)

CREATE UNIQUE NON CLUSTERED INDEX AK_ClusterGroupItem

ON ClusterGroupItem (ClusterGroupId, ItemNumber)

I think given the little information I may provide here, most individuals would agree that this solves all of the business rules. Without the unique indexes you can have invalid data. The use of Identity columns fulfill referential integrity.

For argument, there is nothing preventing me from inverting this design. I may still create the ID columns in each table. However, I can have the unique key (the one I enforced with a unique index) be the primary key instead. Again, I don’t think anyone would disagree with me that this is also an acceptable design.

What I find is that many would disagree that this modification will perform well. In my experience, using the natural key out-performs the sequential version if your table is configured for natural keys. Tomorrow I’ll present my reasoning for why things that have resulted in a positive experience performance wise using natural keys. I’m pretty sure it won’t convince many of you. But, it may cause you to consider testing for yourself. I will give you one clue today; using natural keys for this design may perform differently based on your database usage patterns. If you have more writes than reads, your experience may vary. By the way, this even works with Entity Framework 6.

Cheers,

Ben