Editorials

Partition By Time

As the size of your database continues to grow, the physical implementation of your disk storage becomes more important. It affects both real-time performance when using the database and your disaster recovery options. One of the key ways of handling large volumes of data, especially in a single table, is the use of partitioned tables. We are going to look at a rolling partition scenario on a partitioned table as a way of demonstrating one way to use multiple file groups to optimize your database performance.

First, let me remind everyone what a partitioned table is. A partitioned table is a logical table made up of multiple physical tables (partitions) having the same schema. The multiple partitions are made to appear to users as a single table through the use of a partition function and partition scheme. When the partitioned table is accessed, the partition function routes the query to the physical partition where the data is stored. In the old days we did this with a partitioned view, but had to manually implement each table’s introduction to and removal from the view. Today, using a partitioned table, you simply add and remove partitions from the group, and the partition function manages all the rest of the plumbing.
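As a concrete illustration, here is a minimal sketch of a monthly partitioned table in SQL Server. Every name in it (pfOrdersByMonth, psOrdersByMonth, dbo.Orders, the fgOrders* file groups) is hypothetical, and the file groups are assumed to already exist.

-- Partition function: monthly boundaries, RANGE RIGHT means each boundary
-- date is the first date that belongs to the partition on its right.
CREATE PARTITION FUNCTION pfOrdersByMonth (datetime2)
AS RANGE RIGHT FOR VALUES ('2023-01-01', '2023-02-01', '2023-03-01');

-- Partition scheme: maps each partition to a file group.
-- Three boundary values produce four partitions, so four file groups.
CREATE PARTITION SCHEME psOrdersByMonth
AS PARTITION pfOrdersByMonth
TO (fgOrders2022, fgOrders202301, fgOrders202302, fgOrders202303);

-- The table is created on the scheme, keyed by the partitioning column.
CREATE TABLE dbo.Orders
(
    OrderID   bigint    NOT NULL,
    OrderDate datetime2 NOT NULL,
    Amount    money     NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderDate, OrderID)
)
ON psOrdersByMonth (OrderDate);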

A rolling partition is common in a database where data is created during a specific timeframe and later dropped when it is no longer of interest. Some people have a rolling window of one month: each day a new partition is added to the table, and the oldest partition is removed. Sometimes the rolling window works in partitions of a week, a month, a year, and so on. The point is that on some frequency you are creating a new partition and removing an expired partition.
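The expired end of the roll is typically handled by switching the oldest partition out and then removing its boundary. A sketch against the hypothetical Orders table above, where dbo.Orders_Expired is an assumed empty staging table with the same schema, sitting on the same file group as the oldest partition:

-- Move the oldest partition out with a metadata-only switch.
ALTER TABLE dbo.Orders SWITCH PARTITION 1 TO dbo.Orders_Expired;

-- Remove the now-empty partition by merging away its boundary value.
ALTER PARTITION FUNCTION pfOrdersByMonth() MERGE RANGE ('2023-01-01');

-- The staging table can then be archived or dropped on its own schedule.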

This is where file groups can have a big value. It starts with a partition function where new data is always added to the newest partition of our partitioned table. So, to start a new partition, a new file group is created. The new partition is created on the new file group and added to the partitioned table, and the partition function will now route all new data to it.
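In SQL Server terms, that step looks roughly like the following. The Sales database name, the file path, and the object names all continue the assumptions from the earlier sketch.

-- Create the new file group and give it a data file.
ALTER DATABASE Sales ADD FILEGROUP fgOrders202304;

ALTER DATABASE Sales
ADD FILE (NAME = N'Orders202304', FILENAME = N'D:\Data\Orders202304.ndf')
TO FILEGROUP fgOrders202304;

-- Point the partition scheme at the new file group for the next split,
-- then add the boundary so all new rows land in the new partition.
ALTER PARTITION SCHEME psOrdersByMonth NEXT USED fgOrders202304;
ALTER PARTITION FUNCTION pfOrdersByMonth() SPLIT RANGE ('2023-04-01');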

Here is the value this creates. Once a new file group is created and the new partition is added to that file group and the partitioned table, everything preceding the new partition can be treated as READ ONLY. That means you can back up the file group for the preceding partition, because it is complete. You can also defragment the data in the preceding partition one last time, and it won’t need to be touched again. You may even be able to shrink the file size of the preceding partition, although there are some indications that reclaiming file space after defragmenting (rebuilding the indexes) can cause the indexes to fragment again.
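One possible sequence for closing out the partition that just filled, continuing the same hypothetical names. Marking a file group read-only requires exclusive access to the database, and the partition number below is illustrative; in practice you would look it up with the $PARTITION function or in sys.partitions.

-- Rebuild the index on just the completed partition to defragment it one last time.
ALTER INDEX PK_Orders ON dbo.Orders REBUILD PARTITION = 4;

-- Mark the completed partition's file group read-only.
ALTER DATABASE Sales MODIFY FILEGROUP fgOrders202303 READ_ONLY;

-- Take a one-time file group backup; this data will not change again.
BACKUP DATABASE Sales
FILEGROUP = N'fgOrders202303'
TO DISK = N'E:\Backups\Sales_fgOrders202303.bak';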

You now have every partition defragmented, using the least amount of disk space, and backed up for good, with the exception of the current operating partition. Using this scenario, you do not have to perform a complete backup of your partitioned tables every night, because the majority of each partitioned table was already backed up after the last time its data was added or updated.
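SQL Server supports this pattern directly through partial backups, which skip read-only file groups. A routine nightly backup then only has to cover the read-write portion of the database (names again assumed from the sketch above):

-- A partial backup: only the primary file group and read-write file groups
-- are included; the read-only, already-backed-up partitions are skipped.
BACKUP DATABASE Sales
READ_WRITE_FILEGROUPS
TO DISK = N'E:\Backups\Sales_ReadWrite.bak';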

This implementation works even if you don’t perform a rolling partition. If you simply add a new partition every day, and your database continues to grow until you finally give up, this method of backing up the previous partition after a new partition is added greatly reduces the time required for your backup processes. As your table grows into massive record counts, this is essential for performance and disaster recovery planning.

Cheers,

Ben