Editorials

Deleting Data

Many systems don’t like to physically delete data. I find this practice has one or more of the following reasons:

  • If I mark a record as deleted with a status flag, I can bring it back easily
  • Referential integrity requires me to delete data in relationship hierarchy order, or implement cascading deletes
  • There may be a need to keep the deleted data for auditing purposes in the future. By marking a record as deleted, I can know what the contents are of the deleted data.

There are a number of other reasons for physically deleting data instead of using a deleted status.

  • It takes up disk space
  • It can slow down query performance if there is a lot of obsolete data
  • It makes queries more complicated because you have to provide a filter for a deleted status
  • It can confuse people when they first access the data
  • Bad queries may be written by those not knowing a filter for deleted is required

Either implementation has its pros and cons. In my perfect world, LOL, I prefer to audit outside of the table containing the data. Then I have an audit trail of the data before and after deletion. I know who deleted the data and when they performed the action. Additionally, there is no further work necessary to maintain and exclude deleted data. It is physically deleted.

I don’t find anything really wrong with any of these implementations. They all have their benefits. Do you have a preference? If so, would you share your thoughts and concerns with us by leaving a comment, or dropping an email to btaylor@sswug.org.

Cheers,

Ben