Jul 26, 2007

Wish List: Obsolete Rows in SQL 2008

This came up during a discussion about meta data for table rows on the Readify internal mailing list.

Wouldn't it be nice to have a way to logically delete rows in a table, i.e. make it so that they were available for looking up information like descriptions and so forth but in general did not come as results in normal select statements.  It would be even better if obsolete rows could be excluded when trying to establish new foreign key relationships.

Here's the feedback that has been placed on the SQL 2008 CTP site:

SQL Server could have:

OBSOLETE FROM sometable WHERE someclause

and those rows would no longer be returned by

SELECT columns FROM sometable

unless you also added:

SELECT columns FROM sometable WITH OBSOLETE

Rows could be reinstated by:

REINSTATE FROM sometable WHERE someclause

Foreign key relationships could then be defined as not permitting new references to obsolete rows. No doubt performance gains could also be obtained where the engine knows which rows won't be updated. Part of this could be done by partitioning today but it's not the full story.  

 

What do you think?

P.S. If you are on the SQL 2008 CTP program then go to the Microsoft Connect site and vote for the Obsolete/Reinstate feature via this link.

2 comments:

  1. Sounds like a business level not data level constraint.

    Consider another evolution would be to store a period of activity bounded by two timestamps in the data layer (per record).

    And then the combinations of time bounded activity and obsolescence and so on ....

    What are the arguments for not building it into a data access layer ?

    ReplyDelete
  2. Doing the filtering in a data access layer means that SQL processes all rows, every time, and means that your DAL has to go through the returned dataset and remove the rows you don't want or it has to know about an extra column in your table that it needs to filter with. It also has to update that column every time you want to do anything with logically deleted rows.

    Further, if you use a reporting tool you have to ensure that all your reports are aware of the way the table works and that they filter data approriately as well.

    I don't believe it's purely a business level constraint to mark rows as obsolete. To me it's more of a data level constraint - do I still want this row for regular use, do I want to delete it forever or do I want to ignore it in most situations and just keep it around for the odd occassions when I need to go back and reference it.

    In the database it could be implemented as an hidden Obsolete table that only gets referenced when required. It keeps the main tables and indexes smaller with flow on benefits to performance, etc.

    Obviously this can be implemented at either level, but to me it makes more sense to have it in the database.

    P.S. I also like the fact that you then don't need an "obsolete" column on your tables. It keeps the table definition cleaner.

    ReplyDelete