Jul 19, 2007

What's New In SQL 2008 (Katmai)

I've been looking at some of the new features in the SQL 2008 (Katmai) June CTP today. Here's some of the interesting things I has a look at

1. Multiple Value Inserts



In one statement, it's now possible to insert multiple records, and not by batching changes.

As an example:

insert into factbuyinghabits values (707, 11794, getdate()), (708, 11795, getdate())

will insert 2 records into the factbuyinghabits table (and tell you that 2 records were added).

What if you are inserting data into a table with an identity column? What is the value of @@identity for this statement?

insert into table_2 ([value]) values ('val1'), ('val2')
select @@identity

No surprises - it's the value of the last record inserted.

2 Change Data Capture (Logging)



This is a really nice new feature, but it needs some work. What you can do is track changes at the column level for tables in a database. There's a few steps to follow to set it up:

The database itself needs to be configured by running sys.sp_cdc_enable_db_change_data_capture. This creates a number of tracking tables in the database, adds a role (cdc_admin) and turns on the is_cdc_enabled flag in sys.tables

Tables then need to be specifically marked for tracking using the sys.sp_cdc_enable_table_change_data_capture stored procedure. For example the following starts tracking on all columns for the dbo.FactBuyingHabits table

EXECUTE sys.sp_cdc_enable_table_change_data_capture
@source_schema = N'dbo'
, @source_name = N'FactBuyingHabits'
, @role_name = N'cdc_Admin';

This creates a tracking table specifically for the FactBuyingHabits table (dbo_factbuying_habits_CT) and turns on the is_tracked_by_cdc flag in sys.tables.

So now we're tacking data, how do we see it. We'll, it's a little involved but I'm sure this will improve (or you could convert it to a stored proc). The following SQL returns the changes in the table.

DECLARE @from_lsn binary(10), @to_lsn binary(10);
SELECT @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', dateadd(d,-1,getdate()));
SELECT @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', getdate());
DECLARE @customeridCol int;
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_factbuyinghabits(@from_lsn, @to_lsn, 'all update old');

The 'all update old' ensures that not only do all changes get returned, but that updates also show the previous value of the column or row being changed.

One thing that's missing though is determining who made the change. Currently you only get the data, not the login or other tracking information that would really make this handy for auditing DB changes.

Also, in order for this to work, the SQL Server Agent process MUST be running. Failure to do so will queue changes for the agent to process, but won't actually cause any of the data tracking tables to be updated.

3. The Merge Statement



This is COOL! In the scenario where you have a record you want to put in the database but you aren't sure wether the record exists or not, you usually have to do a read of the database (1 round trip) and then either execute an insert or an update statement as appropriate (another round trip).

With the merge statement, you can now do it all in one go. This statement

merge factbuyinghabits fbh
using (select 1175 as customer, 707 as product) as src
on fbh.customerid=src.customer and fbh.productid=src.product
when matched then
update set fbh.lastpurchasedate=getdate()
when not matched then
insert values (707, 1175, getdate());

ensures that if a record for the product and customerid exists then the last purchase date is updated, otherwise a new record is created.

Nice ;-)

5 comments:

  1. About the multiple inserts in one statement: are there any benefits of this feature over the previous individual insert statements in a batch-mode?

    ReplyDelete
  2. As far as I understand it at this stage, the multiple inserts happen as a single statement on the server and the inserts all get completed before indexes are updated. A batch of inserts will only be one round trip, but will still require multiple insert and index updates on the server.

    I'll try and set up a bit of test to see what the performance difference is.

    ReplyDelete
  3. Any success with the perf difference test?

    ReplyDelete
  4. Not yet. I got sidetracked on some other things for a while (tech.ed australia, etc). Thanks for the reminder.

    ReplyDelete
  5. The performance information is now available at http://richardsbraindump.blogspot.com/2007/08/performance-differences-with-new-sql.html. Note that this is CTP performance and likely to improve before RTM.

    ReplyDelete