But When Did We Capture That Changed Data?

Where CDC Can Fall Short


There are two common questions people ask about changed data that CDC doesn’t support very naturally:

  • When did it change?
  • Who changed it?

While it is possible to join to cdc.lsn_time_mapping to get that information, you may be dealing with a box product that doesn’t support that functionality.

Or something.

Hypothetically.

They’re Just Tables


The kinda funny thing about the tables that all of your changed data ends up in is that… they’re regular tables.

They’re not in the sys schema, they’re in the cdc schema. You can alter them in all sorts of ways. You can drop and truncate them if you want. I mean, not in that order, but you get my point. There’s no special protection for them.

That means you can add a column like this to them to track when the rows ended up in there. This also saves you from altering production tables to account for when things change.

ALTER TABLE cdc.dbo_Posts_CT 
    ADD ChangeTime datetime DEFAULT SYSDATETIME();

Oversight


I do think it’s a pretty big oversight to not have a column like this in the table already, but maybe people use CDC in far different ways than I normally see it used.

Or maybe not, and everyone has to come up with some workaround like this to deal with it. It could be that CDC data ends up like a lot of other data, and people get really excited about having it, but never actually look at it.

Thanks for reading!



2 thoughts on “But When Did We Capture That Changed Data?

  1. In our case we add identity columns, as it turns out rows can get inserted out of order by LSNs & sequence numbers, and we needed to read them incrementally without missing anything. Still, it makes me feel uneasy altering sorta-internal stuff. I can only hope it doesn’t cause issues and be supported down the road.

Leave a Reply

Your email address will not be published. Required fields are marked *