Breaking Columnstore Delta Rowgroups

The documentation on delta rowgroups says:

A delta rowgroup is a clustered index that’s used only with columnstore indexes. It improves columnstore compression and performance by storing rows until the number of rows reaches a threshold and are then moved into the columnstore.

This clustered index isn’t chosen by the person who creates the table. It’s a hidden internal column automatically added to delta rowgroups. Perhaps the implementation is similar to that of table spools:

The worktable is structured as if it was defined with a clustered index on zero columns. This means that a 4-byte uniqueifier is added to each row stored except the first.

You know how programmers are: they love reusing old code. If a similar implementation is used for delta rowgroups then it should be possible to see SQL Server errors with the right pattern of data loading. More investigation is required.

Defining the rules

It might be possible to find evidence of this clustered index by using DBCC PAGE. In the T-SQL code below, I create a table clustered columnstore index, insert 500 rows, delete 499 rows, insert 500 more rows, and delete 499 rows again:

There’s only a single page that stores the 2 rows currently held by the table. This can be viewed with the undocumented DBCC PAGE, TF 3604, and the undocumented sys.dm_db_database_page_allocations:

The information that we’re looking for is near the bottom of the result set:

A few things are worth calling out. The 500th row that was inserted into the table has a value of 499 for the “CSILOCATOR” field. This value appears to be stored in little-endian format in the memory dump for Slot 0 Offset 0x60. You can decode the raw value to 499 in T-SQL if desired:

The 1000th row that was inserted into the table into the table has a value of 999 for the CSILOCATOR field. Most importantly, this CSILOCATOR field has a length of four bytes. A typical four byte int in SQL Server has a maximum value of 2147483647. If it’s possible to load billions of rows into a single delta rowgroup then we may run out of values for the CSILOCATOR field.

Of course, a single delta rowgroup cannot hold more than 1048576 rows. As you can see in this example, SQL Server does not always reuse freed up values for the CSILOCATOR field. The table currently has two rows, yet the field has advanced to 999. The right pattern of deletes, inserts, and updates should allow the CSILOCATOR to continue to grow without running out of room for rows in the delta rowgroup.

As a final note, I cannot prove that the CSILOCATOR field corresponds to the clustered index, but it certainly seems to serve the function of a uniqueifier that would be needed for said clustered index.

Running up the score

I need to find a relatively efficient way to advance the CSILOCATOR because I need to do it over 2 billion times, if my theory is correct about the maximum allowed value. Both updating all of the rows in a delta rowgroup and deleting and reinserting advance the CSILOCATOR. I expected that small batch sizes would work best, and they did. For my table’s schema, the sweet spot for updates is about 275 rows and the sweet spot for delete/inserts is about 550 rows. Delete/inserts appeared to be faster than updates for the purpose of constantly reloading the same rows over and over.

Strategies that use multiple CPU cores are possible, but I wanted to do other work on this PC and didn’t want to listen to a loud fan all day. Here’s what the final testing code looked like:

If you’re wondering about the temporary stored procedure creation, it’s a habit that I’ve gotten into whenever I write a T-SQL while loop. While not applicable here, very fast loops can incur ASYNC_NETWORK_IO overhead due to the passing of DONE tokens to the client.

Winning the game

After about seven hours the code reaches its glorious end:

Msg 666, Level 16, State 2, Procedure #p, Line 11 [Batch Start Line 26]
The maximum system-generated unique value for a duplicate group was exceeded for index with partition ID 72057596406595584. Dropping and re-creating the index may resolve this; otherwise, use another clustering key.

I loaded a few more rows into the table until I was unable to insert even one row. Using DBCC PAGE and friends again, we can see that the CSILOCATOR has a very large value of 2147483646.

Issuing a REBUILD does resolve the issue because it wipes away our sins, as REBUILDs often do. Naturally using another clustering key is not an option.

Joe: 1

Microsoft: 0

The cost of playing

My desktop computer has an intel core i5-4670 processor. Intel claims a TDP of 84 watts. Using CPUID HWMonitor, it looks like my CPU uses about 15 W of additional power when running the workload. It’s a single core workload, so I feel that 15 W is reasonable. 15 watts is 0.015 kW, and when used over 7 hours it translates to 0.105 kWh. My last electric bill has a rate of $0.11663 per kWh, so the total cost of this test comes out to about 1.2 cents. I will be sending Erik an invoice.

Final thoughts

It is possible to hit error 666 when maliciously loading data into a columnstore index. It requires loading over 2 billion rows into the same delta rowgroup, so it’s difficult to think of a production scenario that would lead to this outcome. For those of you with eccentric workloads that give cause for concern, you can roughly check if you are running of IDs by running a query similar to the following for each columnstore table:

Thanks for reading!

1 thought on “Breaking Columnstore Delta Rowgroups”

Leave a Reply

Your email address will not be published. Required fields are marked *