Troubleshooting Security Cache Issues: USERSTORE_TOKENPERM And TokenAndPermUserStore

TokenAndPermUserStore


I’ve been seeing this issue on newer versions of SQL Server, from 2016 to 2019, particularly where some form of Auditing is running. What happens next is the plan cache totally clearing out, queries taking a long time to compile, or both.

I’m still digging in more on the “why” here, but I have a decent amount of confidence in what I’ve found so far. This has occurred on several servers at different client sites, and doesn’t appear to be isolated.

Anyway, using this query to hit the RING_BUFFER_SECURITY_CACHE was helpful in figuring out when things were happening.

You can see a limited historical view of cache sizes and the number of entries in those caches.

SELECT 
    event_time = 
        DATEADD
        (
            MILLISECOND, 
            x.xml_record.value('(//Record/@time)[1]', 'BIGINT') - osi.ms_ticks, 
            GETDATE()
        ),
    tokenandperm_userstore_gb = 
        x.xml_record.value('(//Record/TokenAndPermUserStore/@pages_kb)[1]', 'BIGINT') / 1024. / 1024.,
    acr_cachestores_gb = 
        x.xml_record.value('(//Record/ACRCacheStores/@pages_kb)[1]', 'BIGINT') / 1024. / 1024.,	
    xml_record
FROM 
(
    SELECT TOP (5000)
    	timestamp,
        xml_record = 
    	    CONVERT
            (
                xml, 
                record
            )
    FROM sys.dm_os_ring_buffers AS dorb
    WHERE dorb.ring_buffer_type = 'RING_BUFFER_SECURITY_CACHE'
    ORDER BY timestamp DESC
) AS x
CROSS JOIN sys.dm_os_sys_info osi
ORDER BY x.timestamp DESC
OPTION(RECOMPILE);

Memory Clerks


Likewise, this much shorter query can be used to see how large your security cache currently is.

The ring buffers are written to at intervals, and may not be up to date.

SELECT 
    TokenAndPermUserStore = 
        CONVERT(DECIMAL(38, 2), (pages_kb / 1024. / 1024.))
FROM sys.dm_os_memory_clerks
WHERE type = 'USERSTORE_TOKENPERM'
AND   name = 'TokenAndPermUserStore';

This is a more current view of things, and can be used to trend cache sizes. It will only work on SQL Server 2012 and up.

If you’re on a version of SQL Server prior to that, that’s your problem.
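If you really must check one of those older versions, the clerk data is still there, just with the page counts split across two columns. This is a sketch; I don’t have an instance that old handy to swear by it:

SELECT 
    TokenAndPermUserStore = 
        CONVERT(DECIMAL(38, 2), ((single_pages_kb + multi_pages_kb) / 1024. / 1024.))
FROM sys.dm_os_memory_clerks
WHERE type = 'USERSTORE_TOKENPERM'
AND   name = 'TokenAndPermUserStore';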

“Tipping Points”


While I don’t have any precise guidance on when the size or entries will cause an issue, it seems that the cache needs to hit several GB or more.

If it regularly does that and you experience performance issues when it does, then you might try running this to clear out the security cache:

DBCC FREESYSTEMCACHE('TokenAndPermUserStore');

I’ve tried using Trace Flags 4610 and 4618, and found 4610 made things worse, and 4618 made things exactly the same.

YMMV.

Because I’ve found a scheduled job to be a more reliable fix, I’ve published one on GitHub. You can stick it in an agent job. You can run it on whatever schedule you want, at whatever cache size threshold you want, and it will log its activity to a table.
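If you’d rather roll your own first, here’s a minimal sketch of the idea, not the published job. The dbo.TokenPermLog table and the 2GB threshold are stand-ins you’d pick yourself.

IF OBJECT_ID('dbo.TokenPermLog') IS NULL
BEGIN
    CREATE TABLE dbo.TokenPermLog
    (
        id BIGINT IDENTITY PRIMARY KEY,
        cache_size_gb DECIMAL(38, 2) NOT NULL,
        log_date DATETIME NOT NULL DEFAULT GETDATE()
    );
END;

DECLARE @cache_size_gb DECIMAL(38, 2);

SELECT 
    @cache_size_gb = 
        CONVERT(DECIMAL(38, 2), (domc.pages_kb / 1024. / 1024.))
FROM sys.dm_os_memory_clerks AS domc
WHERE domc.type = 'USERSTORE_TOKENPERM'
AND   domc.name = 'TokenAndPermUserStore';

/* 2GB is a placeholder; pick the threshold that matches your tipping point */
IF @cache_size_gb >= 2.0
BEGIN
    INSERT dbo.TokenPermLog (cache_size_gb)
    VALUES (@cache_size_gb);

    DBCC FREESYSTEMCACHE('TokenAndPermUserStore');
END;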

Reproductive


My friend Josh (b|t) hooked me up with a way to repro a growing security cache, if you want to test things out.

IF NOT EXISTS
(
    SELECT
        1/0
    FROM sys.database_principals AS dp
    WHERE dp.type = 'A'
    AND   dp.name = 'your_terrible_app'
)
BEGIN
    CREATE APPLICATION ROLE your_terrible_app
        WITH DEFAULT_SCHEMA = [dbo], 
        PASSWORD = N'y0ur_t3rr1bl3_4pp';
END

DECLARE 
    @counter INT = 0;

DECLARE 
    @holder table
    ( 
        id INT PRIMARY KEY IDENTITY,
        cache_size DECIMAL(10,2),
        run_date DATETIME 
     );

SET NOCOUNT ON;

WHILE @counter <= 20000
BEGIN
        
        DECLARE @cronut VARBINARY(8000);
        DECLARE @bl0b_eater SQL_VARIANT;

        EXEC sys.sp_setapprole 
            @rolename = 'your_terrible_app',
            @password = 'y0ur_t3rr1bl3_4pp',
            @fCreateCookie = true,
            @cookie = @cronut OUTPUT;

        SELECT 
            @bl0b_eater = USER_NAME();

        EXEC sys.sp_unsetapprole 
            @cronut;

        SELECT 
            @bl0b_eater = USER_NAME();

        IF @counter % 1000 = 0
        BEGIN
            INSERT 
                @holder(cache_size, run_date)
            SELECT 
                cache_size = 
                    CONVERT
                    (
                        decimal(38, 2), 
                        (domc.pages_kb / 1024. / 1024.)
                    ),
                run_date = 
                    GETDATE()
            FROM sys.dm_os_memory_clerks AS domc
            WHERE domc.type = 'USERSTORE_TOKENPERM'
            AND   domc.name = 'TokenAndPermUserStore';
            RAISERROR('%i', 0, 1, @counter) WITH NOWAIT;
        END;

        SELECT 
            @counter += 1;
END;

DBCC FREESYSTEMCACHE('TokenAndPermUserStore');

SELECT 
    h.* 
FROM @holder AS h 
ORDER BY h.id;

DBCC TRACESTATUS;

DBCC TRACEON(4610, -1);
DBCC TRACEOFF(4610, -1);

DBCC TRACEON(4618, -1);
DBCC TRACEOFF(4618, -1);

DBCC TRACEON(4610, 4618, -1);
DBCC TRACEOFF(4610, 4618, -1);

Thanks for reading!

Common Query Plan Patterns For Joins: Index Union

I’m On My Smoke Break


Index union is a little bit different from index intersection. Rather than joining two indexes together, their result sets are concatenated together.

Just like you’d see if you wrote a query with union or union all. Crazy, huh?

As with index intersection, the optimizer has a choice between concatenation and merge join concatenation, and lookups back to the clustered index are possible.

Here are the indexes we’ll be working with:

CREATE INDEX po ON dbo.Posts
    (OwnerUserId);

CREATE INDEX ps ON dbo.Posts
    (Score);

Let’s jump right in!

Big Run


The main difference between when you’re likely to get index union vs index intersection is whether your predicates use AND or OR. Index intersection typically comes from AND predicates, and index union typically comes from OR predicates. So let’s write some OR predicates, shall we?

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 0
OR    p.Score > 10;

First, using one equality and one inequality predicate:

concaturday

Since we lose the ordering on the inequality predicate, this is our plan. Note the hash match aggregate after the concatenation, too.

No arrow for that, though.
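If it helps to see what the concatenation is doing, here’s a rough hand-written stand-in, not what the optimizer literally produces: each predicate gets its own index, and duplicates get removed on the clustered key.

SELECT
    COUNT_BIG(*) AS records
FROM 
(
    SELECT p.Id FROM dbo.Posts AS p WHERE p.OwnerUserId = 0
    UNION /* not UNION ALL: rows matching both predicates only count once */
    SELECT p.Id FROM dbo.Posts AS p WHERE p.Score > 10
) AS x;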

More Wedding


If we use two equality predicates, we’ll get more orderly operations, like merge join concatenation and a stream aggregate.

delancey

What a nice plan.

Wafting


If we add in a column that’s not in either index, we’ll see a lookup plan similar to yesterday’s query.

SELECT
    p.PostTypeId,
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
OR    p.Score > 10
GROUP BY p.PostTypeId;

Woah baby look at all that sweet parallelism.

boing boing boing

Big Time


Both index intersection and index union are underappreciated query plan shapes, and represent some pretty cool optimizer strategies to make your queries not stink.

Of course, there are cases where it might make more sense to have one index that covers all necessary key columns, but it’s not always possible to have every index we’d like.

Tomorrow we’ll look at lookups, and some of the finer details of them.

Thanks for reading!

Common Query Plan Patterns For Joins: Index Intersection

Incomplete


The optimizer has a few different strategies for using multiple indexes on the same table:

  • Lookups (Key or Bookmark)
  • Index Intersection (Join)
  • Index Union (Concatenation)

Lookups are between a nonclustered index and either the clustered index or heap, and index intersection and union are between two nonclustered indexes.

Some plans may even have both! How exciting. How very exciting.

Today we’re going to look at index intersection.

Intervals


First, we need a couple useful-ish indexes.

CREATE INDEX pc ON dbo.Posts
    (CreationDate);

CREATE INDEX pl ON dbo.Posts
    (LastActivityDate);

Next, we’ll need a query with AND predicates on both of those columns:

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.CreationDate >= '20130101'
AND   p.LastActivityDate <= '20180101';

The query plan looks like this:

harsh join

Notice! Both of our nonclustered indexes get used, and they’re joined together without a lookup.

Because the predicates are of the inequality variety, the join columns are not ordered after each seek. That makes a hash join the most likely join type.

Other Joins


You are unlikely to see a merge join here, because it would require sorting both inputs. You can force one to see for yourself:

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.CreationDate >= '20130101'
AND   p.LastActivityDate <= '20180101'
OPTION(MERGE JOIN);
murders and executions

If you attempt to hint a loop join without hinting to use the nonclustered indexes, you will just get one scan of the clustered index.

If you attempt a loop join hint with both indexes hinted, you will get a query processor error.

Serves you right.
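If you want to see that error for yourself, this combination should reproduce it. Expect something like error 8622, complaining that the query processor couldn’t produce a plan because of the hints.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p WITH(INDEX (pc, pl))
WHERE p.CreationDate >= '20130101'
AND   p.LastActivityDate <= '20180101'
OPTION(LOOP JOIN);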

Equality Predicates


If you have equality predicates, you may see a merge join naturally.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p WITH(INDEX (pc, pl))
WHERE p.CreationDate = '2010-08-05 10:04:10.137'
AND   p.LastActivityDate = '2008-12-25 02:01:25.533';

Look at that sweet double index hint, eh?

no arrows here

Since equality predicates preserve ordering, we don’t need to sort anything here.

You’re welcome.

Plus Lookups


If we request a column that is not present in either index, we’ll get a query plan element that’s likely more familiar.

SELECT
    p.PostTypeId,
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.CreationDate >= '20131201'
AND   p.LastActivityDate <= '20090101'
GROUP BY p.PostTypeId;

Hooray.

arrows are back

We’re able to use a lookup to retrieve the PostTypeId column from the clustered index. Pretty neat.

For more information on this type of plan, check out Paul White’s post here.

Thanks for reading!

Common Query Plan Patterns For Joins: No Equality Predicate

Data Dribbling


The longer you work with data, the more weird stuff you see. I remember the first time I saw a join on a LIKE: I immediately felt revolted.

But not every query has an available equality join predicate. For various reasons, like poor normalization choices, or just the requirements of the query, you may run into things like I’m about to show you.

If there’s a headline point to this post, it’s that joins that don’t have an equality predicate in them only have one join choice: Nested Loops. You cannot have a Hash or Merge join with this type of query. Even with the best possible indexes, some of these query patterns just never seem to pan out performance-wise.

There Are Pains


First up, some patterns are less awful than others, and can do just fine with a useful index. Sounds like most other queries, right?

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.LastActivityDate BETWEEN u.CreationDate 
                              AND u.LastAccessDate
WHERE u.Id = 22656;

Say we want to count all the posts that were active during Jon Skeet’s active period. This is a simplified version of some interval-type queries.

Even without an index, this does okay because we’re only passing a single row through the Nested Loops Join.

fussy
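If you want to prove the Nested Loops point to yourself, try forcing another physical join on this query. With no equality predicate, a hash or merge hint can’t be honored, and you should get error 8622 back instead of a plan.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.LastActivityDate BETWEEN u.CreationDate 
                              AND u.LastAccessDate
WHERE u.Id = 22656
OPTION(HASH JOIN); /* OPTION(MERGE JOIN) fails the same way */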

If our query needs are different, and we threaten more rows going across the Nested Loops Join, things deteriorate quickly.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.LastActivityDate BETWEEN u.CreationDate 
                              AND u.LastAccessDate
WHERE u.Id BETWEEN 22656 AND 22657;

The funny thing here is that there is still only one row actually going through. We’re just threatening additional rows.

SELECT
    u.*
FROM dbo.Users AS u
WHERE u.Id BETWEEN 22656 AND 22666;

How about those identity values, kids?

specify

Here’s what happens, though:

suspect device

There Are Indexes


Indexing only LastActivityDate “fixes” the multi-row query, but…

CREATE INDEX p ON dbo.Posts(LastActivityDate);
regressed

Parallelism being deemed unnecessary slows access to the Posts table down, which is something we’ve seen before in other entries in this series. That costs us roughly 500 milliseconds in the first query, but saves us about 30 seconds in the second query. I’d probably take that trade.

And Then There Are Pains


Where things get more painful is with correlations like this.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON u.DisplayName LIKE '%' + p.LastEditorDisplayName + '%'
WHERE  p.PostTypeId = 1;

This query runs basically forever. I gave up on it after several minutes, so you’re only getting the estimated plan.

ehhh

The query plan asks for an index on the Users table! Surely this will solve all our problems.

CREATE INDEX ud ON dbo.Users(DisplayName);

This query, likewise, doesn’t seem too keen on finishing. But we get a new index request along with it.

helipad

This never finishes either. By “never” I mean “I’d like to write about more things so I’m not sitting around waiting for this.”

Sorry.

And Then There Are More Indexes


The request was for an index on PostTypeId that includes LastEditorDisplayName. I’m going to flex a little on SQL Server and create this index instead:

CREATE INDEX pl ON dbo.Posts(LastEditorDisplayName)
WHERE PostTypeId = 1;

Unfortunately, all this does is change which index is used. It doesn’t improve the performance of the query at all.

But this series is about query plan patterns, so let’s talk about some of those.

Amoral


Like I mentioned at the start of the post, the optimizer only has a single join strategy for queries with no equality predicates on the join condition. This can lead to very bad performance in a number of cases.

In cases where a single row is being passed across, and the query plan shows a single seek or scan on the inner side of the Nested Loops Join, performance will likely be… acceptable.

When you see Spools or Constant Scans on the inner side of the Nested Loops Join, that could be cause for concern.

After all, even the LIKE predicate is tolerable with appropriate indexes and a single row coming from the outer side.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON u.DisplayName LIKE '%' + p.LastEditorDisplayName + '%'
WHERE u.Id = 22656
AND   p.PostTypeId = 1;
wild

Thanks for reading!

Common Query Plan Patterns For Joins: DOP and Bitmaps

1/10


Bitmaps can be really useful in parallel hash and merge join plans. They can be used like sargable, but not seekable, predicates.

Where they get created and where they get used are a bit different, though.

triple x
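If you want a query to poke at, a parallel hash join between two big tables is the usual trigger. This one is a sketch, not the query behind the screenshots; the Reputation filter is just there to give the build side something selective to feed the bitmap.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 100000
OPTION(HASH JOIN, MAXDOP 4);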

10/10


When bitmaps do their job, you can tell. Here’s an example of an effective bitmap:

impressionable

At the index scan, we filter out all but around 40k rows from the Users table.

That’s uh… Eh you can find a percentage calculator.

0/10


When they don’t, you can also tell. This bitmap hardly eliminates any rows at all.

down down down

But wait! This query runs at DOP 4. You can tell by looking at the number of executions.

Who runs queries at DOP 4?

Fools.

40/40


At higher DOPs, that useless bitmap becomes much more effective.

OBSERVE IF YOU WILL

At DOP 8, we filter out about 600k rows, and at DOP 16 we filter out about 830k rows.

99/1


Like many operators in query plans, Bitmaps aren’t parallel “aware”, meaning there will be one Bitmap per thread.

At times, if you find a Bitmap underperforming at a lower DOP, you may find some benefit from increasing it.
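Testing that is as easy as a MAXDOP hint, assuming your server will let you go higher. Reusing the sketch from above:

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 100000
OPTION(HASH JOIN, MAXDOP 16); /* compare the bitmap's filtering against the DOP 4 run */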

Of course, you may also find general relief from more threads working on other operators as well. Sometimes more CPU really is the answer to queries that process a whole bunch of rows.

Thanks for reading!

Common Query Plan Patterns For Joins: Semi Join Problems

But You Should


This post isn’t meant to dissuade you from using EXISTS or NOT EXISTS when writing queries. In fact, most of the time I think they make a lot of sense.

But weird things can happen along the way, especially if you don’t have supporting indexes, or if supporting indexes aren’t chosen by the optimizer for various reasons.

In this post, I’m going to show you a query plan pattern that can occur in semi-join plans, and what you can do about it.

Row Goalies


This can happen without a TOP in the query; I’m only using one because it gets me the plan shape I care about quickly, and returns results “quickly” enough.

SELECT TOP (1)
    c.*
FROM dbo.Comments AS c
WHERE NOT EXISTS
(
    SELECT
        1/0
    FROM dbo.Votes AS v
    WHERE c.PostId = v.PostId
)
ORDER BY c.CreationDate DESC;

The part of the plan that we care about is on the inner side of the semi join, where the TOP is applied in a funny way.

just party

This is a very good example of where query costing can be quite off. The part of the plan that is responsible for most of the runtime is estimated to cost 0% of the total cost.

Closer


Here’s what’s going on in those two operators!

rock show

The optimizer thinks it can find a matching row in three executions, but it takes 70.

It also takes reading 2,630,801,081 rows to return 49 rows. There are 52,928,700 rows in the table.

Yes, 52,928,700 * 49 gets you just about there. Technically the multiplier is 49.7, but the plan shows rounded numbers. Close enough for the optimizer, so they say.

Index Not Exists


Without an index, a hash join plan would have been a much better option, which we can get by adding an OPTION(HASH JOIN) hint to the query.
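For completeness, that’s the same query with the hint tacked on:

SELECT TOP (1)
    c.*
FROM dbo.Comments AS c
WHERE NOT EXISTS
(
    SELECT
        1/0
    FROM dbo.Votes AS v
    WHERE c.PostId = v.PostId
)
ORDER BY c.CreationDate DESC
OPTION(HASH JOIN);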

freeloader

While ~12 seconds is great, we can do better. I’ve heard about these things called indexes.

How do we indexes?

An Index, You Say


It looks straightforward to create an index that should get us the plan we want, if we create it on the PostId column on the Votes table.

CREATE INDEX v ON dbo.Votes(PostId);

 

Assuming that the plan shape otherwise stays the same, we’ll scan the Comments table, Sort by PostId, and then do a Seek into the Votes table on the inner side of the join.

Right?

wrong

While it is quite a bit faster, it’s not at all what we want. And hey, look, the costs and times here are all screwy again.

Stop looking at costs.

Adding a FORCESEEK hint gets us the plan shape we care about, of course:

SELECT TOP (1)
    c.*
FROM dbo.Comments AS c
WHERE NOT EXISTS
(
    SELECT
        1/0
    FROM dbo.Votes AS v WITH(FORCESEEK)
    WHERE c.PostId = v.PostId
)
ORDER BY c.CreationDate DESC;

Along with a much faster runtime:

premium tears

Gosh Wrong


If you’re keeping score at home, the optimizer chose:

  • A TOP > Scan that took 30 seconds instead of a Hash Join that took 11 seconds
  • A Hash Join that took 5.2 seconds instead of a Top > Seek that took 2.2 seconds

But I bet I know what you’re thinking. We need an index on the Comments table. And I bet you’re thinking it should look something like this:

CREATE INDEX c ON dbo.Comments
    (PostId, CreationDate)
INCLUDE
    (Score, UserId, Text);

But, prepare your biggest internet sigh, that just won’t work out.

and tears

Much like above, if we add a FORCESEEK hint, we get a better/faster plan that looks a lot like the plan above.

Flip That Script


The only index that gets us a respectable execution plan without a bunch of hints is this one:

CREATE INDEX c ON dbo.Comments
    (CreationDate, PostId)
INCLUDE
    (Score, UserId, Text);

Which gets us this execution plan:

what i want

Thanks for reading!

Common Query Plan Patterns For Joins: OR Clauses

Least Favorite


This is one of my least favorite query patterns, because even with appropriate indexes, performance often isn’t very good without additional interventions.

Without indexes in place, or when “indexes aren’t used”, the query plans will often look like one of these.

Maybe not always, but they’re pretty common.

Merge Interval


SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u
JOIN dbo.Posts AS p 
    ON (p.OwnerUserId = u.Id
        OR p.LastEditorUserId = u.Id);
here comes the pain

We start off with a scan of the Posts table, and a Nested Loops Join to a… Nested Loops Join? That’s weird.

work it

From the Merge Interval to the right, there are a lot of additional operators. Those two Constant Scan operators represent virtual tables containing the two columns from Posts being used in the join.

One for OwnerUserId, and one for LastEditorUserId. The Sort and Merge interval are a pseudo-attempt at ordering the concatenated results and removing duplicate ranges. I don’t think I’ve ever seen them be terribly effective.

This plan takes on this shape because the Users table has a seekable index on the Id column.

Loops and Lazy Spools


With no usable indexes on either side, the plan will often take on a shape like this.

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Comments AS c
JOIN dbo.Posts AS p 
    ON (p.OwnerUserId = c.UserId
        OR p.LastEditorUserId = c.UserId);
rebort

This is an admittedly weird plan. Weird because usually when there’s a Lazy Spool on the inner side of a Nested Loops Join, there’s some Sorting of data on the outer side.

The Nested Loops Join here is not Optimized, and does not do an Ordered Prefetch. It’s odd to me that the Hash Match Aggregate on the outer side isn’t a Sort > Stream Aggregate.

This is usually done to maximize the Spool’s efficiency, and cut down on the Spool needing to execute child operators.

For example, let’s say the numbers 1 through 10 came out of the Comments table, and there were 10 of each. If they were in order, the Spool would initially fill with the values from Posts for the value 1, and then the contents of the Spool would get used for the next 9 duplicate values of 1.

The process would repeat for the rest of the numbers, with the Spool truncating itself when it sees a new value.

Let’s Add Indexes


You may have noticed that I’ve only been using estimated plans up until now. That’s because both of these queries run way too slowly for me to collect actual plans.

The optimal indexes we need to make that not the case any more look like this:

CREATE INDEX po
ON dbo.Posts
    (OwnerUserId);

CREATE INDEX pl
ON dbo.Posts
    (LastEditorUserId);

CREATE INDEX cu
ON dbo.Comments
    (UserId);

ORnery


If we don’t change the queries at all, both plans will use the Merge Interval shape. It’s only somewhat beneficial to the Users/Posts query, which now finishes in about 40 seconds. The Comments/Posts query runs for… I dunno. I let it go for three hours and my CPUs were maxed out the whole time and things got unusable.

If you’re going to leave the OR in, you need to use a FORCESEEK hint. More specifically, you need to use the hint on the table that has different columns in the OR clause. Otherwise, the plan shape goes all to crap (Merge Interval).

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Users AS u  
JOIN dbo.Posts AS p WITH(FORCESEEK)
    ON (p.OwnerUserId = u.Id
        OR p.LastEditorUserId = u.Id);
GO 

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Comments AS c
JOIN dbo.Posts AS p WITH(FORCESEEK)
    ON (p.OwnerUserId = c.UserId
        OR p.LastEditorUserId = c.UserId);
GO

With that hint in place, both queries will take on a similar, non-disastrous shape.

oh hey seeks

Both queries will take around 5 seconds.

Leaving The Or Out


The usually-better rewrite is to use UNION ALL to separate the OR out, especially if you don’t have good indexes in place.

With good indexes in place (like above), both benefit from the FORCESEEK hint as well:

SELECT
    SUM(x.x) AS records
FROM 
(
    SELECT
        COUNT_BIG(*) AS x
    FROM dbo.Comments AS c
    JOIN dbo.Posts AS p WITH(FORCESEEK)
        ON p.OwnerUserId = c.UserId
    
    UNION ALL

    SELECT
        COUNT_BIG(*) AS x
    FROM dbo.Comments AS c
    JOIN dbo.Posts AS p WITH(FORCESEEK)
        ON p.LastEditorUserId = c.UserId
    AND p.OwnerUserId <> c.UserId
) AS x;

SELECT
    SUM(x.x) AS records
FROM 
(
    SELECT
        COUNT_BIG(*) AS x
    FROM dbo.Users AS u
    JOIN dbo.Posts AS p WITH(FORCESEEK)
        ON p.OwnerUserId = u.Id
    
    UNION ALL

    SELECT
        COUNT_BIG(*) AS x
    FROM dbo.Users AS u
    JOIN dbo.Posts AS p WITH(FORCESEEK)
        ON p.LastEditorUserId = u.Id
    AND p.OwnerUserId <> u.Id
) AS x;

Not All ORs


I am usually not a fan of OR predicates in JOINs. There often isn’t great indexing in place to make things fast, or to support the FORCESEEK hints.

If you see these patterns in your query plans, take a closer look. You may have a pretty easy performance win on your hands, either via indexing and hints, or by rewriting to a UNION ALL.

Thanks for reading!

Common Query Plan Patterns For Windowing Functions: Ranges, Rows, and Spills

What Why


Certain windowing functions allow you to specify more precise windows to perform calculations on, and many of them allow you to specify an empty OVER() clause to get a global aggregate for a result set.

The performance difference between RANGE and ROWS is fairly well-documented, and I’m mostly covering it here to show you the difference in execution plans.

There’s also some different behavior around memory usage when Batch Mode shows up, which amused me.

We’re going to be using this index to help our queries move along faster.

CREATE INDEX c
ON dbo.Comments
(
    UserId,
    CreationDate,
    Score
);

RANGEr Zone


If you don’t specify the type of window you want, by default you’ll use RANGE. I’m using it here to be clear about which I’m covering.

WITH Comments AS 
(
    SELECT
        SUM(c.Score) OVER
        (
            PARTITION BY
                c.UserId
            ORDER BY 
                c.CreationDate
            RANGE BETWEEN UNBOUNDED PRECEDING 
                      AND CURRENT ROW
        ) AS n
    FROM dbo.Comments AS c
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n < 0
OPTION(USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));

Even with an index to help us along, this query runs for a very long time. In fact it’s been running for the entire time I’ve been writing this particular blog post.

The longer it runs, the more space-filler I have to type, and the more post quality suffers. If this post is terrible, you can blame the above query. Or SQL Server.

Whatever.

Three minutes later:

cpu hound

In my experience, this is the query plan pattern that shows up when you use the RANGE/default specification.

emergency

The first Segment is for the Partition By elements, and the second Segment is for both Partition By and Order By.

Depending on your ideas about our time on Earth, and the afterlife, this query could be considered a poor use of finite time.

ROW Boat


Switching over to the ROWS specification, performance is much different, and we get a different execution plan.

WITH Comments AS 
(
    SELECT
        SUM(c.Score) OVER
        (
            PARTITION BY
                c.UserId
            ORDER BY 
                c.CreationDate
            ROWS BETWEEN UNBOUNDED PRECEDING 
                     AND CURRENT ROW
        ) AS n
    FROM dbo.Comments AS c
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n < 0
OPTION(USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));

I only have to wait about 10 seconds for this one, with some efficiency gained by the query going parallel. Even if I force the query to run at DOP 1, it runs in about 25 seconds. Is 15 seconds of duration worth the 7 additional threads? I don’t know.

Again, it depends largely on your belief system. You may find this to be a good way to kill time until you end up somewhere better. You might not.

wisdumb

The relevant difference in the query plan here (again, based on my observations over the years) is that rather than two consecutive Segment operators, we have Segment > Sequence Project > Segment > Window Spool.

In this case, the Segment operators only list UserId in the “Group By” portion of the query plan. The Sequence Project uses ROW_NUMBER to define the frame.

gifts!

The reason for the performance difference is that the RANGE/default specification uses an on-disk worktable for the Window Spool, and the ROWS specification uses one in-memory. If you were to compare memory grant details for the two query plans, you’d see that the RANGE/default query has 0 for all memory grant categories, and the ROWS query asks for a ~6GB memory grant.
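If you’d rather catch those numbers live than dig through plan XML, a quick sketch against sys.dm_exec_query_memory_grants from another session gets you most of the way there:

SELECT
    deqmg.session_id,
    requested_gb = deqmg.requested_memory_kb / 1024. / 1024.,
    granted_gb = deqmg.granted_memory_kb / 1024. / 1024.,
    used_gb = deqmg.used_memory_kb / 1024. / 1024.
FROM sys.dm_exec_query_memory_grants AS deqmg
WHERE deqmg.session_id <> @@SPID;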

There is a limit here though. If you go beyond 10,000 rows, an on-disk table will be used. This query will run for just about a minute:

WITH Comments AS 
(
    SELECT
        SUM(c.Score) OVER
        (
            PARTITION BY
                c.UserId
            ORDER BY 
                c.CreationDate
            ROWS BETWEEN 9999 PRECEDING 
                     AND CURRENT ROW
        ) AS n
    FROM dbo.Comments AS c
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n < 0
OPTION(USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));

Unfortunately, details of the Window Spool don’t indicate that any I/O was done. This seems a little bit weird, since Many to Many Merge Joins will show I/O details for the worktable.

Anyway. What about that Batch Mode?

Batcher Up


In Batch Mode, the window specification doesn’t tend to make a lot of difference when it comes to performance.

bangers

Both of these queries ask for 1MB of memory, but take ~3 seconds to scan the table single-threaded.

Again, your trade-off here will be deciding whether or not it’s worth scanning the table faster with parallel threads but incurring additional memory grant with the Sort.

Even though you’re Sorting Sorted data.

Go figure.

OVER It


If you use an empty OVER() clause, you can’t specify rows or ranges; you get the default/RANGE specification. That typically means those queries will be much slower in Row Mode than in Batch Mode.

WITH Comments AS 
(
    SELECT
        COUNT_BIG(*) OVER () AS n
    FROM dbo.Comments AS c
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n < 0
OPTION(USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'));

Also, the query plans are atrocious looking.

take your time

Batch Mode does solve a lot of the issues with them, but at the cost of additional memory usage, and the potential for spilling.

scampi

Batch Mode Memory Grant Feedback will fix the spill, and the query will run about 2-3 seconds faster.

Thanks, probably.

Some Grants Are Bigger Than Others


The non-spilling memory grant for the COUNT query is about 800MB.

one right here

The non-spilling grant for the SUM query is about 1100MB:

friendly

Apparently the reason for the additional memory is that, behind the scenes, SUM also does a COUNT, and returns NULL if the count is zero.

SELECT
    SUM(x.x) AS the_sum,
    COUNT_BIG(x.x) AS the_x_count,
    COUNT_BIG(*) AS the_full_count
FROM 
(
    SELECT CONVERT(int, NULL) AS x
) AS x;

Thanks for reading!

Common Query Plan Patterns For Windowing Functions: Column Selection Matters

Not A Doctor


All of our previous queries looked about like this:

WITH Comments AS 
(
    SELECT
        ROW_NUMBER() OVER
        (
            PARTITION BY
                c.UserId
            ORDER BY 
                c.CreationDate
        ) AS n
    FROM dbo.Comments AS c 
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n = 0;

The only columns that we were really selecting from the Comments table were UserId and CreationDate, which are an integer and a datetime.

Those are relatively easy columns to deal with, both from the perspective of reading and sorting.

In order to show you how column selection can muck things up, we need to create a more appropriate columnstore index, add columns to the select list, and use a where clause to restrict the number of rows we’re sorting. Otherwise, we’ll get a 16GB memory grant for every query.
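The exact index isn’t the point here, so treat this one as a sketch; any columnstore index that covers the columns in the queries below will do.

CREATE NONCLUSTERED COLUMNSTORE INDEX ncc
ON dbo.Comments
(
    Id,
    CreationDate,
    PostId,
    Score,
    Text,
    UserId
);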

Starting Point


Selecting no additional columns:

WITH Comments AS 
(
    SELECT
        ROW_NUMBER() OVER
        (
            PARTITION BY
                c.UserId
            ORDER BY 
                c.CreationDate
        ) AS n
    FROM dbo.Comments AS c
    WHERE c.CreationDate >= '20131201'
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n = 0;

On a second run of the query, after a memory grant feedback correction, we end up with a plan with these details:

burgers

It takes us 3 milliseconds to scan the columnstore index, and we get a 24MB memory grant. This is good. I like this.

Darn Strings


Our second query looks like this. We’re selecting all the columns from the Comments table.

WITH Comments AS 
(
    SELECT
        c.Id, 
        c.CreationDate, 
        c.PostId, 
        c.Score, 
        c.Text, 
        c.UserId,
        ROW_NUMBER() OVER
        (
            PARTITION BY
                c.UserId
            ORDER BY 
                c.CreationDate
        ) AS n
    FROM dbo.Comments AS c
    WHERE c.CreationDate >= '20131201'
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n = 0;

A first run of the query, before memory grant feedback fixes things for us, asks for a 16GB memory grant. Without this mechanism in place, we’ll keep asking for the same unproductive grant. If you don’t have batch mode and enterprise edition, this is the scenario you’ll face over and over again.

When memory grant correction kicks in, we end up with a 456MB memory grant.

Quite an adjustment, eh?

kick me

We also end up taking 125ms to scan the table with parallel threads, up from 3 milliseconds with a single thread. Of course, the issue here is mostly the Text column.

Strings were a mistake.

No Strings Attached


If we select all the columns other than the string, we’ll end up with a very similar set of metrics as the first plan.

mexico

If we want to maintain those metrics, but still show the Text column, we’ll need to do something like this:

WITH Comments AS 
(
    SELECT
        c.Id, 
        c.CreationDate, 
        c.PostId, 
        c.Score,  
        c.UserId,
        ROW_NUMBER() OVER
        (
            PARTITION BY
                c.UserId
            ORDER BY 
                c.CreationDate
        ) AS n
    FROM dbo.Comments AS c
    WHERE c.CreationDate >= '20131201'
)
SELECT 
    c.*,
    c2.Text
FROM Comments AS c
JOIN dbo.Comments AS c2
    ON c.Id = c2.Id
WHERE c.n = 0;
big hands

Using a self-join to get the initial set of columns we care about, then getting the Text column at the end, means we avoid some of the knuckle-headedness of strings in databases.

Deep Drink


This pattern applies to more than just windowing functions, but it’s a performance issue I have to tune pretty often for people using paging queries.

In tomorrow’s post, we’ll look at another rather unfortunate thing that I see people messing up with windowing functions, and how you can spot it looking at query plans.

Thanks for reading!