A Query To Help You Find Plans With Eager Index Spools

Yawn And Yawn


I’ve written a bunch about Eager Index Spools, and how to fix them, but I’ve always sort of left the “finding” part up to you, or pointed curious tuners to tools like sp_BlitzCache.

Recently though, I worked with a client who had Eager Index Spools so frequently that we needed to track them down specifically.

This is the plan cache query that I used to do it — they didn’t have Query Store enabled — and I wanted to share it.

WITH 
    XMLNAMESPACES 
('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS x),
    plans AS 
(
    SELECT TOP (10)
        deqs.query_plan_hash,
        sort = 
            SUM(deqs.total_worker_time / deqs.execution_count)
    FROM sys.dm_exec_cached_plans AS decp
    JOIN sys.dm_exec_query_stats AS deqs
        ON decp.plan_handle = deqs.plan_handle
    CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp
    CROSS APPLY deqp.query_plan.nodes('//x:RelOp') AS r (c)
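    --Only keep plans that contain an Eager Index Spool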
    WHERE  r.c.exist('//x:RelOp[@PhysicalOp="Index Spool" and @LogicalOp="Eager Spool"]') = 1
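    --And only plans that touch a user database (dbid > 4 skips master, tempdb, model, and msdb)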
    AND    EXISTS
           (      
               SELECT 
                   1/0
               FROM sys.dm_exec_plan_attributes(decp.plan_handle) AS pa 
               WHERE pa.attribute = 'dbid'
               AND   pa.value > 4
           )   
    GROUP BY deqs.query_plan_hash
    ORDER BY sort DESC
)
SELECT
    deqp.query_plan,
    dest.text,
    avg_worker_time = 
        (deqs.total_worker_time / deqs.execution_count),
    deqs.total_worker_time,
    deqs.execution_count
FROM sys.dm_exec_cached_plans AS decp
JOIN sys.dm_exec_query_stats AS deqs
    ON decp.plan_handle = deqs.plan_handle
CROSS APPLY sys.dm_exec_query_plan(decp.plan_handle) AS deqp    
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE EXISTS
(
    SELECT
        1/0
    FROM plans AS p
    WHERE p.query_plan_hash = deqs.query_plan_hash
)
ORDER BY avg_worker_time DESC
OPTION(RECOMPILE, MAXDOP 1);

It’s maybe not the prettiest thing in the world, but it got the job done.

Thanks for reading!

Documentation for dm_db_missing_index_group_stats_query

No, It’s New


When I was checking out early builds of SQL Server 2019, I noticed a new DMV called dm_db_missing_index_group_stats_query, which I thought was pretty cool.

It helped you tie missing index requests to the queries that requested them. Previously, that took a whole lot of heroic effort, or luck.

With this new DMV, it’s possible to combine queries that look for missing indexes with queries that look for tuning opportunities in the plan cache or in Query Store.

It ties back to dm_db_missing_index_groups: the index_group_handle column in that DMV joins to the group_handle column in the new one.

If you’re wondering why I’m not giving you any code samples here, it’s because I’m going to get some stuff built into sp_BlitzIndex to take advantage of it, now that it’s documented.

Special thanks to William Assaf for helping to get this done.

Thanks for reading!

Join Me At Data Platform Summit 2020!

The Road From Nowhere


This year, I’m teaching an 8-hour online workshop at Data Platform Summit, and I’d love it if you joined me.

Here’s what I’ll be teaching:

Class Title: The Beginner’s Guide To Advanced Performance Tuning

Abstract: You’re new to SQL Server, and your job more and more is to fix performance problems, but you don’t know where to start.

You’ve been looking at queries, and query plans, and puzzling over indexes for a year or two, but it’s still not making a lot of sense.

Beyond that, you’re not even sure how to measure whether your changes are working, or whether they’re even the right thing to do.

In this full day performance tuning extravaganza, you’ll learn about all the most common anti-patterns in T-SQL querying and indexing, and how to spot them using execution plans. You’ll also leave knowing why they cause the problems that they do, and how you can solve them quickly and painlessly.

If you want to gain the knowledge and confidence to tune queries so they’ll never be slow again, this is the training you need.

Date: Dec 7 & 8.

Time: 12 PM to 4 PM EST

Tickets: Tickets here!

Why Microsoft Should Get Rid Of The Plan Cache And Use Query Store Instead

Used To Love Her


I used to think the plan cache was so cool.

  • You can find queries that aren’t good there
  • Plans are full of details (and XML)
  • Supporting DMVs give you extra insights about resource usage

But most of the time now, I’m totally frustrated with it.

It clears out a lot, plans aren’t there for some queries, and the plans that are there can be very misleading.

Can you really tell someone what their worst performing queries are when everything in there is from the last 15 minutes?

No.

Why I Like Query Store Better


Plans stay there for a more predictable amount of time, even if you stick a recompile hint on them.

Plus, you get a whole lot more information about what went on with all the different query metrics. As far as a source of “what happened and when”, it’s much more reliable. It’s almost like having a monitoring tool in SQL Server.

Do you trust your plan cache to tell you what was slow more than a couple days ago?

One really big bonus is that it’s portable, and it’s per-database. You can back up or clone a database so someone else can easily look at it somewhere else.

On top of that, it’s more usable: there are built-in reports, a graphical interface, settings you can tweak, and plans you can force with a couple clicks.
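
Turning it on and twisting a few of those knobs is a single statement, too. Here’s a minimal sketch, where the database name and the values are just examples, not recommendations:

ALTER DATABASE [StackOverflow2013] SET QUERY_STORE = ON
(
    OPERATION_MODE = READ_WRITE,
    QUERY_CAPTURE_MODE = AUTO,
    MAX_STORAGE_SIZE_MB = 1024,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30)
);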

You could solve a lot of frustrations by moving things here.

Microsoft Has Work To Do, Though


There have been a number of issues and unanswered user requests for Query Store over the years, and not all of the associated DMVs have full parity with their plan cache counterparts.

Querying the data also needs to be made less prone to performance issues. And, living in the future, it’s a much more sensible location for the last known actual plan.

A nice-to-have would be predicate normalization for identical and morally equivalent queries, with totals and averages for resource usage, so you could group things together. Especially for morally equivalent queries, this would allow plan forcing where the queries get different ids.

Some monitoring tools have this in their interface, and the number of times I’ve been able to track down outliers within a group of executions to solve a problem is, well, at least memorable to me.

I wish more people had monitoring tools.

Thanks for reading!

What’s The Point Of Estimated Plans, Anyway?

Further From Truth


Estimated plans can be a bit confounding, huh?

Whether you see one by hitting CTRL+L, looking in the plan cache, or looking in Query Store, the song remains the same.

The query is slow: sometimes? always? usually? rarely?

But where? But why?

Estimated plans can hide a lot of work from you. They’re sort of like government contracts, where the optimizer is the government: it picks the cheapest plan, but then…

(Okay, maybe it picks a plan because a friend or relative or big donor was behind it. Sometimes it seems that way, with the nonsense you end up with.)

Of course once execution is under way, you run into all sorts of unexpected costs: Everything is much more expensive, and takes a whole lot longer than projected.

Glad Ways


From a practical perspective, estimated plans can be useful to get a general idea of the plan shape that the optimizer is coming up with.

Despite all the lying estimated plans can do, you may be able to spot some obvious anti-patterns.
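
If you’d rather not mouse around for it, SET SHOWPLAN_XML hands you the estimated plan without executing anything. A quick sketch (the query is just an example, and the SET has to be the only statement in its batch):

SET SHOWPLAN_XML ON;
GO

SELECT COUNT(*) FROM dbo.Votes WHERE BountyAmount = 500;
GO

SET SHOWPLAN_XML OFF;
GO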

They’re more useful when the query is particularly long running, or when it modifies underlying data.

When I’m tuning modification queries, there’s usually some pain involved when you need to reset things.

For example:

  • Most people don’t want to wait 5-10 minutes just to see if a plan changed after making changes
  • Deleting data that doesn’t exist
  • Updates that don’t change anything
  • Not being able to insert duplicate data a second time

…Are all hindrances to getting things done.

And so you need to restore or roll things back another way (and boy howdy has SQL Server 2019 spoiled me for rollbacks).
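
When a restore is overkill, wrapping the modification in a transaction you immediately roll back is the cheap way out. A sketch, assuming the change is small enough that the rollback itself doesn’t become the new waiting game:

BEGIN TRANSACTION;

    DELETE v
    FROM dbo.Votes AS v
    WHERE v.BountyAmount = 500;

    --Check the actual plan and row counts here

ROLLBACK TRANSACTION;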

Thanks for reading!

Whose Plan Handle Is It Anyway?

End Of The Line


There are a lot of words you can use to describe Extended Events. I’ll skip over my list to keep the blog family friendly.

In this post, we’re going to look at how two events can show the same query having a different plan handle.

I stumbled on this while working on sp_HumanEvents, when one of my precious sessions suddenly STOPPED RETURNING RESULTS.

Very helpful.

Let’s look at why.

A Plan Here, A Plan There


When the optimizer gets its hands on a query, it makes some early choices: it might keep the plan at the trivial optimization level, and it might apply simple parameterization to make the plan more reusable.

Apparently this causes some differences when it comes to hashing plans.

Here’s an example query:

SELECT COUNT(*) FROM dbo.Votes WHERE BountyAmount = 500 AND UserId = 22565;

Looking at the execution plan for it, we know a couple things.


It didn’t stay trivial because it went parallel, and the optimizer thought that simple parameterization was a good idea. Even though we passed in literals, they’re replaced by parameters in the execution plan text.
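
If you’ve never caught it in the act, the statement text attached to the plan ends up looking roughly like this (the exact parameter data types can vary depending on the literals you pass in):

(@1 smallint, @2 int)SELECT COUNT(*) FROM [dbo].[Votes] WHERE [BountyAmount] = @1 AND [UserId] = @2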

What’s the big deal?

Well, nothing, unless you want to correlate information about a single query from multiple Extended Events.

Check It Out


Let’s run sp_HumanEvents and get debug information — because the regular results won’t return anything for this query right now (I have fixed this issue, don’t worry).

EXEC sp_HumanEvents @event_type = N'query', @debug = 1, @seconds_sample = 5;

In another window, we’ll run our query.

SELECT COUNT(*) FROM dbo.Votes WHERE BountyAmount = 500 AND UserId = 22565;

What do we get back? The debug parameter will spit out data in temp tables, and print out dynamic SQL commands for troubleshooting.

Two Events, One Query

Eagle-eyed readers will notice that the plan handles in each event for the same query are different, but query hash and plan hash are identical.

Because of this, I’ve had to change the way I get plan handles, so I only take the one that comes from post execution showplan, because…

There’s Another Way


Which plan_handle ends up in your plan cache? Not the one from sql_statement_completed.

We can see that by running sp_HumanEvents to capture wait stats, which will also try to correlate waits to the queries that caused them.

EXEC sp_HumanEvents @event_type = N'waits', @debug = 1, @seconds_sample = 5;

I’m grabbing the debug here too so we can see what ends up in the temp tables along the way.


We can grab information about which queries caused waits, and the plan handle here is the same one that we saw in the post execution showplan event.

Yay?

In the Actual Results Plus from sp_HumanEvents, we can see the simple parameterized version of the query ended up in the plan cache.
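
If you want to double-check that yourself, a quick trip through the plan cache will show the parameterized text. A sketch, where the LIKE filters are just there to narrow things down:

SELECT
    decp.plan_handle,
    dest.text
FROM sys.dm_exec_cached_plans AS decp
CROSS APPLY sys.dm_exec_sql_text(decp.plan_handle) AS dest
WHERE dest.text LIKE N'%Votes%BountyAmount%'
AND   dest.text NOT LIKE N'%dm_exec_cached_plans%';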


There are two lines here because we had waits on two different wait types. The point, though, is that we can see simple parameterization at work, but not in the events that collected information about executed queries.

Frustration Point


The problem would have been more obvious if the parameterized text showed up in the post execution showplan event, and not the version with literal values. It would have been easier to figure out why the plan handles were different, anyway.

So what’s the solution? Now, sp_HumanEvents will only get the plan handle from the post execution showplan event, in case you want to go and do anything with it later.

Thanks for reading!

Is RECOMPILE Really That Bad?

To Re Or Not To Re


It’s been a while since SQL Server has had a real RECOMPILE problem. And if you put it up against the performance problems that you can hit with parameter sniffing, I’d have a hard time telling someone strapped for time and knowledge that it’s the worst idea for them.

Obviously, you can run into problems if you (“you” includes Entity Framework, AKA the Database Demolisher) author the kind of queries that take a very long time to compile. But as I list them out, I’m kinda shrugging.

Here are some problems you can hit with recompile. Not necessarily caused by recompile, but by not re-using plans.

  • Long compile times: Admittedly pretty rare, and plan guides or forced plans are likely a better option.
  • No plan history in the cache (only the most recent plan): Sucks if you’re looking at the plan cache. Sucks less if you have a monitoring tool or Query Store.
  • CPU spikes for high-frequency execution queries: Maybe time for caching some stuff, or getting away from the kind of code that executes like this (scalar functions, cursors, etc.)

But for everything in the middle: a little RECOMPILE probably won’t hurt that bad.

Thinking of the problems it can solve:

  • Parameter sniffing
  • Parameter embedding (lack of)
  • Local variable estimates
  • Catch-all queries

Those are very real problems that I see on client systems pretty frequently. And yeah, sometimes there’s a good tuning option for these, like changing or adding an index, moving parts of the query around, sticking part of the query in a temp table, etc.

But all that assumes that those options are immediately available. For third party vendors who have somehow developed software that uses SQL Server for decades without running into a single best practice even by accident, it’s often harder to get those changes through.

There’s More Than One Way To Recompile


Sure, you might be able to sneak a recompile hint somewhere in the mix even if it’d make the vendor upset. You can always yoink it out later.

But you have alternatives, too.

  • DBCC FREEPROCCACHE: No, not the whole cache. You can single out troublesome queries to remove specific plans.
  • Plan Guides: An often overlooked detail of plan guides is that you can attach hints to them, including recompile.

Using a plan guide doesn’t interfere with that precious vendor IP that makes SQL Server unresponsive every 15 minutes. Or whatever. I’m not mad.
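
Here’s a rough sketch of both options. The guide name is made up, and for a SQL-type plan guide the statement and parameter list have to match what the application sends exactly:

EXEC sys.sp_create_plan_guide
    @name = N'precious_vendor_recompile',
    @stmt = N'SELECT COUNT(*) FROM dbo.Votes WHERE BountyAmount = @BountyAmount;',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = N'@BountyAmount INT',
    @hints = N'OPTION (RECOMPILE)';

--To evict a single plan instead, track down its handle first:
--DBCC FREEPROCCACHE (0x0600...); --plan_handle from sys.dm_exec_query_stats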

And yeah, there are advances in SQL Server 2017 and 2019 that start to address some issues here, but they’re still imperfect.

I like ’em, but you know. They’re not quite there yet.

Thanks for reading!

Are Long Compile Times Bringing You Down?

Rarities


I ran into a very funny situation while working with a client recently. They were using Entity Framework, and a query with around 10 left joins ended up with a compile time of nearly 70 seconds.


Once the query finished compiling, it ran instantly, with a pretty simple plan made up entirely of nested loops joins.

So what happened?

Tracing Flags


For science, I broke out a Rocket Science Trace Flag™ that shows optimization phases and how long the optimizer spent in each.
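
If you want to play along at home, the usual incantation looks something like this (8675 is undocumented, so keep it on a test box; 3604 routes the output to your Messages tab; the query is just an example):

SELECT COUNT(*)
FROM dbo.Users AS u
JOIN dbo.Votes AS v
    ON v.UserId = u.Id
OPTION (RECOMPILE, QUERYTRACEON 8675, QUERYTRACEON 3604);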

What came back looked like this:

end exploration, tasks: 3098 Cost = 242.433 time: 0 net: 0 total: 0.049 net: 0.047
end exploration, tasks: 3099 Cost = 242.433 time: 0 net: 0 total: 0.049 net: 0.047
end exploration, tasks: 3104 Cost = 242.433 time: 0 net: 0 total: 0.049 net: 0.048
end exploration, tasks: 3331 Cost = 242.433 time: 0.002 net: 0.002 total: 0.052 net: 0.05
end exploration, tasks: 3332 Cost = 242.433 time: 0 net: 0 total: 0.052 net: 0.05
end search(1),  cost: 210.273 tasks: 3332 time: 0 net: 0 total: 0.052 net: 0.05
*** Optimizer time out abort at task 211100 ***
end search(2),  cost: 210.273 tasks: 211100 time: 69.214 net: 69.678 total: 69.267 net: 69.729
*** Optimizer time out abort at task 211100 ***
End of post optimization rewrite, time: 0.001 net: 0.001 total: 69.268 net: 69.73
End of query plan compilation, time: 0.002 net: 0.002 total: 69.271 net: 69.732

The numbers aren’t quite the same, since the plan is from a different run than when I captured the trace flag (8675) output.

But you can see pretty clearly, in Search 2, we hung out for a while trying different rewrites.

What happens during Search 2? The whole enchilada.

In this case? Probably mostly join reordering.

Tracking Lags


If you don’t have Query Store enabled, it’s possible to search the plan cache, or to get a warning from sp_BlitzCache for long compile times.

If you do have Query Store enabled, compile time is logged in a couple places:

SELECT TOP (10)
       qsq.query_id,
       qsq.query_text_id,
       qsq.initial_compile_start_time,
       qsq.last_compile_start_time,
       qsq.last_execution_time,
       qsq.count_compiles,
       qsq.last_compile_duration / 1000000. last_compile_duration,
       qsq.avg_compile_duration / 1000000. avg_compile_duration,
       qsq.avg_bind_duration / 1000000. avg_bind_duration,
       qsq.avg_bind_cpu_time / 1000000. avg_bind_cpu_time,
       qsq.avg_optimize_duration / 1000000. avg_optimize_duration,
       qsq.avg_optimize_cpu_time / 1000000. avg_optimize_cpu_time,
       qsq.avg_compile_memory_kb / 1024. avg_compile_memory_mb,
       qsq.max_compile_memory_kb / 1024. max_compile_memory_mb
--INTO   #query_store_query
FROM sys.query_store_query AS qsq
WHERE qsq.is_internal_query = 0
AND qsq.avg_compile_duration >= 1000000. --This is one second in microseconds
ORDER BY avg_compile_duration DESC

SELECT TOP (10)
       qsp.plan_id,
       qsp.query_id,
       qsp.engine_version,
       qsp.count_compiles,
       qsp.initial_compile_start_time,
       qsp.last_compile_start_time,
       qsp.last_execution_time,
       qsp.avg_compile_duration / 1000000. avg_compile_duration,
       qsp.last_compile_duration / 1000000. last_compile_duration,
       CONVERT(XML, qsp.query_plan) query_plan
--INTO   #query_store_plan
FROM sys.query_store_plan AS qsp
WHERE  qsp.avg_compile_duration >= 1000000. --This is one second in microseconds
ORDER BY  qsp.avg_compile_duration DESC

I’ve seen different numbers show up in these, so I like to look at both. I don’t know why that happens. There’s probably a reasonable explanation.

If you wanted to add in some other metrics, you could do this:

DROP TABLE IF EXISTS #query_store_query;
DROP TABLE IF EXISTS #query_store_plan;

SELECT TOP (10)
       qsq.query_id,
       qsq.query_text_id,
       qsq.initial_compile_start_time,
       qsq.last_compile_start_time,
       qsq.last_execution_time,
       qsq.count_compiles,
       qsq.last_compile_duration / 1000000. last_compile_duration,
       qsq.avg_compile_duration / 1000000. avg_compile_duration,
       qsq.avg_bind_duration / 1000000. avg_bind_duration,
       qsq.avg_bind_cpu_time / 1000000. avg_bind_cpu_time,
       qsq.avg_optimize_duration / 1000000. avg_optimize_duration,
       qsq.avg_optimize_cpu_time / 1000000. avg_optimize_cpu_time,
       qsq.avg_compile_memory_kb / 1024. avg_compile_memory_mb,
       qsq.max_compile_memory_kb / 1024. max_compile_memory_mb
INTO   #query_store_query
FROM sys.query_store_query AS qsq
WHERE qsq.is_internal_query = 0
AND qsq.avg_compile_duration >= 1000000. --This is one second in microseconds
ORDER BY avg_compile_duration DESC;

SELECT TOP (10)
       qsp.plan_id,
       qsp.query_id,
       qsp.engine_version,
       qsp.count_compiles,
       qsp.initial_compile_start_time,
       qsp.last_compile_start_time,
       qsp.last_execution_time,
       qsp.avg_compile_duration / 1000000. avg_compile_duration,
       qsp.last_compile_duration / 1000000. last_compile_duration,
       CONVERT(XML, qsp.query_plan) query_plan
INTO   #query_store_plan
FROM sys.query_store_plan AS qsp
WHERE  qsp.avg_compile_duration >= 1000000. --This is one second in microseconds
ORDER BY  qsp.avg_compile_duration DESC;

SELECT (avg_cpu_time - qsq.avg_compile_duration) AS cpu_time_minus_qsq_compile_time,
       (avg_cpu_time - qsp.avg_compile_duration) AS cpu_time_minus_qsp_compile_time,
       qsrs.avg_cpu_time,
       qsrs.avg_duration,
       qsq.avg_compile_duration,
       qsq.avg_bind_duration,
       qsq.avg_bind_cpu_time,
       qsq.avg_optimize_duration,
       qsq.avg_optimize_cpu_time,
       qsq.avg_compile_memory_mb,
       qsp.avg_compile_duration,
       qsq.count_compiles,
       qsrs.count_executions,
       qsp.engine_version,
       qsp.query_id,
       qsp.plan_id,
       CONVERT(XML, qsp.query_plan) query_plan,
       qsqt.query_sql_text,
       qsrs.first_execution_time,
       qsrs.last_execution_time,
       qsq.initial_compile_start_time,
       qsq.last_compile_start_time,
       qsq.last_execution_time
FROM #query_store_query AS qsq
JOIN #query_store_plan AS qsp
ON qsq.query_id = qsp.query_id
JOIN sys.query_store_query_text AS qsqt
ON qsqt.query_text_id = qsq.query_text_id
JOIN 
(
SELECT qsrs.plan_id,
       qsrs.first_execution_time,
       qsrs.last_execution_time,
       qsrs.count_executions,
       qsrs.avg_duration / 1000000. avg_duration,
       qsrs.avg_cpu_time / 1000000. avg_cpu_time
FROM sys.query_store_runtime_stats AS qsrs
) AS qsrs
ON qsrs.plan_id = qsp.plan_id
ORDER BY qsq.avg_compile_duration DESC;
--ORDER BY qsp.avg_compile_duration DESC;

Fixes?


For EF, the only solution was to use a plan guide with a FORCE ORDER hint supplied. This let us arm wrestle the optimizer into just joining the tables in the order that the joins are written in the query.
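
It’s the same sp_create_plan_guide trick from the recompile post above, just with a different hint. A sketch with made-up names, since the statement and parameter list have to be the exact text EF sends:

EXEC sys.sp_create_plan_guide
    @name = N'ef_left_join_landslide',
    @stmt = N'<exact EF-generated statement text>',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = N'<exact EF-generated parameter list>',
    @hints = N'OPTION (FORCE ORDER)';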

For some reason, forcing the plan with Query Store did not force the plan that forced the order, and I didn’t dig much into why. I do not get along with Query Store most of the time.

If you’re finding this happen with queries you have control over, doing your own rewrites to simplify the query and reduce the number of joins that the optimizer has to consider can help.

Thanks for reading!

When Parameterizing Queries Won’t Help

Multitude


There are many good reasons to parameterize queries.

There are, of course, downsides, too. Parameter sensitivity, AKA parameter sniffing, being the prime one.

But let’s say you consult the internet, or find a consultant on the internet, and they tell you that you ought to parameterize your queries.

It all sounds like a grand idea — you’ll get better plan reuse, and hopefully the plan cache will stop clearing itself out like a drunken ouroboros.

You could even use a setting called forced parameterization, which doesn’t always work.

Apart from the normal rules about when parameterization, forced or not, may not work, there’s another situation that can make things difficult.

Client Per Thing


Let’s assume for a second that you have a client-per-database, or client-per-schema model.

If I execute parameterized code like this:

DECLARE @i INT = 2
DECLARE @sql NVARCHAR(MAX) = N'
SELECT COUNT(*) AS records
FROM dbo.Users AS u
WHERE u.Reputation = @ii
'

EXEC sys.sp_executesql @sql, N'@ii INT', @ii = @i;

But if I run it from different database contexts (I have a few different versions of StackOverflow on my server, but I’m going to show results from 2013 and 2010), we’ll get separate cached plans, despite them having identical:

  • Costs
  • Query Plans
  • SQL Handles
  • Query Hashes
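
You don’t have to take my word for it. Here’s a sketch for checking the cache yourself, where the dbid plan attribute is what tells the copies apart:

SELECT
    dest.text,
    deqs.query_hash,
    deqs.query_plan_hash,
    pa.value AS dbid
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
CROSS APPLY sys.dm_exec_plan_attributes(deqs.plan_handle) AS pa
WHERE pa.attribute = 'dbid'
AND   dest.text LIKE N'%Reputation = @ii%';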

The same thing would happen with any parameterized code executed in a different context — stored procedures, functions… well. You get the idea.

Forced parameterization may help queries within the same context with plan reuse, but there are certain boundaries plan reuse won’t cross.

Don’t get me wrong, here. I’m not complaining. There’s so much that could be different, I wouldn’t want plan reuse across these boundaries. Heck, I may even separate stuff specifically to get different plans. As usual, I don’t want you, dear reader, to be surprised by this behavior.

Thanks for reading!