Why SQL Server’s OPTIMIZE FOR UNKNOWN Hint Hurts Performance

“Best Practice”


It’s somewhat strange to hear people carry on about best practices that are actually worst practices.

One worst practice that has strong staying power is the OPTIMIZE FOR UNKNOWN hint, which we talked about yesterday.

It probably doesn’t help that Microsoft has products (I’m looking at you, Dynamics) which have a setting to add the hint to every query. Shorter: If Microsoft recommends it, it must be good.

Thanks, Microsoft. Dummies.

Using the OPTIMIZE FOR UNKNOWN hint, or declaring local variables inside a code block and using them in a where clause, has the same issue, though: both make SQL Server’s query optimizer fall back on bad guesses, which often lead to bad execution plans.

You can read great detail about that here.

Mistakenly


We’re going to create two indexes on the Posts table:

CREATE INDEX
    p0
ON dbo.Posts
(
    OwnerUserId
)
WITH
(
    SORT_IN_TEMPDB = ON,
    DATA_COMPRESSION = PAGE
);
GO

CREATE INDEX
    p1
ON dbo.Posts
(
    ParentId,
    CreationDate,
    LastActivityDate
)
INCLUDE
(
    PostTypeId
)
WITH
(
    SORT_IN_TEMPDB = ON,
    DATA_COMPRESSION = PAGE
);
GO

The indexes themselves are not as important as how SQL Server goes about choosing them.

Support Wear


This stored procedure is going to call the same query in three different ways:

  • One with the OPTIMIZE FOR UNKNOWN hint that uses parameters
  • One with local variables set to parameter values with no hints
  • One that accepts parameters and uses no hints
CREATE OR ALTER PROCEDURE
    dbo.unknown_soldier
(
    @ParentId int,
    @OwnerUserId int
)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

    SELECT TOP (1)
        p.*
    FROM dbo.Posts AS p
    WHERE p.ParentId = @ParentId
    AND   p.OwnerUserId = @OwnerUserId
    ORDER BY
        p.Score DESC,
        p.Id DESC
    OPTION(OPTIMIZE FOR UNKNOWN);

    DECLARE
        @ParentIdInner int = @ParentId,
        @OwnerUserIdInner int = @OwnerUserId;

    SELECT TOP (1)
        p.*
    FROM dbo.Posts AS p
    WHERE p.ParentId = @ParentIdInner
    AND   p.OwnerUserId = @OwnerUserIdInner
    ORDER BY
        p.Score DESC,
        p.Id DESC;

    SELECT TOP (1)
        p.*
    FROM dbo.Posts AS p
    WHERE p.ParentId = @ParentId
    AND   p.OwnerUserId = @OwnerUserId
    ORDER BY
        p.Score DESC,
        p.Id DESC;

END;
GO

Placebo Effect


If we call the stored procedure with actual execution plans enabled, we get the following plans back.

EXEC dbo.unknown_soldier 
    @OwnerUserId = 22656, 
    @ParentId = 0;
SQL Server Query Plan With Optimize For Unknown Hint
Not a good guess.

The assumed selectivity that the OPTIMIZE FOR UNKNOWN hint produces as a cardinality estimate is way off the rails.

SQL Server thinks three rows are going to come back, but we get 6,050,820 rows back.
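
Where that tiny guess comes from: with the hint in place, the optimizer leans on the density vectors of the statistics behind the two indexes we just created instead of the histograms. You can eyeball those yourself with a quick sketch like this, using the index names from above:

DBCC SHOW_STATISTICS('dbo.Posts', 'p0') WITH DENSITY_VECTOR;
DBCC SHOW_STATISTICS('dbo.Posts', 'p1') WITH DENSITY_VECTOR;

Roughly speaking, multiply those All density values against the table’s row count, combined however your cardinality estimator version combines multiple predicates, and you land in guess-three-rows territory.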

We get identical behavior from the second query that uses variables declared within the stored procedure, and set to the parameter values passed in.

SQL Server Query Plan With Local Variables
release me

Same poor guesses, same index choices, same long running plan.

Parameter Effect


The query that accepts parameters and doesn’t have any hints applied to it fares much better.

SQL Server Query Plan
transporter

In this case, we get an accurate cardinality estimate, and a more suitable index choice.

Note that both queries perform lookups, but this one performs far fewer of them because it uses an index that filters way more rows out prior to doing the lookup.

The optimizer is able to choose the correct index because it’s able to evaluate predicate values against the statistics histograms rather than using the assumed selectivity guess.
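
If you’re curious what the optimizer gets to compare those predicate values against, the histogram is right there to look at. Another quick sketch, using the index on OwnerUserId from earlier:

DBCC SHOW_STATISTICS('dbo.Posts', 'p0') WITH HISTOGRAM;

Those steps are what the sniffed parameter values get lined up against, which is how we end up with a sane estimate.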

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

SQL Server 2022 Parameter Sensitive Plan Optimization: A Missed Opportunity For PSP To Make A Query Go Faster

Farce


The index and proc creation script for this are a bit on the long side, so I’m going to stick them in a GitHub gist to keep the post concise, since we have some other ground to cover here.

I want to lead with the statistics object that gets used for the query, which has a single equality predicate on the parameter value to search the VoteTypeId column.

SQL Server Statistics Histogram

The relevant portion of the query is this:

FROM     dbo.Votes AS v
WHERE    v.VoteTypeId = @VoteTypeId

The histogram matches the row counts from the table precisely. Thanks, full scan!

So, what’s the problem?

Autoproblematic


To my eye, there’s sufficient skewness here to present parameter sensitivity issues.

SQL Server Query Results
skewdriver
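
That screenshot is nothing fancy, just a row count per VoteTypeId. Something along these lines will get you the same picture in your copy of the database:

SELECT
    v.VoteTypeId,
    COUNT_BIG(*) AS records
FROM dbo.Votes AS v
GROUP BY v.VoteTypeId
ORDER BY records DESC;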

But SQL Server disagrees!

SQL Server Extended Event
ouch dude

733 records vs. 37 million records seems appropriately skewed to me, but SQL Server doesn’t think so, and we end up with all of the typical parameter sensitivity symptoms anyway.

Plansplosion


Let’s get ready to rumble, etc.

EXEC dbo.VoteSniffing 
    @VoteTypeId = 4;

EXEC dbo.VoteSniffing 
    @VoteTypeId = 2;

Here are the plans:

SQL Server Query Plan
dirty thirty

Take a moment here to admire the somewhat confusing plan timing that Batch Mode operators present here: the bottom query runs for around 32 seconds.

Up to the Nested Loops join, we use ~17 seconds of wall clock time with our serial execution plan. The Hash Match operator runs for 15 seconds on its own, in Batch Mode. Remember that Batch Mode operators report their own time, not the cumulative time you see on Row Mode operators.

Parameter sensitivity still happens in SQL Server 2022.

Reversi


If we run the procedure in reverse order, using 4 to cache the plan and then 2 to reuse, we get a similar regression:

SQL Server Query Plan
little things

Now, okay, let’s pause for a minute here. I need to backtrack, but I don’t want to spend a ton of blogspace on it. I’m also going to put the information in a very snooty block quote.

If we re-run the procedure from the section up above to search for VoteTypeId 2 a second time, Memory Grant Feedback will fix the spill at the Hash Join, and bring the total execution time down to about 15 seconds.

That is an improvement, but… Look at the plan here. If VoteTypeId 2 uses a plan more suited to the number of rows it has to process, the overall time is around 4 seconds, with no need for a memory grant correction.

For the second query in this execution sequence, searching for VoteTypeId 4, the overall time goes from 0 seconds and 0 milliseconds to 1.4 seconds. The big plan does not make searching for infrequent values faster.

So you see, the big plan isn’t always better.

Missed Connections


Hey, look, this is the first CTP. Maybe stuff like this is still getting ironed out. Maybe this blog post will change the course of history.

I am only moderately influential in the eyes of Microsoft, though, so perhaps not.

Anyway, this seems like a situation with sufficient skew to produce the Dispatcher plan and then additional sub-plans to account for far different row counts present in the VoteTypeId column.
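
If you want to check whether the feature kicked in for a query, Query Store is where the evidence lives. Here’s a sketch, assuming Query Store is enabled and your build exposes the variant catalog view:

SELECT
    qv.query_variant_query_id,
    qv.parent_query_id,
    qv.dispatcher_plan_id
FROM sys.query_store_query_variant AS qv;

If that comes back empty, no Dispatcher plans or variant plans were created.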

If this scenario (and other similar scenarios) is outside the scope of the feature, query tuning folks are still going to have a whole lot of parameter sensitivity issues to deal with.

And should that be the case, I’d like to speak to the manager.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

What’s The Best Statistics Sample Rate For SQL Server?

Big Think


Most people fire off their index and statistics maintenance jobs with the default options and do just fine. Sometimes, though, statistics need a little bit more consideration, and I get a lot of questions about them.

  • Do I need to use a sampling rate other than the default?
  • Do I need to use FULLSCAN?
  • What if auto update stats ruins my good statistics?
  • Should I update statistics asynchronously?
  • Are there easy ways to tell if my statistics are causing problems?

There are a lot of local factors involved in answering these questions, and I can’t cover all of them in this post, but I can outline some general patterns I see when people have these concerns.

Is The Default Sampling Rate Good Enough?


Look — usually it is — I know you hate to hear it. You want to be that special edge case where it’s not. Moreover, it’s really efficient.

The tradeoff you make between higher sampling rates, potentially better statistics, and the time it takes to update stats can be significant, especially on larger tables. And a lot of the time, those 201 histogram steps don’t end up a whole lot more informative than before.

I’m not saying never, but in particular the difference between 75% and 100% scans is kinda 🤷‍♂️

Do I Need A Full Scan?


If you’re at the point where you’re asking this question, you’re in one of two situations:

  • You’re just starting out with stats updates
  • You’re unhappy with the way your stats look now

If you’re just starting out, stop right here and just make sure you’re keeping them updated regularly.

If you’re unhappy with the way they look because of the way updates — manual or automatic — are going now, you might want to think about creating filtered statistics on the parts of the data that aren’t well-reflected by them.

Is Auto Update Stats Bad?


No, not usually. It can be quite helpful. Especially on supported versions of SQL Server, where trace flag 2371 behavior is the default, it doesn’t take as many modifications for updates to kick in on large tables.

This can better help you avoid the ascending key problem that used to really hurt query plans, where off-histogram values got some really bad estimates. That’s somewhat addressed in higher compat levels, but I’ve still seen some wonkiness.

If you don’t want auto update stats to ruin your higher sampling rates, you probably wanna use the PERSIST_SAMPLE_PERCENT option, with your chosen value.
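
That looks something like this. It’s a sketch against the Posts table used elsewhere on this blog, so swap in your own table, stats name, and sample rate:

UPDATE STATISTICS dbo.Posts
WITH SAMPLE 75 PERCENT, PERSIST_SAMPLE_PERCENT = ON;

Any later update that doesn’t explicitly specify a sample rate, automatic ones included, will reuse the persisted percentage.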

But again, you wanna be careful with this on big tables, and this is when you might also wanna use…

Asynchronous Stats Updates


This setting can be really useful for avoiding long running queries while waiting for updated stat histograms to be available. The downside is that you might hit some woof query plans until the background task does its thing and the plans get invalidated.

Make sure you’re solving the right problem by flipping this switch. SQL Server 2019 can help you figure out if this is something you’re hitting regularly, with the WAIT_ON_SYNC_STATISTICS_REFRESH wait type. Keep in mind this only shows up when stats are refreshed. Not when they’re created initially.
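
Checking for it is a quick trip to the wait stats DMV. A minimal sketch:

SELECT
    dows.wait_type,
    dows.waiting_tasks_count,
    dows.wait_time_ms
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_type = N'WAIT_ON_SYNC_STATISTICS_REFRESH';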

You can also see stuff like this crop up in monitoring tools and query store when queries take a while, but they’re doing stuff like SELECT StatMan.

Are Statistics My Problem?


I see a lot of people blaming statistics, when they’re really doing other stuff wrong.

There are many ways to write a query that are equivalent to the Calvin Peeing™ decal, with cardinality estimation being on the receiving end, like eschewing SARGability, or using local variables or table variables.

Another common issue is around cached plans and parameter sniffing. Those inaccurate guesses might have been really accurate for the compiled parameters, but not for the runtime parameters.

There are a lot of query hints that can help you figure out if statistics are the problem, or if there’s something in the cardinality estimation process that’s yucking things up.

  • ASSUME_JOIN_PREDICATE_DEPENDS_ON_FILTERS
  • ASSUME_MIN_SELECTIVITY_FOR_FILTER_ESTIMATES
  • ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS
  • FORCE_LEGACY_CARDINALITY_ESTIMATION
  • FORCE_DEFAULT_CARDINALITY_ESTIMATION

I use these pretty regularly to see if they end up giving me a better plan. They don’t always pan out, but it’s worth trying when what you get ain’t what you want.
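
If you want to try one, they bolt onto the end of a query like so. The query itself is made up; the hint is the point:

SELECT
    COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
OPTION(USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));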

Just remember that things like query complexity will have a lot to do with how accurate the estimates you get are. Chaining together a bunch of CTEs, derived tables, views, or 20 joins usually isn’t gonna end up well for you, or your query plan.

There are many times when I see people complaining that statistics are bad or out of date, and basic stuff like proper indexes aren’t there.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Trace Flag 3608 Disables Automatic Statistics Creation And Ghost Record Cleanup In SQL Server

Short Bad


According to the docs:

Prevents SQL Server from automatically starting and recovering any database except the master database. If activities that require TempDB are initiated, then model is recovered and TempDB is created. Other databases will be started and recovered when accessed. Some features, such as snapshot isolation and read committed snapshot, might not work. Use for Move System Databases and Move User Databases.

Note: Do not use during normal operation.

Scope: global only

But it turns out it can do quite a bit of harm. Many things stop working when it’s in use, including automatic statistics creation.

Here’s a repro script:

DROP TABLE IF EXISTS
    dbo.stats_test,
    dbo.stats_test_3608;


CREATE TABLE dbo.stats_test
(
    id int NOT NULL
);
GO 

INSERT
    dbo.stats_test
(
    id
)
VALUES
(
    0
);
GO

SELECT 
    st.* 
FROM dbo.stats_test AS st 
WHERE st.id = 1;
GO 

DBCC TRACEON(3608, -1);
GO 

CREATE TABLE dbo.stats_test_3608
(
    id int NOT NULL
);
GO 

INSERT
    dbo.stats_test_3608
(
    id
)
VALUES
(
    0
);
GO

SELECT 
    st.* 
FROM dbo.stats_test AS st 
WHERE st.id = 1;
GO 

DBCC TRACEOFF(3608, -1);
GO

And here are the results:

end times

End Times


This post comes from finding the trace flag enabled at a client site, and nearly every query plan having warnings about columns with missing statistics.

Sometimes this happens when statistics exist for a single column, but multi-column statistics don’t.

In this case, disabling the trace flag fixed things.
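
If you’d rather verify it than squint at plan warnings, a quick look at sys.stats for the two test tables from the repro shows the difference. A sketch:

SELECT
    OBJECT_NAME(s.object_id) AS table_name,
    s.name AS stats_name,
    s.auto_created
FROM sys.stats AS s
WHERE s.object_id IN
      (
          OBJECT_ID('dbo.stats_test'),
          OBJECT_ID('dbo.stats_test_3608')
      );

The table that got queried with the trace flag on is the one missing its auto-created column statistic.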

Ghost Busters


Even forcing ghost record cleanup won’t work with this trace flag on.

DBCC TRACEOFF(3608, -1);
GO 

INSERT 
    dbo.stats_test WITH(TABLOCK)
(
    id
)
SELECT
    m.message_id
FROM sys.messages AS m;
GO 

DELETE dbo.stats_test;
GO 

DBCC FORCEGHOSTCLEANUP;
GO 

SELECT
    record_count,
    ghost_record_count,
    version_ghost_record_count
FROM sys.dm_db_index_physical_stats
(
    DB_ID(), 
    OBJECT_ID('stats_test'), 
    0, 
    0, 
    'DETAILED'
);
GO 

DBCC TRACEON(3608, -1);
GO 

INSERT 
    dbo.stats_test_3608 WITH(TABLOCK)
(
    id
)
SELECT
    m.message_id
FROM sys.messages AS m;
GO 

DELETE dbo.stats_test_3608;
GO 

DBCC FORCEGHOSTCLEANUP;
GO 

SELECT
    record_count,
    ghost_record_count,
    version_ghost_record_count
FROM sys.dm_db_index_physical_stats
(
    DB_ID(), 
    OBJECT_ID('stats_test_3608'), 
    0, 
    0, 
    'DETAILED'
);
GO 

DBCC TRACEOFF(3608, -1);
GO
better off without you

Turn Off


So, uh, if you see this enabled anywhere, you should disable it.
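
A quick way to check for it globally, and to turn it off if you find it:

DBCC TRACESTATUS(-1);
GO

DBCC TRACEOFF(3608, -1);
GO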

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Creating An Agent Job To Update Statistics Using Ola Hallengren’s Scripts

Hubba Hubba


Over in my GitHub repo, I’ve added a file that will create an Agent Job to update statistics using Ola Hallengren’s IndexOptimize script.

It’s something I hand out enough that I figured people might be interested in it. Currently, it’s not a default offering from Ola, it’s uh… custom code.

There are lots of people who should be using this, too.

  • Everyone

Because index maintenance scripts don’t measure a generally useful KPI, and one of the main benefits of index rebuilds is the statistics update.

Mindful


Some things to keep in mind here:

  • You need to be using a relatively new version of Ola’s scripts
  • This script utilizes the @StatisticsModificationLevel parameter, added 2018-06-16
  • That parameter is currently set to 5, and you may need to change that depending on your environment (there’s a sketch of the call just below)
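
For reference, this is roughly the shape of the IndexOptimize call the job step runs. It’s a sketch, not the exact code in the gist, so grab the real thing from the repo:

EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = NULL,
    @FragmentationHigh = NULL,
    @UpdateStatistics = 'ALL',
    @StatisticsModificationLevel = 5;

With the fragmentation parameters set to NULL, no index maintenance happens; it’s statistics updates only.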

There are some things you’ll need to change in the script, if you’re doing anything really custom:

  • It targets the master database
  • It’s owned by the sa account
  • It’s set to run at midnight
  • It has no failure emails or alerts set up

This is a very vanilla starting place. It’s up to you to make it yours.

To report any issues with Ola’s scripts, head over to this GitHub repo.

To get the Agent Job script, head over to my GitHub repo.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

One Thing The “New” Cardinality Estimator Does Better In SQL Server

Or “Default”, If That’s Your Kink


Look, I’m not saying there’s only one thing that the “Default” cardinality estimator does better than the “Legacy” cardinality estimator. All I’m saying is that this is one thing that I think it does better.

What’s that one thing? Ascending keys. In particular, when queries search for values that haven’t quite made it to the histogram yet because a stats update hasn’t occurred since they landed in the mix.

I know what you’re thinking, too! On older versions of SQL Server, I’ve got trace flag 2371, and on 2016+ that became the default behavior.

Sure it did — only if you’re using compat level 130 or better — which a lot of people aren’t because of all the other strings attached.

And that’s before you go and get 2389 and 2390 involved, too. Unless you’re on compatibility level 120 or higher! Then you need 4139.

Arduous


Anyway, look, it’s all documented.

2371 Changes the fixed update statistics threshold to a linear update statistics threshold. For more information, see this AUTO_UPDATE_STATISTICS Option.

Note: Starting with SQL Server 2016 (13.x) and under the database compatibility level 130 or above, this behavior is controlled by the engine and trace flag 2371 has no effect.

Scope: global only

2389 Enable automatically generated quick statistics for ascending keys (histogram amendment). If trace flag 2389 is set, and a leading statistics column is marked as ascending, then the histogram used to estimate cardinality will be adjusted at query compile time.

Note: Please ensure that you thoroughly test this option, before rolling it into a production environment.

Note: This trace flag does not apply to CE version 120 or above. Use trace flag 4139 instead.

Scope: global or session or query (QUERYTRACEON)

2390 Enable automatically generated quick statistics for ascending or unknown keys (histogram amendment). If trace flag 2390 is set, and a leading statistics column is marked as ascending or unknown, then the histogram used to estimate cardinality will be adjusted at query compile time. For more information, see this Microsoft Support article.

Note: Please ensure that you thoroughly test this option, before rolling it into a production environment.

Note: This trace flag does not apply to CE version 120 or above. Use trace flag 4139 instead.

Scope: global or session or query (QUERYTRACEON)

4139 Enable automatically generated quick statistics (histogram amendment) regardless of key column status. If trace flag 4139 is set, regardless of the leading statistics column status (ascending, descending, or stationary), the histogram used to estimate cardinality will be adjusted at query compile time. For more information, see this Microsoft Support article.

Starting with SQL Server 2016 (13.x) SP1, to accomplish this at the query level, add the USE HINT ‘ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS’ query hint instead of using this trace flag.

Note: Please ensure that you thoroughly test this option, before rolling it into a production environment.

Note: This trace flag does not apply to CE version 70. Use trace flags 2389 and 2390 instead.

Scope: global or session or query (QUERYTRACEON)

I uh. I guess.

Why Not Just Get Cardinality Estimation Right The First Time?


Great question! Hopefully someone knows the answer. In the meantime, let’s look at what I think this new-fangled cardinality estimator does better.

The first thing we need is an index with literally any sort of statistics.

CREATE INDEX v ON dbo.Votes_Beater(PostId);

Next is a query to help us figure out how many rows we can modify before an auto stats update will kick in, specifically for this index. Which threshold you’ve actually got in effect is left as an exercise for the reader.

There are a lot of places that can change it: trace flags, database settings, query hints, and more.

SELECT TOP (1)
    OBJECT_NAME(s.object_id) AS table_name,
    s.name AS stats_name,
    p.modification_counter,
    p.rows,
    CONVERT(bigint, SQRT(1000 * p.rows)) AS [new_auto_stats_threshold],
    ((p.rows * 20) / 100) + CASE WHEN p.rows > 499 THEN 500 ELSE 0 END AS [old_auto_stats_threshold]
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS p
WHERE s.name = 'v'
ORDER BY p.modification_counter DESC;

Edge cases aside, those calculations should get you Mostly Accurate™ numbers.

We’re going to need those for what we do next.

Mods Mods Mods


This script will allow us to delete and re-insert a bunch of rows back into a table, without messing up identity values.

--Create a temp table to hold rows we're deleting
DROP TABLE IF EXISTS #Votes;
CREATE TABLE #Votes (Id int, PostId int, UserId int, BountyAmount int, VoteTypeId int, CreationDate datetime);

--Get the current high PostId, for sanity checking
SELECT MAX(vb.PostId) AS BeforeDeleteTopPostId FROM dbo.Votes_Beater AS vb;

--Delete only as many rows as we can to not trigger auto-stats
WITH v AS 
(
    SELECT TOP (229562 - 1) vb.*
    FROM dbo.Votes_Beater AS vb
    ORDER BY vb.PostId DESC
)
DELETE v
--Output deleted rows into a temp table
OUTPUT Deleted.Id, Deleted.PostId, Deleted.UserId, 
       Deleted.BountyAmount, Deleted.VoteTypeId, Deleted.CreationDate
INTO #Votes;

--Get the current max PostId, for safe keeping
SELECT MAX(vb.PostId) AS AfterDeleteTopPostId FROM dbo.Votes_Beater AS vb;

--Update stats here, so we don't trigger auto stats when we re-insert
UPDATE STATISTICS dbo.Votes_Beater;

--Put all the deleted rows back into the table
SET IDENTITY_INSERT dbo.Votes_Beater ON;

INSERT dbo.Votes_Beater WITH(TABLOCK)
        (Id, PostId, UserId, BountyAmount, VoteTypeId, CreationDate)
SELECT v.Id, v.PostId, v.UserId, v.BountyAmount, v.VoteTypeId, v.CreationDate
FROM #Votes AS v;

SET IDENTITY_INSERT dbo.Votes_Beater OFF;

--Make sure this matches with the one before the delete
SELECT MAX(vb.PostId) AS AfterInsertTopPostId FROM dbo.Votes_Beater AS vb;

What we’re left with is a statistics object that’ll be just shy of auto-updating:

WE DID IT

Query Time


Let’s look at how the optimizer treats queries that touch values! That’ll be fun, eh?

--Inequality, default CE
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Votes_Beater AS vb
WHERE vb.PostId > 20671101
OPTION(RECOMPILE);

--Inequality, legacy CE
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Votes_Beater AS vb
WHERE vb.PostId > 20671101
OPTION(RECOMPILE, USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

--Equality, default CE
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Votes_Beater AS vb
WHERE vb.PostId = 20671101
OPTION(RECOMPILE);

--Equality, legacy CE
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Votes_Beater AS vb
WHERE vb.PostId = 20671101
OPTION(RECOMPILE, USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

For the record, > and >= produced the same guesses. Less than wouldn’t make sense here, since it’d hit almost all of the values currently in the histogram.

SQL Server Query Plan
hoodsy

Inside Intel


For the legacy CE, there’s not much of an estimate. You get a stock guess of 1 row, no matter what.

For the default CE, there’s a little more to it.

inequality
SELECT (0.00130115 * 5.29287e+07) AS inequality_computation;

 

equality
SELECT (1.06162e-06 * 5.29287e+07) AS equality_computation;

And of course, the CARD for both is the number of rows in the table:

SELECT CONVERT(bigint, 5.29287e+07) AS table_rows;

I’m not sure why scientific notation is preferred here.

A Little Strange


Adding in the USE HINT mentioned earlier in the post (USE HINT ('ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS')) only seems to help with estimation for the inequality predicate. The guess for the equality predicate remains the same.
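
For reference, this is the shape of the query with the hint bolted on, reusing the inequality example from above:

--Inequality, default CE, histogram amendment hint
SELECT
    COUNT_BIG(*) AS records
FROM dbo.Votes_Beater AS vb
WHERE vb.PostId > 20671101
OPTION(RECOMPILE, USE HINT('ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS'));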

SQL Server Query Plan
well okay

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Query Tuning SQL Server 2019 Part 5: I’m Not Going Back

Butt Out Bag


There was one thing that I didn’t talk about earlier in the week.

You see, there’s a mystery plan.

It only shows up once in a while, like Planet X. And when it does, we get bombarded by asteroids.

Just like when Planet X shows up.

I wouldn’t call it a good all-around plan, but it does something that we would want to happen when we run this proc for VoteTypeId 5.

Let’s go look!

The Optimizer Discovers Aggregates, Sort Of


This isn’t a good “general” plan. In fact, for any of the previously fast values, it sucks.

It sucks because just like the “optimize for unknown” plan, it has a bunch of startup costs, does a lot of scanning, and is generally a bad choice for VoteTypeIds that produce a small number of values.

SQL Server Query Plan
Ghost Town

Johnny Four


If you look carefully, you can see what the problem is.

For VoteTypeIds that filter out a lot of rows (which is most of them), that predicate doesn’t get applied until after Posts and Badges have been joined.

In other words, you fully join those tables, and then the result of that join is joined to the predicate-filtered result of Votes.

For this execution, the plan was compiled initially for VoteTypeId 2. It has 130 million entries in Votes. It’s the only VoteTypeId that produces this plan naturally.

The plan you’re looking at above was re-executed with VoteTypeId 4, which has… 8,190 rows in Votes.

I can’t stress enough how difficult it would be to figure out why this is bad just looking at estimated plans.

Though one clue would be the clustered index scan + predicate, if we knew that we had a suitable index.

SQL Server Query Plan Tool Tip
2legit

This kind of detail with row discrepancies only surfaces with actual plans.

But there is one thing here that wasn’t showing up in other plans, when we wanted it to: The optimizer decides to aggregate OwnerUserId coming from the Posts table prior to joining to Votes.

Johnny Five


If you recall the previously used plan, one complaint was that the result of joining Posts and Badges then joined to Votes had to probe 932 million rows.

You can sort of see that here, where the Adaptive Join prior to the highlighted Hash Match Aggregate produces >100 million rows. It’s more here because we don’t have Bitmaps against both Posts and Badges, but… We’re going off track a bit with that.

That could have been avoided if the optimizer had decided to aggregate OwnerUserId, like it does in this plan.

To compare:

SQL Server Query Plan
gag order

The top plan has a handy green square to show you a helpful pre-join aggregation.

The bottom plan has no handy green squares because there is no helpful pre-join aggregation.

The product of the aggregation is 3.2 million rows, which is exactly what we got as a distinct count when we began experimenting with temp tables:

SELECT COUNT_BIG(DISTINCT p.OwnerUserId) AS records --3,236,013 
FROM dbo.Posts AS p 
JOIN dbo.Badges AS b 
    ON b.UserId = p.OwnerUserId 
WHERE p.PostTypeId = 1;

Outhouse


If the optimizer had chosen to aggregate OwnerUserId prior to the join to Votes, we all could have gone home early on Friday and enjoyed the weekend.

Funny, that.

Speaking of which, it’s Friday. Go enjoy the weekend.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Query Tuning SQL Server 2019 Part 4: Long Live The Query Tuner

Rumors Of My Demise


Let’s talk about some common hints that people use to fix parameter sniffing:

  • RECOMPILE: Won’t work here to get us a better plan for VoteTypeId 5, because it sucks when the optimizer knows what’s coming
  • OPTIMIZE FOR UNKNOWN: Works like once every 5 years, but people still bring it up, and really sucks here (picture below)
  • OPTIMIZE FOR (VALUE): Plan sharing doesn’t work great generally — if we were gonna do this, it’d have to be dynamic

This is what happens when we optimize for unknown. The density vector guess is 13,049,400.

SQL Server Query Plan
Stop it with this.
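
If you want to see where that number comes from, the density vector is easy enough to pull up. A sketch, assuming the statistics from the index created on Votes (VoteTypeId, UserId, PostId) for this series:

DBCC SHOW_STATISTICS('dbo.Votes', 'rant') WITH DENSITY_VECTOR;

Multiply the All density value for VoteTypeId by the table’s row count and you land right on that guess.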

That guess for Vote Types with very few rows ends up with a plan that has very high startup costs.

This version of the query will run for 13-17 seconds for any given parameter. That sucks in zero gravity.

Pictured above is the plan for VoteTypeId 4, which previously finished sub-second using Plan 1 and Plan 2.

With those out of the way, how can we fix this thing?

The Mint


In some circumstances, a #temp table would help if we pre-staged rows from Votes.

The problem is that for many calls, we’d be putting between 7 and 130 MILLION rows into a temp table.

Not my idea of a good time.

SQL Server Query Results
RAMDISKS NINETY NINE CENTS

But what about the other part of the query?

If we count up distinct OwnerUserIds, we get about 3.2 million.

Better, we can reduce the rows further in the procedure with an EXISTS to Votes (I’ll show you that in a minute).

SELECT COUNT_BIG(DISTINCT p.OwnerUserId) AS records --3,236,013
FROM dbo.Posts AS p
JOIN dbo.Badges AS b 
    ON b.UserId = p.OwnerUserId 
WHERE  p.PostTypeId = 1 

That’s not too bad, depending on:

  • How frequently it runs
  • How concurrently it runs
  • How overwhelmed tempdb is
  • Your Mom

The Product


That gives us:

CREATE OR ALTER PROCEDURE dbo.VoteSniffing ( @VoteTypeId INT )
AS
BEGIN
SET XACT_ABORT, NOCOUNT ON;

SELECT DISTINCT p.OwnerUserId
INTO #p
FROM dbo.Posts AS p
JOIN dbo.Badges AS b
    ON b.UserId = p.OwnerUserId
WHERE p.PostTypeId = 1
AND EXISTS
(
    SELECT 1/0
    FROM dbo.Votes AS v
    WHERE v.UserId = p.OwnerUserId
    AND   v.VoteTypeId = @VoteTypeId
);

SELECT   ISNULL(v.UserId, 0) AS UserId,
         SUM(CASE WHEN v.CreationDate >= '20190101'
                  AND  v.CreationDate < '20200101'
                  THEN 1
                  ELSE 0
             END) AS Votes2019,
         SUM(CASE WHEN v.BountyAmount IS NULL
                  THEN 0
                  ELSE 1
             END) AS TotalBounty,
         COUNT(DISTINCT v.PostId) AS PostCount,
         @VoteTypeId AS VoteTypeId
FROM     dbo.Votes AS v WITH(FORCESEEK)
WHERE    v.VoteTypeId = @VoteTypeId
AND      NOT EXISTS
        (   
            SELECT 1/0
            FROM #p AS p
            WHERE  p.OwnerUserId = v.UserId
        )
GROUP BY v.UserId;

END;
GO

Which works pretty well across all calls, and avoids the primary issue with VoteTypeId 5.

SQL Server Query Execution Times
Navy Blue

I’m generally happy with this, with the slight exception of VoteTypeId 8. Yeah, it beats the pants off of when we sniff Plan 2, but it’s about 7 seconds slower than when we get Plan 1.

I pulled the 17 minute execution from this graph for Plan 2/VoteTypeId 5, too, because it’s so distracting. Not having to worry about that thing is a trade off I’m willing to make for Plan 3 being about a second slower than Plan 1.

Not bad for a lazy Sunday afternoon of blogging, though.

Save One For Friday


Query tuning in SQL Server 2019 isn’t always a whole lot different from performance tuning other versions of SQL Server.

You have some more help from optimizer features (especially if you’re on Enterprise Edition), but they don’t solve every problem, and you can run into some very common problems that you’re already used to solving.

You may even be able to use some very familiar techniques to fix things.

In tomorrow’s post, I want to look at a quirk that would have thrown us way off course to explore on our way here.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Query Tuning SQL Server 2019 Part 3: Who Died And Made You The Optimizer?

Be Yourself


We’ve got a problem, Sam Houston. We’ve got a problem with a query that has some strange issues.

It’s not parameter sniffing, but it sure could feel like it.

  • When the procedure compiles and runs with VoteTypeId 5, it runs for 12 minutes
  • Other VoteTypeIds run well with the same plan that VoteTypeId 5 gets
  • When VoteTypeId 5 runs with a “small” plan, it does okay at 10 seconds

Allow me to ruin a graph to illustrate. The Y axis is still seconds, but… it goes up a little higher now.

SQL Server Query Execution Times
weigh-in

The Frustration (A Minor Digression)


Here’s where life can be tough when it comes to troubleshooting actual parameter sniffing.

If you’re relying solely on the plan cache, you’re screwed. You’ll see the plan, and the compile-time value, but you won’t find the runtime values that “caused” the problem anywhere. In other words, the set of parameters that were adversely affected by a query plan that didn’t fit them.
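
For what it’s worth, the compile-time values are in the cached plan XML if you go digging. A sketch for pulling the plan for the procedure in this series, where ParameterCompiledValue is the attribute to look for:

SELECT
    deqp.query_plan
FROM sys.dm_exec_procedure_stats AS deps
CROSS APPLY sys.dm_exec_query_plan(deps.plan_handle) AS deqp
WHERE deps.object_id = OBJECT_ID('dbo.VoteSniffing')
AND   deps.database_id = DB_ID();

There’s no matching ParameterRuntimeValue in there, because the cache doesn’t store one. That’s the whole problem.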

There are some things that can help, like if you’re watching it happen live, or if you have a monitoring tool that might capture runtime parameters.

OR IF YOU USE SP UNDERSCORE HUMANEVENTS.

Like I said, this isn’t parameter sniffing, but it feels like it.

It could extra-feel like it because you might see a misbehaving query, and a compile-time parameter that runs quickly on its own when you test it, e.g. VoteTypeId 6.

It would be really hard to tell that even if a plan were to compile specifically for a different parameter, it would still run for 12 minutes.

Heck, that’d even catch me off-guard.

But that’s what we have here: VoteTypeId 5 gets a bad plan special for VoteTypeId 5.

Examiner


Let’s dig in on what’s happening to cause us such remarkable grief. There has to be a reason.

I don’t need more grief without reason; I’ve already got a public school education.

SQL Server Query Plan
I WANT TO KNOW

If we were to summarize the problem here: that Hash Match Left Anti Semi Join runs for 12 minutes on its own.

No other operator, or group of operators, is responsible for a significant amount of time comparatively.

Magnifier


Some things to note:

  • The bad estimates aren’t from predicates, they’re from Batch Mode Bitmaps
  • Those bad estimates end up producing a much larger number of rows from the Adaptive Join
  • The Hash Match ends up needing to probe 932 million rows

 

SQL Server Query Plan
el disastero

Taking 12 minutes to probe 932 million rows is probably to be expected, now that I think about it.

If the optimizer had a good estimate from the Bitmaps here, it may have done the opposite of what a certain Pacific Island Dwelling Bird said:

Getting every nuance of this sort of relational transformation correct can be tricky. It is very handy that the optimizer team put the effort in so we do not have to explore these tricky rewrites manually (e.g. by changing the query text). If nothing else, it would be extremely tedious to write all the different query forms out by hand just to see which one performed better in practice. Never mind choosing a different version depending on current statistics and the number of changes to the table.

In this case, the Aggregate happens after the join. If the estimate were correct, or even in the right spacetime dimension, this would be fine.

We can gauge the general efficiency of it by looking at when this plan is used for other parameters that produce numbers of rows that are closer to this estimate.

SQL Server Query Plan
huey

If the optimizer had made a good guess for this parameter, it may have changed the plan to put an aggregate before the join.

Unfortunately we have very little control over estimates for Bitmaps, and the guesses for Batch Mode Bitmaps are a Shrug of Atlassian proportions.

Finisher


We’ve learned some things:

  1. This isn’t parameter sniffing
  2. Batch Mode Bitmaps wear pants on their head
  3. Batch Mode Bitmaps set their head-pants on fire
  4. Most of the time Batch Mode performance covers this up
  5. The plan for VoteTypeId 5 is not a good plan for VoteTypeId 5
  6. The plan for VoteTypeId 5 is great for a lot of other VoteTypeIds

In tomorrow’s post, we’ll look at how we can fix the problem.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Query Tuning SQL Server 2019 Part 2: Big Databases, Big Ideas

Are We Still Friends?


When I first wrote this demo, I called it dbo.ParameterSniffingMonstrosity.

Because, you know, it’s really terrible.

CREATE OR ALTER PROCEDURE dbo.VoteSniffing( @VoteTypeId INT )
AS
SET XACT_ABORT, NOCOUNT ON;
    BEGIN
        SELECT   ISNULL(v.UserId, 0) AS UserId,
                 SUM(CASE WHEN v.CreationDate >= '20190101'
                          AND  v.CreationDate < '20200101'
                          THEN 1
                          ELSE 0
                     END) AS Votes2019,
                 SUM(CASE WHEN v.BountyAmount IS NULL
                          THEN 0
                          ELSE 1
                     END) AS TotalBounty,
                 COUNT(DISTINCT v.PostId) AS PostCount,
                 @VoteTypeId AS VoteTypeId
        FROM     dbo.Votes AS v
        WHERE    v.VoteTypeId = @VoteTypeId
        AND      NOT EXISTS
                (   
                    SELECT 1/0
                    FROM dbo.Posts AS p
                    JOIN dbo.Badges AS b 
                        ON b.UserId = p.OwnerUserId 
                    WHERE  p.OwnerUserId = v.UserId
                    AND    p.PostTypeId = 1 
                )
        GROUP BY v.UserId;
    END;
GO

The only parameter is for VoteTypeId, which has some pretty significant skew towards some types, especially in the full size Stack Overflow database.

SQL Server Query Results
Ask me about my commas

It’s like, when people tell you to index the most selective column first, well.

  • Sometimes it’s pretty selective.
  • Sometimes it’s not very selective.

But this is exactly the type of data that causes parameter sniffing issues.

With almost any data set like this, you can draw a line or three, and values within each block can share a common plan pretty safely.

But crossing those lines, you run into issues where either little plans do far too much looping and seeking and sorting for “big” values, and big plans do far too much hashing and scanning and aggregating for “little” values.

This isn’t always the exact case, but generally speaking you’ll observe something along these lines.

It’s definitely not the case for what we’re going to be looking at this week.

This week is far more interesting.

That’s why it’s a monstrosity.

Fertilizer


The indexes that I create to support this procedure look like so. I’ve started using compression, since at this point in time 2016 SP1 is commonplace enough that even people on Standard Edition can use it. They work quite well for the majority of values and query plans.

CREATE INDEX igno
ON dbo.Posts 
    (OwnerUserId, PostTypeId)
    WHERE PostTypeId = 1 
WITH(MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = ROW);
GO

CREATE INDEX rant
ON dbo.Votes 
    (VoteTypeId, UserId, PostId)
INCLUDE 
    (BountyAmount, CreationDate) 
WITH(MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = ROW);
GO 

CREATE INDEX clown ON dbo.Badges( UserId ) 
WITH(MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = ROW);
GO

If there are other indexes you’d like to test, you can do that locally.

What I want to point out is that for many values of VoteTypeId, the optimizer comes up with very good, very fast plans.

Good job, optimizer.

In fact, for any of these runs, you’ll get a good enough plan for any of the other values. They share well.

EXEC dbo.VoteSniffing @VoteTypeId = 4;
EXEC dbo.VoteSniffing @VoteTypeId = 6;
EXEC dbo.VoteSniffing @VoteTypeId = 7;
EXEC dbo.VoteSniffing @VoteTypeId = 9;
EXEC dbo.VoteSniffing @VoteTypeId = 11;
EXEC dbo.VoteSniffing @VoteTypeId = 12;
EXEC dbo.VoteSniffing @VoteTypeId = 13;
EXEC dbo.VoteSniffing @VoteTypeId = 14;
EXEC dbo.VoteSniffing @VoteTypeId = 15;
EXEC dbo.VoteSniffing @VoteTypeId = 16;

VoteTypeIds 1, 2, 3, 5, 8, and 10 have some quirks, but even they mostly do okay using one of these plans.

There are two plans you may see occur for these.

Plan 1

SQL Server Query Plan
teeny tiny

Plan 2

SQL Server Query Plan
it has adapted

Particulars & Peculiars


Plan 1 is first generated when the proc is compiled with VoteTypeId 4, and Plan 2 is first generated when the proc is compiled with VoteTypeId 6.

There’s a third plan that only gets generated when VoteTypeId 2 is compiled first, but we’ll have to save that for another post, because it’s totally different.

Here’s how each of those plans works across other possible parameters.

SQL Server Query Execution Times
this is my first graph

Plan 1 is grey, Plan 2 is blue. It’s pretty easy to see where each one is successful, and then not so much. Anything < 100ms got a 0.

The Y axis is runtime in seconds. A couple are quite bad. Most are decent to okay.

Plans for Type 2 & 8 obviously stick out, but for different plans.

This is one of those things I need to warn people about when they get wrapped up in:

  • Forcing a plan (e.g. via Query Store or a plan guide)
  • Optimizing for unknown
  • Optimizing for a specific value
  • Recompiling every time (that backfires in a couple cases here that I’m not covering right now)

One thing I need to point out is that Plan 2 doesn’t have an entry here for VoteTypeId 5. Why?

Because when it inherits the plan for VoteTypeId 6, it runs for 17 minutes.

SQL Server Query Plan
singalong

This is probably where you’re wondering “okay, so what plan does 5 get when it runs on its own? Is this the mysterious Plan 4 From Outer Space?”

Unfortunately, the plan that gets generated for VoteTypeId 5 is… the same one that gets generated for VoteTypeId 6, but 6 has a much smaller memory grant.

If you’re not used to reading operator times in execution plans, check out my video here.

Since this plan is all Batch Mode operators, each operator will track its time individually.

The Non-Switch


VoteTypeId 5 runtime, VoteTypeId 6 compile time

If I were to put a 17 minute runtime in the graph (>1000 seconds), it would defeat the purpose of graphing things.

Note the Hash Match has, by itself, 16 minutes and 44 seconds of runtime.

SQL Server Query Plan
pyramids

VoteTypeId 5 runtime, and compile time

This isn’t awesome, either.

The Hash Join, without spilling, has 12 minutes and 16 seconds of runtime.

SQL Server Query Plan
lost

Big Differentsiz


You have the same plan shape and operators. Even the Adaptive Join follows the same path to hash instead of loop.

Sure, the spills account for ~4 minutes of extra time. They are fairly big spills.

But the plan for VoteTypeId 5, even when compiled specifically for VoteTypeId 5… sucks, and sucks royally.

There are some dismally bad estimates, but where do they come from?

We just created these indexes, and data isn’t magically changing on my laptop.

TUNE IN TOMORROW!

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.