If You Remember Part 1
We looked at a couple examples of when SQL Server might need to filter out rows later in the plan than we’d like, and why that can cause performance issues.
Now it’s time to look at a few more examples, because a lot of people find them surprising.
As much as I love surprising people, sometimes I’d much rather… not have to explain this stuff later.
Since all my showering and errands are out of the way, we should be able to get through this list uninterrupted.
Unless I get thirsty.
If we need to search a column that has a MAX datatype, or if we define a parameter as being a MAX datatype and search a more sanely typed column with it, both will result in a later filter operation than we may care for.
SELECT COUNT_BIG(*) AS records FROM dbo.Users AS u WHERE u.AboutMe = N'Hi'; DECLARE @Location nvarchar(MAX) = N'here'; SELECT COUNT_BIG(*) AS records FROM dbo.Users AS u WHERE u.Location = @Location OPTION(RECOMPILE);
Even with a recompile hint!
Here we can see the value of properly defining string widths! If we don’t, we may end up reading entire indexes, and doing the work to weed out rows later.
Probably something that should be avoided.
There are some built-in functions, like DATALENGTH, which can’t be pushed when used in a where clause.
Of course, if you’re going to do this regularly, you should be using a computed column to get around the issue, but whatever!
SELECT COUNT_BIG(*) AS records FROM dbo.Users AS u WHERE DATALENGTH(u.Location) > 0;
And of course, everyone’s favorite love-to-hate, the scalar UDF.
Funny thing about these, is that sometimes tiny bumps in the number of rows you’re after can make for big jumps in time.
SELECT TOP (165) u.Id, u.DisplayName FROM dbo.Users AS u WHERE dbo.ScalarFunction(u.Id) > 475 ORDER BY u.Id; SELECT TOP (175) u.Id, u.DisplayName FROM dbo.Users AS u WHERE dbo.ScalarFunction(u.Id) > 475 ORDER BY u.Id;
Sometimes people (and ORMs) will build up long parameter lists, and use them to build up a long list IN clause list, and even sometimes a long OR clause list.
To replicate that behavior, I’m using code I’m keeping on GitHub in order to keep this blog post a little shorter.
To illustrate where things can get weird, aside from the Filter, I’m going to run this with a few different numbers of parameters.
EXEC dbo.Longingly @loops = 15; EXEC dbo.Longingly @loops = 18; EXEC dbo.Longingly @loops = 19;
This will generate queries with different length IN clauses:
Which will result in slightly different query plans:
We can see some tipping points here.
- At 15 parameters, we get a scan with a stream aggregate
- At 18 parameters, we get a scan with a filter
- At 19 parameters, we get a parallel scan with a filter
Parallelism to the rescue, again, I suppose.
Thanks for reading!
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.