Why Index Tuning Is An Iterative Process

Here And There


For many people, index tuning means occasionally adding an index when there’s a report about a slow query. Those indexes might come from a query plan, or from the missing index DMVs, where SQL Server stores every complaint the optimizer files when it thinks an index might make a query better.

Sure, there are some people who think index tuning means rebuilding indexes or running DTA and checking all the boxes, but I ban those IP addresses.

Of course, there’s a whole lot more to index tuning. Adding indexes is fine to a point, but you really should spring clean those suckers once in a while, too.

Look for overlapping indexes, unused indexes, and check for any Heaps that may have snuck in there. sp_BlitzIndex is a pretty cool tool for that.

But even for adding indexes, sometimes it takes more than one pass, especially if you’re taking advice from query plans and DMVs.

How The What


Let’s say you’re looking at a server for the first time, or you’re not quite comfortable with designing your own indexes. No judgment, there.

You see a query plan for some piece of code that’s running slowly, and it has a missing index request.

Sugar Sugar Sugar

There’s only one missing index request — there’s not a bunch of hidden ones like in some plans — and it looks moderately helpful so you decide to try it.

Treefiddy

The thing is that as far as “stuff I want to go faster” in the plan, the clustered index scan on Posts is about 3x faster than the clustered index scan on Comments.

Deal with it

And the index that’s being asked for is only going to help us find PostTypeId = 1. It’s not going to help with the rest of or join or filtering very much.

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Posts] ([PostTypeId])
INCLUDE ([CreationDate],[OwnerUserId])
Una Posta

We still need to filter on CreationDate, and join on OwnerUserId later. Sometimes this index will be “good enough” and other times it “won’t”.

If PostTypeId were really selective, or if this query were searching for a particularly selective PostTypeId, then it’d probably be okay-ish.

But we’re not, so we may settle on this index instead.

CREATE NONCLUSTERED INDEX p
ON [dbo].[Posts] ([PostTypeId], [CreationDate], [OwnerUserId]);

With that in place, we only get marginal improvement in the timing of the plan. It’s about 1.5 seconds faster.

Probably not what we’d wanna report to end users.

Hella

But we have new green text! This time it’s for the Comments table, which is where our pain point lies time-wise.

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Comments] ([PostId],[CreationDate])

We add that, and reduce our query runtime to less than half of what it was originally.

Babewood

Is 2.6 seconds good? Or great? All depends on expectations.

Could we keep going and experimenting? Sure.

It all depends what we’re allowed to change, what our skill level is (mine is quite low, ho ho ho), and what our priorities are (these are also quite low).

This Is Just One Query


And since we had the luxury of having it in front of us, running it, adding an index, running it again to test the index, etc., we were able to spot the second index request that ended up helping even more than the first one.

If you don’t have that luxury, or if you just poke around the missing index DMVs every 3-6 months, you could miss stuff like this. Sure, that first request would be there, and it might look tempting enough for you to add, but the second one wouldn’t appear until after that. That’s the one that really helped.

Whenever you’re tuning indexes, or releasing code that’s going to use existing data in new ways, you’d be doing yourself a big favor to check in on this stuff at least weekly.

You might be an index tuning wiz and not need to — if you are, I’d be amazed if you made it this far into my blog post, though — or you may catch “obvious” new indexes during development.

But I’m going to tell you something about end users: they’re devious, mischievous, and they’re out to make you look bad.

As soon as they start using those new features of yours, they’re going to abuse them. They’re going to do all sorts of horrible things that you never would have dreamed of. And I’ll bet some different indexes would help you keep your good name.

Or at least your job.

Thanks for reading!

As a postscript to this: I don’t want you to think that missing index requests are the end-all be-all of indexing wisdom. There are lots of limitations, and suggested column order isn’t perfect. But if you’re just getting started, they’re a great way to start to understand indexing, and see the problems they do and don’t solve. And look, the only way to make them better would be to spend longer during compilation thinking about things. That’s not how the optimizer should be spending its time. We’re lucky to get these for free, and you should view them as a learning tool.

Your Resolution This Year: Get A Monitoring Tool That Helps

Yes We Have No Bananas


I speak with a lot of people who have SQL Server performance problems. Sometimes they’re constant, sometimes they’re obvious, and sometimes they’re intermittent.

Constant and obvious problems are easy to observe, though not necessarily easy to resolve without experience. Still, they’re a bit less frustrating, because at least you’ve got some idea what’s going on. It’s those intermittent problems that can be a real pain.

SQL Server is very catch-me-if-you-can when it comes to most issues. I’ve talked about this before I’m sure, but if you’re not hitting F5 at in the right place at the right time, you could totally miss what happened when you had a problem, and very there’s no time sliced data collection (except in query store) that can help with any reliability.

Not wait stats, not perfmon, not index usage. The plan cache can be okay, but is often unreliable, and has many blind spots.

Whatcha Got?


The people I talk to generally fall into two groups:

  • We don’t have a monitoring tool at all
  • We have one, but we can’t figure anything out with it

If you don’t have one, Sentry One is a good place to start. It’s not perfect, but I tend to have good luck with it.

If you do have one, make the most of it. Have a sales tech give you some training. That’s part of what you pay for. Take it a step further: tell them about the kind of problems you have, and ask them to show you how to find them using their monitoring tool. If they can’t do that, it might be time to shop around for one that can.

Old Block


This isn’t just for you, either. This request is sort of selfish.

Let’s say you call someone like me in to look at your server. If you’ve got intermittent issues, it’s a whole lot easier to show me what was happening in the monitoring tool the last time you had an issue than to try to explain to me what happened. It’s a lot less annoying than hoping it happens again while we’re looking at the server together, or trying to recreate it.

Thanks for reading!

Quickly Detecting CPU and Memory Pressure On A SQL Server

Options Ahoy


There are lots of great options out there for figuring out what’s happening on a server. If you’re the kind of person still running sp_who, or sp_who2, or sp_who8045, it’s time to stop.

You’ve got much better options, like sp_WhoIsActive, or the First Responder Kit. They’re both great, but sometimes I needed this very specific information very quickly without all the bells and whistles.

I’ve had to do some pretty weird troubleshooting in my life, where a server was totally jammed up, and these help me figure out what’s going on.

Sir-Procs-A-Lot


I call it the pressure detector. It’s four simple DMV queries that will give you different levels of detail about memory and CPU usage currently taking place on a server. Like a lot of these scripts, when a server is Baltimoring, you’re better off running them using the DAC. Otherwise, you’re kinda stuck in the same place all your other queries are.

So what does this help you with? Situations where you’re running out of worker threads, or when you’re running out of memory.

When you hit those, you’ll get smacked with a couple nasty waits:

  • THREADPOOL: Running out of worker threads
  • RESOURCE_SEMAPHORE: Running out of query memory

The thing is, sometimes it’s hard to see what’s happening, what’s causing problems, and what’s on the receiving end.

Check out this video to walk you through the proc results.

Video Killed


Thanks for reading!

Why Don’t I Have Any Missing Index Requests?

This was originally posted by me as an answer here. I’m re-posting it locally for posterity.

There are many reasons why you may not have missing index requests!


We’ll look at a few of the reasons in more detail, and also talk about some of the general limitations of the feature.

General Limitations


First, from: Limitations of the Missing Indexes Feature:

  • It does not specify an order for columns to be used in an index.

As noted in this Q&A: How does SQL Server determine key column order in missing index requests?, the order of columns in the index definition is dictated by Equality vs Inequality predicate, and then column ordinal position in the table.

There are no guesses at selectivity, and there may be a better order available. It’s your job to figure that out.

Special Indexes

Missing index requests also don’t cover ‘special’ indexes, like:

  • Clustered
  • Filtered
  • Partitioned
  • Compressed
  • XML-ed
  • Spatial-ed
  • Columnstore-d
  • Indexed View-ed

What columns are considered?


Missing Index key columns are generated from columns used to filter results, like those in:

  • JOINs
  • WHERE clause

Missing Index Included columns are generated from columns required by the query, like those in:

  • SELECT
  • GROUP BY
  • ORDER BY

Even though quite often, columns you’re ordering by or grouping by can be beneficial as key columns. This goes back to one of the Limitations:

  • It is not intended to fine tune an indexing configuration.

For example, this query will not register a missing index request, even though adding an index on LastAccessDate would prevent the need to Sort (and spill to disk).

SELECT TOP (1000) u.DisplayName FROM dbo.Users AS u ORDER BY u.LastAccessDate DESC;

 

NUTS

Nor does this grouping query on Location.

SELECT TOP (20000) u.Location FROM dbo.Users AS u GROUP BY u.Location

 

NUTS

That doesn’t sound very helpful!


Well, yeah, but it’s better than nothing. Think of missing index requests like a crying baby. You know there’s a problem, but it’s up to you as an adult to figure out what that problem is.

You still haven’t told me why I don’t have them, though…


Relax, bucko. We’re getting there.

Trace Flags


If you enable TF 2330, missing index requests won’t be logged. To find out if you have this enabled, run this:

DBCC TRACESTATUS;

Index Rebuilds


Rebuilding indexes will clear missing index requests. So before you go Hi-Ho-Silver-Away rebuilding every index the second an iota of fragmentation sneaks in, think about the information you’re clearing out every time you do that.

You may also want to think about Why Defragmenting Your Indexes Isn’t Helping, anyway. Unless you’re using Columnstore.

Adding, Removing, or Disabling Indexes


Adding, removing, or disabling an index will clear all of the missing index requests for that table. If you’re working through several index changes on the same table, make sure you script them all out before making any.

Trivial Plans


If a plan is simple enough, and the index access choice is obvious enough, and the cost is low enough, you’ll get a trivial plan.

This effectively means there were no cost based decisions for the optimizer to make.

Via Paul White:

The details of which types of query can benefit from Trivial Plan change frequently, but things like joins, subqueries, and inequality predicates generally prevent this optimization.

When a plan is trivial, additional optimization phases are not explored, and missing indexes are not requested.

See the difference between these queries and their plans:

SELECT * FROM dbo.Users AS u WHERE u.Reputation = 2; 

SELECT * FROM dbo.Users AS u WHERE u.Reputation = 2 AND 1 = (SELECT 1);

 

NUTS

The first plan is trivial, and no request is shown. There may be cases where bugs prevent missing indexes from appearing in query plans; they are usually more reliably logged in the missing index DMVs, though.

SARGability


Predicates where the optimizer wouldn’t be able to use an index efficiently even with an index may prevent them from being logged.

Things that are generally not SARGable are:

  • Columns wrapped in functions
  • Column + SomeValue = SomePredicate
  • Column + AnotherColumn = SomePredicate
  • Column = @Variable OR @Variable IS NULL

Examples:


SELECT * FROM dbo.Users AS u WHERE ISNULL(u.Age, 1000) > 1000; 

SELECT * FROM dbo.Users AS u WHERE DATEDIFF(DAY, u.CreationDate, u.LastAccessDate) > 5000;

SELECT * FROM dbo.Users AS u WHERE u.UpVotes + u.DownVotes > 10000000; 

DECLARE @ThisWillHappenWithStoredProcedureParametersToo NVARCHAR(40) = N'Eggs McLaren';
SELECT * 
FROM dbo.Users AS u 
WHERE u.DisplayName LIKE @ThisWillHappenWithStoredProcedureParametersToo OR @ThisWillHappenWithStoredProcedureParametersToo IS NULL;

 

None of these queries will register missing index requests. For more information on these, check out the following links:

You Already Have An Okay Index


Take this index:

CREATE INDEX ix_whatever ON dbo.Posts(CreationDate, Score) INCLUDE(OwnerUserId);

It looks okay for this query:

SELECT p.OwnerUserId, p.Score 
FROM dbo.Posts AS p 
WHERE p.CreationDate >= '20070101' 
AND p.CreationDate < '20181231' 
AND p.Score >= 25000 
AND 1 = (SELECT 1) 
ORDER BY p.Score DESC;

 

The plan is a simple Seek…

NUTS

But because the leading key column is for the less-selective predicate, we end up doing more work than we should:

Table ‘Posts’. Scan count 13, logical reads 136890

If we change the index key column order, we do a lot less work:

CREATE INDEX ix_whatever ON dbo.Posts(Score, CreationDate) INCLUDE(OwnerUserId);

NUTS

And significantly fewer reads:

Table ‘Posts’. Scan count 1, logical reads 5

SQL Server Is Creating Indexes For you


In certain cases, SQL Server will choose to create an index on the fly via an index spool. When an index spool is present, a missing index request won’t be. Surely adding the index yourself could be a good idea, but don’t count on SQL Server helping you figure that out.

NUTS

Thanks for reading!

SQL Saturday Washington, DC: Two Weeks To Go!

HELLO GUAM!


If you’re planning on attending SQL Saturday Washington, DC, why not play hooky from work and spend the day with me learning about all the atrocities SQL Server is capable of?

I’m going to be delivering my Total Server Tuning material, which has been a hit at a whole bunch of events this past year.

Attendees has a choice to either follow along with me on their laptops, or just watch in horror as familiar events unfold before their very eyes.

If you want to follow along, grab a copy of the StackOverflow2013 database. It’s about a 10GB download, which turns into a ~60GB database.

Fair warning: if you’re gonna follow along, you’re gonna have a tough time on skimpy laptop hardware. My personal laptop is 64GB of RAM and some pretty fast cores. At least they were until Intel started patching things. Most demos are on SQL Server 2017, but I’m going to be showing you stuff from SQL Server 2019 as well.

For a limited time, use the coupon code “votesql” for $50 off.

SQL Saturday Washington, DC: Three Weeks To Go!

HELLO GUAM!


If you’re planning on attending SQL Saturday Washington, DC, why not play hooky from work and spend the day with me learning about all the atrocities SQL Server is capable of?

I’m going to be delivering my Total Server Tuning material, which has been a hit at a whole bunch of events this past year.

In case you’ve never seen it, here’s the trailer I recorded for SQLBits earlier this year for it:

See you there!

For a limited time, use the coupon code “votesql” for $50 off.

Announcing My Precon For SQL Saturday Washington, DC

HELLO GUAM!


If you’re planning on attending SQL Saturday Washington, DC, why not play hooky from work and spend the day with me learning about all the atrocities SQL Server is capable of?

I’m going to be delivering my Total Server Tuning material, which has been a hit at a whole bunch of events this past year. It’s an eye-opening full day of training where you’ll find out all my favorite ways that things can go wrong with SQL Server hardware, queries, and indexes.

And of course, how you can outsmart SQL Server.

Which is pretty hard.

Like, doctors work on it and stuff.

See you there!

For a limited time, use the coupon code “votesql” for $50 off.

SQL Saturday Portland: One Week To Go!

Training Daze


Right now, we’re only one week out from SQL Saturday Portland.

I’ll be delivering my Total Server Tuning session, where you’ll learn all sorts of horrible things about SQL Server.

To get ready for the session, you’ll need a local version of SQL Server Developer Edition, 2017 + whatever the latest CU is.

I’m writing this a few weeks out, and who knows what kind of antics Microsoft will get up to.

Will they release, un-release, re-release, re-un-release, and then delete all internet history of another CU?

STAY TUNED!

For Sure


One thing you’ll definitely need is a copy of the StackOverflow2013 database.

Seats are limited, so hurry on up and get yourself one before you get FOMO in your MOJO.

Thanks for reading!