Why You Can’t Always Rely On Missing Index Requests

Greentexting


The problem with relying on any data point is that when it’s not there, it can look like there’s nothing to see.

Missing indexes requests are one of those data points. Even though there are many reasons why they might not be there, sometimes it’s not terribly clear why one might not surface.

That can be annoying if you’re trying to do a general round of tuning on a server, because you can miss some easy opportunities to make improvements.

Here’s an example of a query that, with no indexes in place, probably should generate a missing index request.

SELECT TOP (5000) 
    p.OwnerUserId,
    p.Score,
    ISNULL(p.Tags, N'N/A: Question') AS Tags,
    ISNULL(p.Title, N'N/A: Question') AS Title, 
    p.CreationDate, 
    p.LastActivityDate, 
    p.Body
FROM dbo.Posts AS p
WHERE 1 = 1 
AND p.CreationDate >= '20131230'
AND p.CreationDate < '20140101'
ORDER BY p.Score DESC;

Big Ol’ Blank


Here’s the query plan! It’s like uh. Why wouldn’t you want this to take less than 25 seconds?

clap your hands

The posts table is a little over 17 million rows. The optimizer expects around 20k rows to qualify, but doesn’t think an easier way to find those rows would be helpful.

At least not the way we’ve written the query.

Let’s make a small change

Five and Dime


If we quote out the Body column, which is an NVARCHAR(MAX), we get our green text.

SELECT TOP (5000) 
    p.OwnerUserId,
    p.Score,
    ISNULL(p.Tags, N'N/A: Question') AS Tags,
    ISNULL(p.Title, N'N/A: Question') AS Title, 
    p.CreationDate, 
    p.LastActivityDate--, 
    --p.Body
FROM dbo.Posts AS p
WHERE 1 = 1 
AND p.CreationDate >= '20131230'
AND p.CreationDate < '20140101'
ORDER BY p.Score DESC;
Who’d want that in an index?

Which is interesting, because the optimizer isn’t always that smart. It’s much easier to tempt it into bad ideas with equality predicates.

Good and Hard


Check this out!

SELECT TOP (5000) *
FROM dbo.Posts AS p
WHERE p.ParentId = 184618;


SELECT TOP (5000) *
FROM dbo.Posts AS p
WHERE p.ParentId > 184617
AND   p.ParentId < 184619;
hot cars

The missing index for this is a mistake.

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Posts] ([ParentId])
INCLUDE ([AcceptedAnswerId],[AnswerCount],[Body],[ClosedDate],[CommentCount],[CommunityOwnedDate],[CreationDate],[FavoriteCount],[LastActivityDate],[LastEditDate],[LastEditorDisplayName],[LastEditorUserId],[OwnerUserId],[PostTypeId],[Score],[Tags],[Title],[ViewCount])

What Did We Learn?


How we write queries (and design tables) can change how the optimizer feels about our queries. If you’re the kind of person who relies on missing index requests to fix things, you could be missing pretty big parts of the picture.

Thanks for reading!



One thought on “Why You Can’t Always Rely On Missing Index Requests

Leave a Reply

Your email address will not be published. Required fields are marked *