How SQL Server Can Handle Complex Queries (Sometimes)

Optimizer Optimizes


Sometimes, the optimizer can take a query with a complex where clause, and turn it into two queries.

This only happens up to a certain point in complexity, and only if you have really specific indexes to allow these kinds of plan choices.

Here’s a haphazard query:

SELECT COUNT(*) AS records
FROM dbo.Posts AS p
WHERE ( 
           p.PostTypeId = 1
       AND p.AcceptedAnswerId <> 0
       AND p.CommentCount > 5
       AND p.CommunityOwnedDate IS NULL
       AND p.FavoriteCount > 0
      )
OR   (
          p.PostTypeId = 2
      AND p.CommentCount > 1
      AND p.LastEditDate IS NULL
      AND p.Score > 5
      AND p.ParentId = 0
     )
AND (p.ClosedDate IS NULL);

There’s a [bunch of predicates], an OR, then a [bunch of predicates]. Since there’s some shared spaced, we can create an okay general index.

It’s pretty wide, and it may not be the kind of index I’d normally create, unless I really had to.

CREATE INDEX ix_whatever 
    ON dbo.Posts (PostTypeId, CommentCount, ParentId)
         INCLUDE(AcceptedAnswerId, FavoriteCount, LastEditDate, Score, ClosedDate, CommunityOwnedDate);

It covers every column we’re using. It’s a lot. But I had to do it to show you this.

Computer Love

The optimizer took each separate group of predicates, and turned it into a separate index access, with a union operator.

It’s like if you wrote two count queries, and then counted the results of both.

But With A Twist


Let’s tweak the where clause a little bit.

SELECT COUNT(*) AS records
FROM dbo.Posts AS p
WHERE ( 
           p.PostTypeId = 1
       AND p.AcceptedAnswerId <> 0
       AND p.CommentCount > 5
       OR p.CommunityOwnedDate IS NULL --This is an OR now
       AND p.FavoriteCount > 0
      )
OR   (
          p.PostTypeId = 2
      AND p.CommentCount > 1
      AND p.LastEditDate IS NULL
      OR p.Score > 5 -- This is an OR now
      AND p.ParentId = 0
     )
AND (p.ClosedDate IS NULL)
Wham!

We don’t get the two seeks anymore. We get one big scan.

Is One Better?


The two seek plan has this profile:

Table 'Posts'. Scan count 10, logical reads 30678
Table 'Worktable'. Scan count 0, logical reads 0
Table 'Workfile'. Scan count 0, logical reads 0

 SQL Server Execution Times:
   CPU time = 439 ms,  elapsed time = 108 ms.

Here’s the scan plan profile:

Table 'Posts'. Scan count 5, logical reads 127472

 SQL Server Execution Times:
   CPU time = 4624 ms,  elapsed time = 1617 ms.

In this case, the index union optimization works in our favor.

We can push the optimizer towards a plan like that by breaking up complicated where clauses.

SELECT COUNT(*)
FROM (
SELECT 1 AS x
FROM dbo.Posts AS p
WHERE ( 
           p.PostTypeId = 1
       AND p.AcceptedAnswerId <> 0
       AND p.CommentCount > 5
       AND p.CommunityOwnedDate IS NULL
       AND p.FavoriteCount > 0
      )  

UNION ALL

SELECT 1 AS x
FROM dbo.Posts AS p   
WHERE (
          p.PostTypeId = 2
      AND p.CommentCount > 1
      AND p.LastEditDate IS NULL
      AND p.Score > 5
      AND p.ParentId = 0
     )
AND (p.ClosedDate IS NULL)
) AS x

Et voila!

Chicken Leg

Which has this profile:

Table 'Posts'. Scan count 2, logical reads 30001

 SQL Server Execution Times:
   CPU time = 329 ms,  elapsed time = 329 ms.

Beat My Guest


The optimizer is full of all sorts of cool tricks.

The better your indexes are, and the more clearly you write your queries, the more of those tricks you might see it start using

Thanks for reading!



One thought on “How SQL Server Can Handle Complex Queries (Sometimes)

Leave a Reply

Your email address will not be published. Required fields are marked *