How Bad Cardinality Estimates Lead To Bad Query Plan Choices: Does Batch Mode Help?

Notes and Notability


In the last post, I showed you a query where a bad plan was chosen because of bad guesses.

In this post, I’m going to see if Batch Mode will help anything.

To do that, we’re going to use an empty temp table with a clustered columnstore index on it.

    CREATE TABLE #t (id INT NOT NULL, 
	                 INDEX c CLUSTERED COLUMNSTORE);

First Up, No Hints


    SELECT   p.*
    FROM     dbo.Posts AS p
    JOIN     dbo.Votes AS v
        ON p.Id = v.PostId
	LEFT JOIN #t ON 1 = 0
    WHERE    p.PostTypeId = 2
    AND      p.CreationDate >= '20131225'
    ORDER BY p.Id;
SQL Server Query Plan
Deeply Unsatisfying

Though this plan is ~6 seconds faster than the Merge Join equivalent in the last post, that’s not the kind of improvement I’m shooting for.

Remember than the Hash Join plan in Row Mode ran in 2.6 seconds.

The only operator to run in Batch Mode here is the Sort. To be fair, it’s really the only one eligible with the trick we used.

Forcing this plan to run in parallel, we go back to a 27 second runtime with no operators in Batch Mode.

Next Up, Hash Hint


Disappointingly, this gets worse. The Row Mode only plan was 2.6 seconds, and this is 6.8 seconds.

SQL Server Query Plan
Hush now

The answer to why the Batch Mode plan is 3x slower lies in our Row Mode plan. Let’s look at them head to head.

SQL Server Query Plan
Pick me!

See that Repartition Streams operator? It literally saves the entire query.

The Batch Mode plan doesn’t get one. Because of that, Bad Things Happen™

Skew Job


Both plans start out with an unfortunate scan of the Posts table.

SQL Server Query Plan
Sucktown

But in the Row Mode plan, Repartition Streams does exactly what it sounds like it does, and balances things out. The number of rows on each thread is even because of this. Crazy to think about, but threads dividing work up evenly is, like, a good thing.

In the Batch Mode plan, that… doesn’t happen.

SQL Server Query Plan
Apology dog

 

We have the dreaded serial parallel query. Aside from the Scan of the Votes table, only one thread is ever active across the rest of the query.

This isn’t a limitation of Batch Mode generally, though I suspect it has something to do with why we don’t see Repartition Streams show up.

One limitation of Batch Mode is with Sorts — they are single threaded — a point this particular demo obfuscates, unless they’re a child operator of a Window Aggregate.

Well, Darn


I was expecting some fireworks there.

Maybe 2019 will help?

In the next post, we’ll look at that.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.



One thought on “How Bad Cardinality Estimates Lead To Bad Query Plan Choices: Does Batch Mode Help?

Comments are closed.