Bad Guesses and Bad Choices: Does Tricky Batch Mode Help?

Notes and Notability


In the last post, I showed you a query where a bad plan was chosen because of bad guesses.

In this post, I’m going to see if Batch Mode will help anything.

To do that, we’re going to use an empty temp table with a clustered columnstore index on it.

    CREATE TABLE #t (id INT NOT NULL, 
	                 INDEX c CLUSTERED COLUMNSTORE);

First Up, No Hints


    SELECT   p.*
    FROM     dbo.Posts AS p
    JOIN     dbo.Votes AS v
        ON p.Id = v.PostId
	LEFT JOIN #t ON 1 = 0
    WHERE    p.PostTypeId = 2
    AND      p.CreationDate >= '20131225'
    ORDER BY p.Id;
Deeply Unsatisfying

Though this plan is ~6 seconds faster than the Merge Join equivalent in the last post, that’s not the kind of improvement I’m shooting for.

Remember than the Hash Join plan in Row Mode ran in 2.6 seconds.

The only operator to run in Batch Mode here is the Sort. To be fair, it’s really the only one eligible with the trick we used.

Forcing this plan to run in parallel, we go back to a 27 second runtime with no operators in Batch Mode.

Next Up, Hash Hint


Disappointingly, this gets worse. The Row Mode only plan was 2.6 seconds, and this is 6.8 seconds.

Hush now

The answer to why the Batch Mode plan is 3x slower lies in our Row Mode plan. Let’s look at them head to head.

Pick me!

See that Repartition Streams operator? It literally saves the entire query.

The Batch Mode plan doesn’t get one. Because of that, Bad Things Happen™

Skew Job


Both plans start out with an unfortunate scan of the Posts table.

Sucktown

But in the Row Mode plan, Repartition Streams does exactly what it sounds like it does, and balances things out. The number of rows on each thread is even because of this. Crazy to think about, but threads dividing work up evenly is, like, a good thing.

In the Batch Mode plan, that… doesn’t happen.

Apology dog

 

We have the dreaded serial parallel query. Aside from the Scan of the Votes table, only one thread is ever active across the rest of the query.

This isn’t a limitation of Batch Mode generally, though I suspect it has something to do with why we don’t see Repartition Streams show up.

One limitation of Batch Mode is with Sorts — they are single threaded — a point this particular demo obfuscates, unless they’re a child operator of a Window Aggregate.

Well, Darn


I was expecting some fireworks there.

Maybe 2019 will help?

In the next post, we’ll look at that.

Thanks for reading!



One thought on “Bad Guesses and Bad Choices: Does Tricky Batch Mode Help?

Leave a Reply

Your email address will not be published. Required fields are marked *