Sneaky Ordering With Row Numbers

Puzzle Pieces


I was investigating a query slowdown recently, and came across something kind of odd with windowing functions and order by.

Let’s talk about these three queries:

SELECT u.DisplayName,
       ROW_NUMBER() OVER (ORDER BY u.UpVotes) AS UpVotesWhatever, --UpVotes first
       ROW_NUMBER() OVER (ORDER BY u.DownVotes) AS DownVotesWhatever --DownVotes second
FROM dbo.Users AS u
WHERE u.Reputation > 100000
ORDER BY UpVotes; --Order by UpVotes

SELECT u.DisplayName,
       ROW_NUMBER() OVER (ORDER BY u.UpVotes) AS UpVotesWhatever, --UpVotes first
       ROW_NUMBER() OVER (ORDER BY u.DownVotes) AS DownVotesWhatever --DownVotes second
FROM dbo.Users AS u
WHERE u.Reputation > 100000
ORDER BY u.DownVotes; --Order by DownVotes

SELECT u.DisplayName,
       ROW_NUMBER() OVER (ORDER BY u.DownVotes) AS DownVotesWhatever, --DownVotes first
       ROW_NUMBER() OVER (ORDER BY u.UpVotes) AS UpVotesWhatever --UpVotes second
FROM dbo.Users AS u
WHERE u.Reputation > 100000
ORDER BY UpVotes; --Order by UpVotes

Goings On


If we’re going to generate row numbers on these columns, we need to sort them.

I know and you know, we can add indexes to put column data in the order we want it in, and that’ll cut down on the amount of work Our Serverâ„¢ has to do to execute this query. But we can’t just index everything, that’d be insane. I know because I’ve seen your servers, and I’ve seen you try to do that.

Plus, they just get fragmented anyway.

Here are the execution plans. This is a big picture, because I want you to spot the difference.

get big

Fascination Street


That first plan has an extra Sort operator in it. See it up there? Right next to the Select operator?

shame on you

That sort is ordering by UpVotes ascending, which is a shame because we’ve already done that once. That sort doesn’t occur in the second two plans, because the row number function has already sorted data by them. If the optimizer were a little smarter here, it could reorder the sequence it generates row numbers in to avoid that, but it doesn’t.

If we rewrite the query to do that on our own, the data ends up in the right order. In case you’re wondering, we get the same results referencing the row numbers in the order by instead of the underlying column:

SELECT u.DisplayName,
       ROW_NUMBER() OVER (ORDER BY u.UpVotes) AS UpVotesWhatever, --UpVotes first
       ROW_NUMBER() OVER (ORDER BY u.DownVotes) AS DownVotesWhatever --DownVotes second
FROM dbo.Users AS u
WHERE u.Reputation > 100000
ORDER BY UpVotesWhatever; --Order by UpVotesWhatever

SELECT u.DisplayName,
       ROW_NUMBER() OVER (ORDER BY u.UpVotes) AS UpVotesWhatever, --UpVotes first
       ROW_NUMBER() OVER (ORDER BY u.DownVotes) AS DownVotesWhatever --DownVotes second
FROM dbo.Users AS u
WHERE u.Reputation > 100000
ORDER BY DownVotesWhatever; --Order by DownVotesWhatever

SELECT u.DisplayName,
       ROW_NUMBER() OVER (ORDER BY u.DownVotes) AS DownVotesWhatever, --DownVotes first
       ROW_NUMBER() OVER (ORDER BY u.UpVotes) AS UpVotesWhatever --UpVotes second
FROM dbo.Users AS u
WHERE u.Reputation > 100000
ORDER BY UpVotesWhatever; --Order by UpVotesWhatever

Thanks for reading!



2 thoughts on “Sneaky Ordering With Row Numbers

  1. Heh… and people continue to say that the order in which the query is built doesn’t matter. I never took the time to try to demonstrably prove that it does. This is certainly demonstrable proof that it does. Thanks, Eric.

Leave a Reply

Your email address will not be published. Required fields are marked *