To illustrate this problem a little bit better, I need to use a slightly different query.
While a tidy group by was nice to illustrate some of the simpler points about lookups, it overly complicates things when we want to involve other columns.
The second we need anything else in the select or order by portions of a query, we need to apply aggregates to them, or include them in the group by.
And you see, once you set up a query to return the TOP N rows, there’s an expectation that users get to choose the order they start seeing rows in. As long as we stick to columns whose ordering is supported by an index, things will be pretty stable.
Once we go outside that, a TOP can be rough on a query.
Order By CreationDate
Even if we order by CreationDate in descending order, with the index created in ascending order, we don’t need an explicit sort operation to put data in order.
SELECT TOP (1000) p.CreationDate, p.OwnerUserId, p.Score FROM dbo.Posts AS p WHERE p.CreationDate >= '20131015' ORDER BY p.CreationDate DESC;
There are some additional possibilities for this kind of thing with multi-key indexes that we’ll talk about later in the series, but for now this is a good enough illustration of indexes putting data in order.
Order By Score
Let’s ask for data in a different order now. Score is a convenient villain, because there are lots of times when you might want to see things by a highest whatever metric is commendable to be high. Like not blood pressure, probably.
SELECT TOP (1000) p.CreationDate, p.OwnerUserId, p.Score FROM dbo.Posts AS p WHERE p.CreationDate >= '20131015' ORDER BY p.Score DESC;
Without an index that has Score in the key, we need to physically put the data in order to fit the requirements of the query. Note that in this case, the optimizer no longer sees any benefit to using our nonclustered index.
Why Do We Care About Sorts?
Without jumping too far ahead, Sorts need extra memory to run.
You know, that stuff you cache data in, and you don’t have enough of already?
Yeah, that memory.
The same memory that gives you a conniption every time PLE fluctuates.
How much they ask for is going to depend on:
- How much memory you have
- What max server memory is set to
- The size of the data you need to sort
If we aren’t able to get enough memory, or if we don’t ask for enough up front, data could spill from memory to disk.
When spills get big enough, they can sometimes cause performance issues.
This is a good time to start talking a little bit more about indexes, so we can understand more about how they work and help us solve problems.
Thanks for reading!
For the entire month of August, you can get 75% off my recorded video training with the coupon code “cruelsummer”