We’re going to start this week off by using a computed column to fix a non-SARGable query, because there are a few interesting side quests to the scenario.
Here’s the starting query, which has a few different problems:
SELECT c = COUNT_BIG(*) FROM dbo.Posts AS p WHERE LEN(p.Body) < 200 AND p.PostTypeId IN (1, 2);
Let’s say we’re doing this to audit short questions and answers for quality.
Since SQL Server doesn’t retain any precise data about string column lengths, we don’t have an effective way to implement this search.
Worse, since the Body column is a max datatype, no expression (SARGable or not) can be pushed to the index scan.
The query plan shows us a full scan of the clustered index where the filters on PostTypeId are applied, and later on a filter operator that applies the len filter:
This is an ugly query, and if it’s one that we were going to make part of a regular review process, we probably don’t want users to sit around waiting 42 seconds on this every single time.
To get this query cranking, we need to add a computed column — note that it doesn’t need to be persisted — and index it.
ALTER TABLE dbo.Posts ADD BodyLen AS CONVERT ( bigint, LEN(Body) ); CREATE INDEX bodied ON dbo.Posts (BodyLen, PostTypeId);
Now our query looks like this:
A lot of people are afraid of computed columns, because they think that they need to be persisted in order to get statistics generated on them, or to index them. You very much do not.
The persisted attribute will write the results of the expression to the clustered index or heap, which can cause lots of locking and logging and trouble.
Indexing the computed column writes the results only to the nonclustered index as it’s created, which is far less painful.
Tomorrow, we’ll look at how we can use temp tables to fix issues with SARGability.
Thanks for reading!