SARGability Week: Using Computed Columns To Fix Non-SARGable Queries

Adventure


We’re going to start this week off by using a computed column to fix a non-SARGable query, because there are a few interesting side quests to the scenario.

Here’s the starting query, which has a few different problems:

SELECT
    c = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE LEN(p.Body) < 200
AND   p.PostTypeId IN (1, 2);

Let’s say we’re doing this to audit short questions and answers for quality.

Since SQL Server doesn’t retain any precise data about string column lengths, we don’t have an effective way to implement this search.

Worse, since the Body column is a max datatype, no expression (SARGable or not) can be pushed to the index scan.

Maxed Out


The query plan shows us a full scan of the clustered index where the filters on PostTypeId are applied, and later on a filter operator that applies the len filter:

falter

This is an ugly query, and if it’s one that we were going to make part of a regular review process, we probably don’t want users to sit around waiting 42 seconds on this every single time.

Getaround


To get this query cranking, we need to add a computed column — note that it doesn’t need to be persisted — and index it.

ALTER TABLE dbo.Posts
    ADD BodyLen AS 
        CONVERT
        (
            bigint, 
            LEN(Body)
        );

CREATE INDEX bodied ON dbo.Posts
    (BodyLen, PostTypeId);

Now our query looks like this:

bettered

Prize Money


A lot of people are afraid of computed columns, because they think that they need to be persisted in order to get statistics generated on them, or to index them. You very much do not.

The persisted attribute will write the results of the expression to the clustered index or heap, which can cause lots of locking and logging and trouble.

Indexing the computed column writes the results only to the nonclustered index as it’s created, which is far less painful.

Tomorrow, we’ll look at how we can use temp tables to fix issues with SARGability.

Thanks for reading!



3 thoughts on “SARGability Week: Using Computed Columns To Fix Non-SARGable Queries

  1. Is there a slick way to implement this magic when GetDate() is involved? Many of my queries are calculating the time a transaction is open- so if its closed its easy to Datediff the date the transaction began vs the date it was closed, but if its still open then we use the nondeterministic GetDate(). Using a temp table isn’t an option since I can’t rewrite the query.

    Also when persisting the column you said this can cause pain in locking- I presume that’s a one time expense when the calculation is performed right? Flipping the persisted switch on a big table would be very painful once to calculate everything on the table but afterwards not so bad?

    Thanks!

    1. For the first thing, no, but that’s usually where people use a canary value like 9999-12-31 rather than getdate to work around the issue.

      For the second thing, yeah, it’s a one time pain.

      Thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *