Local Variables Get Bad Cardinality Estimates In Cursors In SQL Server, Too

Work Off


While helping a client out with weird performance issues, we isolated part of the code that was producing a whole bunch of bad plans.

At the intersection of bad ideas, there was a cursor looping over a table gathering some data points with a local variable in the where clause.

For more background on that, check out these posts:

One might expect the query inside the cursor to be subject to some special rules, but alas, nothing good comes of it.

Eau de Sample


To repro a little bit, we need to create a certainly suboptimal index.

CREATE INDEX p ON dbo.Posts
    (OwnerUserId);

If you have a lot of single key column indexes, you’re probably doing indexing wrong.

Full demo code is at the end because it’s a bit long, but the end result is five query plans that all share the same bad estimate based off the density vector.

The worst of them looks like this:

[SQL Server query plan: Occurling]

And the best of them looks like this:

[SQL Server query plan: Gruntled]

Over and Over


If you’re getting bad guesses like this over and over again in any loop-driven code, local variables might just be to blame.

That guess of 10 rows of course comes from this calculation:

SELECT 
    density = 
        (
            1 / 
            CONVERT
            (
                float, 
                COUNT(DISTINCT p.OwnerUserId)
            )
        ) *
        COUNT_BIG(*)
FROM dbo.Posts AS p;

Which, with a little bit of rounding, gets us to the estimate we see in the query plan:

[SQL Server query plan: hectic]
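
If you need the estimate to respond to the current value of @CurrentId on each pass of the loop, the simplest fix is a statement-level recompile hint on the query inside the cursor. Here's a minimal sketch, using the query from the demo code at the end of this post:

SELECT
    p.PostTypeId,
    c = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @CurrentId
GROUP BY p.PostTypeId
ORDER BY c DESC
OPTION (RECOMPILE);

You pay for a compile on every iteration, so whether that's a good trade depends on how hot the loop runs and how big the queries are.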

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Demo Code


SET STATISTICS XML OFF;
SET NOCOUNT ON;

DECLARE 
    @CurrentId int;

DROP TABLE IF EXISTS #UserIds; 

CREATE TABLE #UserIds 
(
    UserId int PRIMARY KEY CLUSTERED
);

INSERT
    #UserIds WITH(TABLOCK)
(
    UserId
)
SELECT
    u.Id
FROM dbo.Users AS u
WHERE u.Reputation > 850000
ORDER BY u.Reputation DESC;


DECLARE counter CURSOR 
    LOCAL STATIC
FOR
SELECT 
    UserId 
FROM #UserIds;
 
OPEN counter;

FETCH NEXT FROM counter 
    INTO @CurrentId;

WHILE @@FETCH_STATUS = 0
BEGIN 
   
    SET STATISTICS XML ON;
    
    SELECT
        p.PostTypeId,
        c = COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @CurrentId
    GROUP BY p.PostTypeId
    ORDER BY c DESC;
    
    SET STATISTICS XML OFF;
 
FETCH NEXT FROM counter 
    INTO @CurrentId;
END;

CLOSE counter;
DEALLOCATE counter;

 

Why Expressions Are Better Than Local Variables For Performance In SQL Server Queries

Often Enough


I see people writing stored procedures that use local variables either because they read that all functions in a where clause are dangerous, or because they need to perform a calculation on one of the parameters that get passed in.

In nearly all cases, you’re better off… not doing that.

Here are a couple examples of why.

Hard Yes


If you pass in a function like GETDATE, or even nest it inside another function like DATEADD, the optimizer can evaluate it to its current value and make a pretty decent guess based on it.
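
The exact demo query isn't shown in this post, but the shape of it looks something like this, borrowing the dbo.express_yourself table from the stored procedure examples below:

SELECT
    records = 
        COUNT_BIG(*)
FROM dbo.express_yourself AS ey
WHERE ey.some_date >= DATEADD(YEAR, -1, GETDATE());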

[SQL Server query plan: goo]

This is a Good Enough™ guess.

Hard Pass


Once you assign that function to a value, everything gets awkward.
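
As a sketch, the only change from the query above is stashing the expression in a local variable first:

DECLARE
    @start_date datetime = DATEADD(YEAR, -1, GETDATE());

SELECT
    records = 
        COUNT_BIG(*)
FROM dbo.express_yourself AS ey
WHERE ey.some_date >= @start_date;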

[SQL Server query plan: bang bang]

That’s a Pretty Bad™ guess, but not the end of the world. What’s bad is that the guess never changes even if you change the span of time.

Look what happens if we just add one day instead of one year.

[SQL Server query plan: one day at a time]

We get the exact same guess as before — 821,584 rows. Bummer.

Storied Tradition


It's nearly the same thing in stored procedures. What I see more often is people passing in one parameter for a start date, and then using a local variable to figure out an end date.

CREATE OR ALTER PROCEDURE dbo.dangit_bobby
(
    @start_date datetime
)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

DECLARE
    @later datetime = DATEADD(DAY, 1, GETDATE());

    SELECT
        records = 
            COUNT_BIG(*)
    FROM dbo.express_yourself AS ey
    WHERE ey.some_date >= @start_date
    AND   ey.some_date <  @later;

END;
GO

EXEC dbo.dangit_bobby 
    @start_date = '2021-05-19';
GO

[SQL Server query plan: FAR OFF DUDE]

Advanced Calculus


Let’s change how we use the parameter, and put it into some date math in the where clause instead.

CREATE OR ALTER PROCEDURE dbo.dangit_bobby
(
    @start_date datetime
)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

    SELECT
        records = 
            COUNT_BIG(*)
    FROM dbo.express_yourself AS ey
    WHERE ey.some_date >= @start_date
    AND   ey.some_date <  DATEADD(DAY, 1, @start_date);

END;
GO 
EXEC dbo.dangit_bobby 
    @start_date = '2021-05-19';
GO

We get a Much Closer™ estimate. What a lovely day.

[SQL Server query plan: go for both]

Hardest Part


It’s often tempting to take shortcuts like this, especially if you need to reuse the same calculation multiple times.

Now, look, if you stare closely enough at these plans, you’ll notice that they all finish quickly enough. That’s because I have a small table with a wonderful index on the column I care about.

The point here is to show you how bad estimates can turn up even in ideal circumstances when you do things The Wrong Way™.

As long as you’re not wrapping columns in functions like this, you’re probably okay.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Things SQL Server vNext Should Address: Table Variable Modification Performance

Canard


People still tell me things like “I only put 100 rows in table variables”, and think that’s the only consideration for their use.

There are definitely times when table variables can be better, but 100 rows is meaningless.

Even if you put one row in a table variable it can fudge up performance because SQL Server doesn’t know what’s in your table variable. That’s still true in SQL Server 2019, even if the optimizer knows how many rows are in your table variable.

The problem that you can run into, even with just getting 100 rows into a table variable, is that it might take a lot of work to get those 100 rows.

Bernard


I’ve blogged before about workarounds for this problem, but the issue remains that inserts, updates, and deletes against table variables aren’t naturally allowed to go parallel.

The reason why is a bit of a mystery to me, since table variables are all backed by temp tables anyway. If you run this code locally, you’ll see what I mean:

SET NOCOUNT ON;
SET STATISTICS IO ON;
DECLARE @t table(id int);
SELECT * FROM @t AS t;
SET STATISTICS IO OFF;

Over in the messages tab you’ll see something like this:

Table '#B7A53B3E'. Scan count 1, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

Now, look, I'm not asking for the update or delete portions of the query plan to go parallel, but it might be nice if the other child operators could go parallel. That's how things go with regular tables and #temp tables. It would be nice if inserts could go parallel too, but hey.
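
To see it for yourself, here's a minimal sketch comparing the two, assuming the Stack Overflow Posts table used throughout these demos. A plan that modifies a table variable can't use parallelism at all, while SELECT ... INTO a #temp table is at least eligible for it (whether it actually goes parallel depends on cost):

DECLARE @t table (Score int NOT NULL);

/* the whole plan runs in a serial zone */
INSERT @t (Score)
SELECT p.Score
FROM dbo.Posts AS p;

/* eligible for a parallel plan */
SELECT p.Score
INTO #parallel_ok
FROM dbo.Posts AS p;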

Ardbeg


The problem this would solve is one that I see often, usually in vendor code where the choice of which temporary object to use came down to individual developer preference, or someone fell for the meme that table variables are "in memory" or something. Maybe the choice was immaterial at first with low data volume, and over time performance slowly degraded.

If I'm allowed to change things, it's easy enough to replace @table variables with #temp tables, or to use a workaround like the one in the post linked above to improve performance. But when I'm not, clients are often left begging vendors, who aren't always receptive, to make changes.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Things SQL Server vNext Should Address: Common Table Expression Materialization

Repetition Is Everything


I know what you’re thinking: this is another post that asks for a hint to materialize CTEs.

You’re wrong. I don’t want another hint that I can’t add to queries to solve a problem because the code is coming from a vendor or ORM.

No, I want the optimizer to smarten up about this sort of thing, detect CTE re-use, and use one of the New And Improved Spools™ to cache results.

Let’s take a look at where this would come in handy.

Standalone


If we take this query by itself and look at the execution plan, it conveniently shows a single access each of Posts and Users, and a single join between the two.

SELECT
    u.Id AS UserId,
    u.DisplayName,
    p.Id AS PostId,
    p.AcceptedAnswerId,
    TotalScore = 
        SUM(p.Score)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 100000
GROUP BY 
    u.Id, 
    u.DisplayName,
    p.Id, 
    p.AcceptedAnswerId
HAVING SUM(p.Score) > 1

[SQL Server query plan: invitational]

Now, let’s go MAKE THINGS MORE READABLE!!!

Ality


WITH spool_me AS
(
    SELECT
        u.Id AS UserId,
        u.DisplayName,
        p.Id AS PostId,
        p.AcceptedAnswerId,
        TotalScore = 
            SUM(p.Score)
    FROM dbo.Users AS u
    JOIN dbo.Posts AS p
        ON p.OwnerUserId = u.Id
    WHERE u.Reputation > 100000
    GROUP BY 
        u.Id, 
        u.DisplayName,
        p.Id, 
        p.AcceptedAnswerId
    HAVING SUM(p.Score) > 1
)
SELECT
    a.UserId,
    a.DisplayName,
    a.PostId,
    a.AcceptedAnswerId,
    a.TotalScore,
    q.UserId,
    q.DisplayName,
    q.PostId,
    q.AcceptedAnswerId,
    q.TotalScore
FROM spool_me AS a
JOIN spool_me AS q
    ON a.PostId = q.AcceptedAnswerId
ORDER BY a.TotalScore DESC;

Wowee. We really done did it. But now what does the query plan look like?

[SQL Server query plan: oh, you]

There are now two accesses of Posts and two accesses of Users, and three joins (one Hash Join isn’t in the screen cap).

Detection


Obviously, the optimizer knows it has to build a query plan that reflects the CTE being joined.

Since it’s smart enough to do that, it should be smart enough to use a Spool to cache things and prevent the additional accesses.

Comparatively, using a #temp table to simulate a Spool is about twice as fast. Here's the CTE plan:

[SQL Server query plan: double]

Here's the Spool Simulator Plan™:

[SQL Server query plan: professionals]
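
The Spool Simulator™ rewrite isn't shown here, but based on the CTE above it would look roughly like this: materialize the aggregated join into a #temp table once, then self-join that.

SELECT
    u.Id AS UserId,
    u.DisplayName,
    p.Id AS PostId,
    p.AcceptedAnswerId,
    TotalScore = 
        SUM(p.Score)
INTO #spool_me
FROM dbo.Users AS u
JOIN dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE u.Reputation > 100000
GROUP BY 
    u.Id, 
    u.DisplayName,
    p.Id, 
    p.AcceptedAnswerId
HAVING SUM(p.Score) > 1;

SELECT
    a.UserId,
    a.DisplayName,
    a.PostId,
    a.AcceptedAnswerId,
    a.TotalScore,
    q.UserId,
    q.DisplayName,
    q.PostId,
    q.AcceptedAnswerId,
    q.TotalScore
FROM #spool_me AS a
JOIN #spool_me AS q
    ON a.PostId = q.AcceptedAnswerId
ORDER BY a.TotalScore DESC;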

Given the optimizer’s penchant for spools, this would be another chance for it to shine on like the crazy diamond it is.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Things SQL Server vNext Should Address: Local Variable Estimates

And Then The World


I’ve written at length about what local variables do to queries, so I’m not going to go into it again here.

What I do want to talk about are better alternatives to what you currently have to do to fix issues:

  • RECOMPILE the query
  • Pass the local variable to a stored procedure
  • Pass the local variable to dynamic SQL

It’s not that I hate those options, they’re just tedious. Sometimes I’d like the benefit of recompiling with local variables without all the other strings that come attached to recompiling.
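
As an example of the tedium, here's a sketch of the dynamic SQL route, assuming the Stack Overflow Posts table: wrapping the statement in sp_executesql turns the local variable into a real parameter, so it gets sniffed instead of falling back to the density vector.

DECLARE @i int = 22656;

DECLARE @sql nvarchar(MAX) = N'
SELECT
    c = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @i;';

EXEC sys.sp_executesql
    @sql,
    N'@i int',
    @i;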

Hint Me Baby One More Time


Since I'm told people rely on this behavior to fix certain problems, you would probably need a few different places and ways to alter this behavior:

  • Database level setting
  • Query Hint
  • Variable declaration

Database level settings are great for workloads you can’t alter, either because the queries come out of a black box, or you use an ORM and queries… come out of a nuclear disaster area.

Query hints are great if you want all local variables to be treated like parameters. But you may not want that all the time. I mean, look: you all do wacky things and you’re stuck in your ways. I’m not kink shaming here, but facts are facts.

You have to draw the line somewhere and that somewhere is always “furries”.

And then local variables.

It may also be useful to allow local variables to be declared with a special property that will allow the optimizer to treat them like parameters. Something like this would be easy enough:

DECLARE @p int PARAMETER = 1;

Hog Ground


Given that in SQL Server 2019 table variables got deferred compilation, I think this feature is doable.

Of course, it’s doable today if you’ve got a debugger and don’t mind editing memory space.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

When Should You Use Table Variables In SQL Server? When Queries Execute Thousands Of Times A Minute

Screech



Links:

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

USE StackOverflow2013;
EXEC dbo.DropIndexes;
SET NOCOUNT ON;
DBCC FREEPROCCACHE;
GO 

CREATE INDEX whatever 
ON dbo.Posts
    (OwnerUserId) 
INCLUDE
    (Score);
GO 

CREATE OR ALTER PROCEDURE dbo.TempTableTest (@Id INT)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;
    
    CREATE TABLE #t(i INT NOT NULL);

    INSERT 
        #t ( i )
    SELECT 
        p.Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @Id;

END;
GO 

CREATE OR ALTER PROCEDURE dbo.TableVariableTest (@Id INT)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;
    
    DECLARE @t TABLE (i INT NOT NULL);

    INSERT 
        @t ( i )
    SELECT 
        p.Score
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @Id;

END;
GO 


CREATE OR ALTER PROCEDURE dbo.TempTestWrapper (@TestProc sysname)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

DECLARE @rando int = 
    ((ABS(CHECKSUM(NEWID())) % 21195018) + 1); /*this is the max id in posts for SO2013*/

IF @TestProc = N'TempTableTest'
    BEGIN
        EXEC dbo.TempTableTest @rando;
    END;

IF @TestProc = N'TableVariableTest'
    BEGIN
        EXEC dbo.TableVariableTest @rando;
    END;

END;
GO 

/*Testing*/
EXEC dbo.TempTestWrapper @TestProc = 'TempTableTest';
EXEC dbo.TempTestWrapper @TestProc = 'TableVariableTest';

/*

ostress -SSQL2019S -d"StackOverflow2013" -Q"EXEC dbo.TempTestWrapper @TestProc = 'TempTableTest';" -U"ostress" -P"ostress" -q -n300 -r1000 -o"C:\temp\crap"

ostress -SSQL2019S -d"StackOverflow2013" -Q"EXEC dbo.TempTestWrapper @TestProc = 'TableVariableTest';" -U"ostress" -P"ostress" -q -n300 -r1000 -o"C:\temp\crap"

*/

SQL Server Spool Operators Are Just Crappy Temp Tables

But My Tempdb


Using the scenario from yesterday's post as an example of why you might want to rewrite queries with Table Spools in them to use temp tables instead, look at how the optimizer chooses a plan with an Eager Table Spool.

The “Eager” part means the entire set of rows is loaded into a temporary object at once.

[SQL Server query plan: drugas]

That’s a lot of rows, innit? Stick some commas in there, and you might just find yourself staring down the barrel of a nine digit number.

Worse, we spend a long time loading data into the spool, and doing so in a serial zone. There’s no good way to know exactly how long the load is because of odd operator times.

If you recall yesterday’s post, the plan never goes back to parallel after that, either. It runs for nearly 30 minutes in total.
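
The query from that post isn't repeated here, but judging from the rewrite at the bottom of this one, it was shaped roughly like this:

SELECT
    PostId = COUNT_BIG(DISTINCT v.PostId),
    UserId = COUNT_BIG(DISTINCT v.UserId), 
    BountyAmount = COUNT_BIG(DISTINCT v.BountyAmount), 
    VoteTypeId = COUNT_BIG(DISTINCT v.VoteTypeId), 
    CreationDate = COUNT_BIG(DISTINCT v.CreationDate)
FROM dbo.Votes AS v
JOIN dbo.Posts AS p
    ON p.Id = v.PostId;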

Yes Your Tempdb


If you're gonna be using that hunka chunka tempdb anyway, you might as well use it efficiently. Unless batch mode is an option for you, either as Batch Mode On Rowstore or by tricking the optimizer into it, this might be your best bet.

Keep in mind that Standard Edition users have an additional limitation where Batch Mode queries are limited to a DOP of 2, and don’t have access to Batch Mode On Rowstore as of this writing. The DOP limitation especially might make the trick unproductive compared to alternatives that allow for MOREDOP.
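
If you do want to experiment with tricking the optimizer into batch mode, the usual pattern is an empty filtered nonclustered columnstore index with an impossible predicate. This isn't part of the demo here, just a sketch:

/* never matches a row, so it stays empty, but its presence
   makes batch mode available for queries against dbo.Votes */
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_batch_mode_trick
ON dbo.Votes (Id)
WHERE Id = -1 AND Id = -2;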

For example, if we dump that initial join into a temp table, it only takes about a minute to get loaded at a DOP of 8. That is faster than loading data into the spool (I mean, probably. Just look at that thing.).

[SQL Server query plan: sweet valley high]

The final query to do the distinct aggregations takes about 34 seconds.

[SQL Server query plan: lellarap]

Another benefit is that each branch that does a distinct aggregation is largely in a parallel zone until the global aggregate.

[SQL Server query plan: muggers]

In total, both queries finish in about 1:45. A big improvement from nearly 30 minutes relying on the Eager Table Spool and processing all of the distinct aggregates in a serial zone. The temp table here doesn't have that particular shortcoming.

In the past, I’ve talked a lot about Eager Index Spools. They have a lot of problems too, many of which are worse. Of course, we need indexes to fix those, not temp tables.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

SELECT
    v.PostId,
    v.UserId,
    v.BountyAmount,
    v.VoteTypeId,
    v.CreationDate
INTO #better_spool
FROM dbo.Votes AS v
JOIN dbo.Posts AS p
    ON p.Id = v.PostId;

SELECT
    PostId = COUNT_BIG(DISTINCT s.PostId),
    UserId = COUNT_BIG(DISTINCT s.UserId), 
    BountyAmount = COUNT_BIG(DISTINCT s.BountyAmount), 
    VoteTypeId = COUNT_BIG(DISTINCT s.VoteTypeId), 
    CreationDate = COUNT_BIG(DISTINCT s.CreationDate)
FROM #better_spool AS s;

 

What’s Really Different About In-Memory Table Variables In SQL Server?

Kendra, Kendra, Kendra


My dear friend Kendra asked… Okay, look, I might have dreamed this. But I maybe dreamed that she asked what people’s Cost Threshold For Blogging™ is. Meaning, how many times do you have to get asked a question before you write about it.

I have now heard people talking and asking about in-memory table variables half a dozen times, so I guess here we are.

Talking about table variables.

In memory.

Yes, Have Some


First, yes, they do help relieve tempdb contention if you have code that executes with both high concurrency and high frequency. And by high, I mean REALLY HIGH.

Like, Snoop Dogg high.

Because you can't get rid of in-memory stuff, I'm creating a separate database to test in.

Here’s how I’m doing it!

CREATE DATABASE trash;

ALTER DATABASE trash 
ADD FILEGROUP trashy 
    CONTAINS MEMORY_OPTIMIZED_DATA ;
     
ALTER DATABASE trash 
ADD FILE 
(
    NAME=trashcan, 
    FILENAME='D:\SQL2019\maggots'
) 
TO FILEGROUP trashy;

USE trash;

CREATE TYPE PostThing 
AS TABLE
(
    OwnerUserId int,
    Score int,
    INDEX o HASH(OwnerUserId)
    WITH(BUCKET_COUNT = 100)
) WITH
(
    MEMORY_OPTIMIZED = ON
);
GO

Here’s how I’m testing things:

CREATE OR ALTER PROCEDURE dbo.TableVariableTest(@Id INT)
AS
BEGIN

    SET NOCOUNT, XACT_ABORT ON;
    
    DECLARE @t AS PostThing;
    DECLARE @i INT;

    INSERT @t 
        ( OwnerUserId, Score )
    SELECT 
        p.OwnerUserId,
        p.Score
    FROM Crap.dbo.Posts AS p
    WHERE p.OwnerUserId = @Id;

    SELECT 
        @i = SUM(t.Score)
    FROM @t AS t
    WHERE t.OwnerUserId = 22656
    GROUP BY t.OwnerUserId;

    SELECT 
        @i = SUM(t.Score)
    FROM @t AS t
    GROUP BY t.OwnerUserId;

END;
GO

Hot Suet


So like, the first thing I did was use SQLQueryStress to run this on a bunch of threads, and I didn't see any tempdb contention.

So that’s cool. But now you have a bunch of stuff taking up space in memory. Precious memory. Do you have enough memory for all this?

Marinate on that.

Well, okay. Surely they must improve on all of the issues with table variables in some other way:

  • Modifications can’t go parallel
  • Bad estimates
  • No column level stats

But, nope. No they don’t. It’s the same crap.

Minus the tempdb contention.

Plus taking up space in memory.

But 2019


SQL Server 2019 does offer the same table-level cardinality estimate for in-memory table variables as it does for regular table variables.

If we flip the database compatibility level to 150, deferred compilation kicks in. Great. Are you on SQL Server 2019? Are you using compatibility level 150?
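
Flipping between the two looks like this, against the trash database created above:

ALTER DATABASE trash SET COMPATIBILITY_LEVEL = 140; /* no deferred compilation */
ALTER DATABASE trash SET COMPATIBILITY_LEVEL = 150; /* deferred compilation kicks in */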

Don’t get too excited.

Let’s give this a test run in compat level 140:

DECLARE @i INT = 22656;
EXEC dbo.TableVariableTest @Id = @i;

[SQL Server query plan: everything counts in large amounts]

Switching over to compat level 150:

[SQL Server query plan: yeaaahhhhh]

Candy Girl


So what do memory optimized table variables solve?

Not the problem that table variables in general cause.

They do help you avoid tempdb contention, but you trade that off for them taking up space in memory.

Precious memory.

Do you have enough memory?

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Tracking Row Changes With Temporal Columns In SQL Server

Extra, Extra


Temporal tables are cool, but keeping all that history can be stressful. Change Tracking adds overhead to every transaction, and requires Snapshot Isolation to be successful. Change Data Capture can also run into problems scanning the transaction log, depending on transaction volume, etc.

Change Data Capture is also a little unfortunate in that it doesn't track schema changes like adding or dropping columns, or changing column data types. Change Tracking doesn't either; it just tracks the keys of what changed when data is modified.

Temporal Tables are different: they'll add new columns to the history table and drop removed ones as they change in the base table. They'll also push data type changes across. But the history table won't track the various permutations. It just mirrors the current state.

Just The Columns, Ma’am


Let’s say your needs aren’t robust enough to need any one of those highly specialized features, or even triggers to move data around when it changes.

You’re perfectly free and able to add the tracking columns that temporal tables use to your base tables, but adding them is far from free. When I added them to the 17 million row Posts table, it took about 40 seconds. My laptop doesn’t suck, either.

To show you a little how it works, let’s create a copy of the Votes table from Stack Overflow.

CREATE TABLE dbo.Votes_Tracked
(
    Id int NOT NULL,
    PostId int NOT NULL,
    UserId int NULL,
    BountyAmount int NULL,
    VoteTypeId int NOT NULL,
    CreationDate datetime NOT NULL,
    LastModified datetime2 GENERATED ALWAYS AS ROW START NOT NULL
        CONSTRAINT DF_LastModified DEFAULT (SYSDATETIME()),
    JunkDate datetime2 GENERATED ALWAYS AS ROW END HIDDEN NOT NULL
        CONSTRAINT DF_JunkDate DEFAULT ('9999-12-31 23:59:59.9999999'),
    PERIOD FOR SYSTEM_TIME (LastModified, JunkDate),
    CONSTRAINT PK_Votes_Id
        PRIMARY KEY CLUSTERED (Id ASC)
);
GO

Note that you need two columns to define the “period for system time”, and one of them will always be useless. That’s why I called it JunkDate, and not, like, whatever. But the good news is you can define that column as HIDDEN so that it doesn’t show up in all your queries.
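
If you created the table without HIDDEN and regret it later, you can add it after the fact. A quick sketch:

ALTER TABLE dbo.Votes_Tracked
    ALTER COLUMN JunkDate ADD HIDDEN;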

Now we can stick some data in there and see how it works.

INSERT 
    dbo.Votes_Tracked WITH (TABLOCK)
(
    Id,
    PostId,
    UserId,
    BountyAmount,
    VoteTypeId,
    CreationDate
)
SELECT
    Id,
    PostId,
    UserId,
    BountyAmount,
    VoteTypeId,
    CreationDate
FROM StackOverflow2013.dbo.Votes AS v
WHERE v.CreationDate >= '20130101';

Looking Around


The table data looks like this:

[Screenshot: woah man]

If we run an update:

UPDATE v
    SET v.BountyAmount = 9999
FROM dbo.Votes_Tracked AS v
WHERE v.VoteTypeId = 7;

[Screenshot: temporally yours]

Note that these screen caps were taken without the HIDDEN keyword added to the table definition: that was an afterthought recommended by my friend Peter.

Whatabouts?


Of course, if you remove rows from the table, they’re just gone. You’d still need a trigger to cover deletes, if you need to track those.
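
If deletes matter to you, a bare-bones AFTER DELETE trigger gets the job done. A sketch only; the audit table here (dbo.Votes_Tracked_Deletes) is hypothetical and not part of the demo:

CREATE TABLE dbo.Votes_Tracked_Deletes
(
    Id int NOT NULL,
    DeletedOn datetime2 NOT NULL
        CONSTRAINT DF_DeletedOn DEFAULT (SYSDATETIME())
);
GO

CREATE OR ALTER TRIGGER dbo.Votes_Tracked_Deletes_Audit
ON dbo.Votes_Tracked
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;

    /* deleted is the pseudo-table of removed rows */
    INSERT dbo.Votes_Tracked_Deletes (Id)
    SELECT d.Id
    FROM deleted AS d;
END;
GO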

And if you want to remove those columns later, it takes a little bit of tweaking.

ALTER TABLE dbo.Votes_Tracked DROP CONSTRAINT DF_LastModified, DF_JunkDate;
ALTER TABLE dbo.Votes_Tracked DROP COLUMN JunkDate, LastModified;

Msg 13588, Level 16, State 1, Line 63
Column 'JunkDate' in table 'Crap.dbo.Votes_Tracked' cannot be dropped because it is a part of period definition.

Of course, the table isn’t system versioned, so this command will also fail:

ALTER TABLE dbo.Votes_Tracked SET (SYSTEM_VERSIONING = OFF);

Msg 13591, Level 16, State 1, Line 66
SYSTEM_VERSIONING is not turned ON for table 'Crap.dbo.Votes_Tracked'.

If you want to remove them, you’ll need to use this:

ALTER TABLE dbo.Votes_Tracked DROP PERIOD FOR SYSTEM_TIME;

Now you’ll be able to remove them.

Is This A Good Idea?


Well, it depends on what you need. If you just need to know when a row changed, and you don’t need to know what changed or who changed it, it can work in a more automated way than triggers. It does require an additional column, which isn’t ideal, but it’s not a gigantic column, and you don’t need to worry about indexing it because the data is junk.

I haven’t found any ridiculous downsides to this, but I’ll keep this updated if I do.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Don’t Be Afraid Of tempdb In SQL Server

It’s There For A Reason


When tuning queries, one pattern I see over and over again is people running crazy-long queries. Maybe they worked well-enough at some point, but over the years they just kept getting slower and slower.

Sometimes there are comments, and other times there’s enough domain knowledge on the call to understand how a query ended up in the shape it’s in. One persistent idea is that tempdb is something to be avoided. Either because it was “slow” or to avoid contention.

Granted, if a query has been around long enough, these may have been valid concerns at some point. In general though, temp tables (the # kind, not the @ kind) can be quite useful when query tuning.

You Might Be Using It, Anyway


Even if there’s some rule against directly using temp tables, queries can end up using tempdb by the caseload anyway.

Consider that Spool operators explicitly execute in tempdb, any spills go to tempdb, and work tables used in a number of circumstances live in tempdb. The bigger and more complicated your queries are, the more likely you are to run into cases where the optimizer spools, spills, or uses some other workspace area in tempdb in your query plan.

Worse, the optimizations available for temp tables aren't available to those on-the-fly objects, and you lose the ability to take further action, like indexing your temp tables.
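
For example, if the rows end up in a #temp table instead of a spool or work table, you can shape them however you like. A quick sketch, assuming the Stack Overflow Posts table:

SELECT
    p.OwnerUserId,
    p.Score
INTO #posts
FROM dbo.Posts AS p;

/* spools and work tables never get this treatment */
CREATE CLUSTERED INDEX c 
    ON #posts (OwnerUserId);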

It’s Often Easier Than Other Options


Many times when tuning queries, I’ll be puzzled by the optimizer’s choices. Sometimes it’s join type, other times it’s join order, or something else. Perhaps the most common reason is some misestimation, of course.

Query and index hints are great to experiment with, but are often unsatisfying as permanent fixes. I’m not saying to never use them, but you should explore other options first. In other words, keep temp tables on the table.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.