Well Have You Ever
I mean ever really wondered just how long a query of yours waited on stats to automatically update before running?
Have you ever been absolutely, positively sure that those gosh dang darn stats updates were putting your query on the trolley to Slowsville?
Your wishes will be 100% granted in SQL Server 2019.
Now, this wait doesn’t show up when stats are created the first time.
So if you run a query with a where clause on a column that doesn’t already have stats, the initial gather won’t show.
This is only for updates. Not creates.
The following script will take the Votes table, and make a copy of it called Vetos.
Then it’ll create a PK/CX (for some reason?), and run a query against a couple columns that are mostly NULL.
Voting data gets cleaned out of the SO data dump.
SELECT ISNULL(Id, 0) AS Id, PostId, UserId, BountyAmount, VoteTypeId, CreationDate INTO dbo.Vetos FROM dbo.Votes; ALTER TABLE dbo.Vetos ADD CONSTRAINT PK_Vetos_Id PRIMARY KEY CLUSTERED(Id); SELECT TOP 10 * FROM dbo.Vetos WHERE UserId > 0 AND BountyAmount > 0;
The last query is important because it generates the initial stats on both of those columns.
Now let’s put some work into it!
UPDATE v SET v.BountyAmount = 50000 FROM dbo.Vetos AS v WHERE v.BountyAmount IS NULL; UPDATE v SET v.UserId = v.VoteTypeId FROM dbo.Vetos AS v WHERE v.UserId IS NULL;
This table has 52,928,720 rows in it. Not the biggest, but a decent size to maybe have to wait on stats to update.
In separate windows, I’ll run these:
SELECT COUNT(*) AS records FROM dbo.Vetos AS v WHERE v.BountyAmount > 500; SELECT COUNT(*) AS records FROM dbo.Vetos AS v WHERE v.UserId < 16;
They’ll trigger the stats refresh.
Checking in on each session’s wait stats using dm_exec_session_wait_stats, our wild wait appears.
So there you have it. 52 million row stats refreshes take about half a second.
That wasn’t very exciting. Let’s try something else.
If we start from scratch, but instead of letting SQL Server create stats automatically by running a query, let’s create statistics with some funny options, and then update the columns.
CREATE STATISTICS s_b ON dbo.Vetos(BountyAmount) WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON; CREATE STATISTICS s_u ON dbo.Vetos(UserId) WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON;
Now, when we run our select queries, things really slow down.
Quite a bit longer on those.
What would one do if they encountered 15-16 waits on this in real life?
Well, you have some options.
- Update stats asynchronously
- Create stats with no recompute and handle stats updates yourself
- Update statistics more often than you currently do, trying to stay ahead of automatic updates
It’s hard to see this being a really big issue outside of very large tables, and perhaps only on under-powered servers.
Or if someone created statistics with some rather impolite settings.
Thanks for reading!