Starting SQL: Measuring Wait Stats In SQL Server

Done Started Something

If you talk to anyone who bills by the hour for SQL Server, one of the first things they’ll ask you is what your wait stats are.

It’s not a bad question; it kills some time, and it lets them gauge how engaged you are with your server.

Waits can sometimes indicate a severe bottleneck
Waits can also be “high” on a server that has perfectly acceptable throughput

There are also some problems with wait stats, and performance counters in general: there’s no direct correlation between the counters and the queries that caused them.

Sure, Query Store in 2017+ tracks high level wait stats for queries, but I don’t run into a lot of people using Query Store still.

The other trouble with performance counters is that they just aggregate over time, with no partitioning either by database or time.

Brooklyn Basement

Some things to keep in mind when you’re looking at wait stats:

How they compare to server uptime
How many times the wait occurred
How long each occurrence took to complete on average

If your script or monitoring tool doesn’t report this stuff, it’s leaving some really important stuff out of the picture.

The relationship to server up-time is important, because if all you have to work off of is percentages, you have no idea if your server is constantly grinding on these waits, or if they only happened for a few minutes.

As a thought experiment: say 100% of your waits are on PAGEIOLATCH_SH, and you have 2 hours total of those waits.

How does your opinion of those waits change if the server has been up for:

1 hour
4 hours
24 hours
96 hours

Typically, the longer a server has been up, and the less concentrated waits become, the less of a focal point they should be. There’s no magic ratio here, but for me, the closer wait times are to up time, the more I may want to look at them.

Longs and Shorts

Yes, there can be a mix of these, but there’s still two basic ways that wait stats rack up.

Lots of waits that happen pretty quickly
Fewer waits that last a long time

Obviously, the source of those waits will impact how much we care about them. Here are some common examples:

You wait often on storage, but the waits are short (say <10 ms)
You don’t wait often on locks, but the waits are long (say > 10,000 ms)

It’s pretty trivial to look at your wait stats

Running queries store them in sys.dm_os_waiting_tasks
In 2016+, session waits are in sys.dm_exec_session_wait_stats
Completed waits are stored in sys.dm_os_wait_stats

I’m not gonna reinvent any wheels here, but I will reiterate a couple things:

You can see which queries are generating waits while they’re running, but…
If you’re not constantly logging that, it disappears on you
Once waits are completed, they are faceless and blameless
Waits in general can be far less valuable when workloads are inconsistent

Attentione!

Tomorrow, I’ll talk through some of the most common waits, and first-line ways to start solving them.

They may not solve every single problem, but they’re generally a good place to start.

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Done Started Something

Brooklyn Basement

Longs and Shorts

Attentione!

Going Further

Related Posts