Documentation for dm_db_missing_index_group_stats_query

No, It’s New


When I was checking out early builds of SQL Server 2019, I noticed a new DMV called dm_db_missing_index_group_stats_query, that I thought was pretty cool.

It helped you tie missing index requests to the queries that requested them. Previously, that took a whole lot of heroic effort, or luck.

With this new DMV, it’s possible to combine queries that look for missing indexes with queries that look for tuning opportunities in the plan cache or in Query Store.

It seems to tie back to dm_db_missing_index_groups, on the index_group_handle column in this DMV joined to the group handle column in the new DMV.

If you’re wondering why I’m not giving you any code samples here, it’s because I’m going to get some stuff built into sp_BlitzIndex to take advantage of it, now that it’s documented.

Special thanks to William Assaf (b|t) for helping to get this done.

Thanks for reading!

Why Microsoft Should Get Rid Of The Plan Cache And Use Query Store Instead

Used To Love Her


I used to think the plan cache was so cool.

  • You can find queries that aren’t good there
  • Plans are full of details (and XML)
  • Supporting DMVs give you extra insights about resource usage

But most of the time now, I’m totally frustrated with it.

It clears out a lot, plans aren’t there for some queries, and the plans that are there can be very misleading.

Can you really tell someone what their worst performing queries are when everything in there is from the last 15 minutes?

No.

Why I Like Query Store Better


Plans end up there for more definite amounts of time, even if you stick a recompile hint on them.

Plus, you get a whole lot more information about what went on with all the different query metrics. As far as a source of “what happened and when”, it’s much more reliable. It’s almost like having a monitoring tool in SQL Server.

Do you trust your plan cache to tell you what was slow more than a couple days ago?

One really big bonus is that it’s portable, and it’s per-database. You can backup or clone a database so someone else can easily look at it somewhere else.

On top of that, it’s more usable: there are built in reports, a graphical interface, settings you can tweak, and plans you can force with a couple clicks.

You could solve a lot of frustrations by moving things here.

Microsoft Has Work To Do, Though


There have been a number of issues, and unanswered user requests for Query Store over the years, and not all of the associated DMVs have full parity.

Querying the data also needs to be made less prone to performance issues. And, living in the future, it’s a much more sensible location for the last known actual plan.

A nice to have would be predicate normalization for identical and morally equivalent queries, with totals and averages for resource usage, so you could group things together. Especially for morally equivalent queries, this would allow plan forcing where the queries get different ids.

Some monitoring tools have this in their interface, and the number of times I’ve been able to track down outliers within a group of executions to solve a problem is, well, at least memorable to me.

I wish more people had monitoring tools.

Thanks for reading!

For Better Query Store Adoption, Make Querying It Easier

Mama, I Tried


Back when I first wrote sp_BlitzQueryStore, I was totally enamored with Query Store.

Like the plan cache, but better. History. Different plans for the same query. Not disturbed by memory pressure or restarts.

Then I waited patiently to find a client on 2016 using it.

And waited, and waited, and waited.

And finally, some came along.

Slow Pokes And No Pokes


When I ran it, it took forever. Not even the XML part. The XML part was fast.

Gathering the initial set of data was slow.

With some time to experiment and dig in, I found that the IN_MEM tables cause significant performance issues when:

  • Query Store is actively logging data
  • Query Store is > 25 MB or so

Yes, children, in memory tables can be slow, too.

The Problem


Let’s take a couple simple queries against Query Store tables:

SELECT TOP 10 *
FROM sys.query_store_runtime_stats AS qsrs
WHERE qsrs.avg_cpu_time >= 500000
AND   qsrs.last_execution_time >= DATEADD(DAY, -1, GETDATE())
ORDER BY qsrs.avg_cpu_time DESC;

SELECT TOP 10 *
FROM sys.query_store_plan AS qsp
WHERE qsp.query_plan IS NOT NULL
AND   qsp.last_execution_time >= DATEADD(DAY, -1, GETDATE())
ORDER BY qsp.last_execution_time DESC;

The first query runs for 10 seconds, with the entirety of the time spent filtering data out of the IN_MEM table:

Ho hum.

The second query is even worse, at nearly 2 minutes:

Filtering on the 1
Fingerling on the floor

“Unrealistic”


I know, this configuration is probably unsupported because I used SELECT * or something.

I wrote this query hoping to quickly get the worst plans by a specific metric.

WITH the_pits
    AS
     (
         SELECT   TOP ( 101 )
                  qsrs.plan_id,
                  qsp.query_id,
                  qsrs.avg_duration / 100000. AS avg_duration_s,
                  qsrs.avg_cpu_time / 100000. AS avg_cpu_time_s,
                  qsrs.avg_query_max_used_memory,
                  qsrs.avg_logical_io_reads,
                  qsrs.avg_logical_io_writes,
                  qsrs.avg_tempdb_space_used,
                  qsrs.last_execution_time,
                  /*
                  You can stick any of the above metrics in here to
                  find offenders by different resource abuse
                  */
                  MAX(qsrs.avg_cpu_time) OVER
                  ( 
                      PARTITION BY 
                          qsp.query_id 
                      ORDER BY
                          qsp.query_id
                      ROWS UNBOUNDED PRECEDING
                  ) AS n
         FROM     sys.query_store_runtime_stats AS qsrs
         JOIN     sys.query_store_plan AS qsp
             ON qsp.plan_id = qsrs.plan_id
         WHERE    qsrs.avg_duration >= ( 5000. * 1000. )
         AND      qsrs.avg_cpu_time >= ( 1000. * 1000. )
         AND      qsrs.last_execution_time >= DATEADD(DAY, -7, GETDATE())
         AND      qsp.query_plan IS NOT NULL
         /*
         Don't forget to change this to same thing!
         */
         ORDER BY qsrs.avg_cpu_time DESC
     )
SELECT   p.plan_id,
         p.query_id,
         p.avg_duration_s,
         p.avg_cpu_time_s,
         p.avg_query_max_used_memory,
         p.avg_logical_io_reads,
         p.avg_logical_io_writes,
         p.avg_tempdb_space_used,
         p.last_execution_time,
         qsqt.query_sql_text, 
         TRY_CONVERT(XML, qsp.query_plan) AS query_plan
FROM     sys.query_store_plan AS qsp
JOIN     the_pits AS p
    ON p.plan_id = qsp.plan_id
JOIN     sys.query_store_query AS qsq
    ON qsq.query_id = qsp.query_id
JOIN     sys.query_store_query_text AS qsqt
    ON qsq.query_text_id = qsqt.query_text_id
ORDER BY p.n DESC;

 

It works pretty well. Sometimes.

Other times, it runs for 4.5 minutes.

I know what you’re thinking: “Erik, you’re doing all sorts of crazy stuff in there. You’re making it slow.”

But none of the crazy stuff I’m doing is where the slowdown is.

It’s all in the same stuff I pointed out in the simpler queries.

12.5 seconds…
FOUR MINUTES

Testing, testing


I can’t stress how much I want Query Store to be successful. I absolutely love the idea.

But it just wasn’t implemented very well. Simple filtering against the data takes forever.

And yes, you can have NULL query plans for some reason. That’s rich.

🤷‍♂️🤷‍♀️🤷‍♂️🤷‍♀️🤷‍♂️🤷‍♀️🤷‍♂️🤷‍♀️🤷‍♂️

Usability issues don’t stop there. You can hit weird server performance issues, and reports are broken.

The irony of needing to tune queries so you can find queries to tune is ironic.

I’m nearly sure of it.

Thanks for reading!