SQL Troubleshooting: Searching the Plan Cache for Missing Indexes and Execution Statistics
I've always found basic administration of SQL Server to be fairly straight-forward, but for a long while, I felt very uneasy about troubleshooting SQL Server performance issues. Any DBA knows that the most productive periods of learning arise when there are problems. For the past few months, I've been trying to track down a problem that I believe is hardware-related. Instead of keeping all my eggs in one basket, I've also been combing through DMV data to optimize the current database code.
One of the first things many people will look for are missing indexes. There are some great DMVs that will show you what indexes are suggested, but I wanted to know which queries those indexes were meant to help. I didn't want to create an index on a table if the query it's assisting runs infrequently or doesn't have a direct benefit to a user. I decided to search the internet for suggestions on how to comb through the Plan Cache for query plans reporting a missing index.
I immediately found a script floating around to do just what I wanted. I'm not sure who the original author of this code is, but I thank them for helping me on my quest. The script was able to run through the plan cache and report to me the query text and the tables needing the index. However, I also wanted to know what stored procedures were responsible. Having the direct piece of code is helpful to know where to focus, but if it is being called by a stored procedure, it's not helpful to know which one. Luckily, adding the required data was accomplished fairly easily by including more columns that were available in the DMVs.
It is important to set your query window to the context of the database you want to look at, so that the Object ID and Name resolutions will work properly. Any matching queries that are ad-hoc or not tied back to a stored procedure will just show NULL for these values. I'm also inserting the results into a table in MSDB instead of using a temp table like the original code. Depending on the size of your plan cache, this may take quite a while to run. My plan cache is 20GB and takes 30 minutes. I run it at night, but I have not observed any ill-effect on the database. In this example, I've limited the query plans to those with an execution count > 10. This removes the vast majority of the plans that are frequently recompiled. You will likely want to see everything at least once, so you can comment out the
WHERE execution_count >10 as needed.
1USE [dbName] 2GO 3 4WITH XMLNAMESPACES 5 (DEFAULT 'https://schemas.microsoft.com/sqlserver/2004/07/showplan') 6 7SELECT query_plan, 8 objectId, -- Stored Procedure Object ID (Added) 9 OBJECT_NAME(objectid) AS objectName, -- Stored Procedure Name (Added) 10 n.value('(@StatementText)', 'VARCHAR(4000)') AS sql_text, 11 n.value('(//MissingIndexGroup/@Impact)', 'FLOAT') AS impact, 12 DB_ID(REPLACE(REPLACE(n.value('(//MissingIndex/@Database)', 'VARCHAR(128)'),'[',''),']','')) AS database_id, 13 OBJECT_ID(n.value('(//MissingIndex/@Database)', 'VARCHAR(128)') + '.' + 14 n.value('(//MissingIndex/@Schema)', 'VARCHAR(128)') + '.' + 15 n.value('(//MissingIndex/@Table)', 'VARCHAR(128)')) AS OBJECT_ID, 16 n.value('(//MissingIndex/@Database)', 'VARCHAR(128)') + '.' + 17 n.value('(//MissingIndex/@Schema)', 'VARCHAR(128)') + '.' + 18 n.value('(//MissingIndex/@Table)', 'VARCHAR(128)') 19 AS statement, 20 ( SELECT DISTINCT c.value('(@Name)', 'VARCHAR(128)') + ', ' 21 FROM n.nodes('//ColumnGroup') AS t(cg) 22 CROSS APPLY cg.nodes('Column') AS r(c) 23 WHERE cg.value('(@Usage)', 'VARCHAR(128)') = 'EQUALITY' 24 FOR XML PATH('') 25 ) AS equality_columns, 26 ( SELECT DISTINCT c.value('(@Name)', 'VARCHAR(128)') + ', ' 27 FROM n.nodes('//ColumnGroup') AS t(cg) 28 CROSS APPLY cg.nodes('Column') AS r(c) 29 WHERE cg.value('(@Usage)', 'VARCHAR(128)') = 'INEQUALITY' 30 FOR XML PATH('') 31 ) AS inequality_columns, 32 ( SELECT DISTINCT c.value('(@Name)', 'VARCHAR(128)') + ', ' 33 FROM n.nodes('//ColumnGroup') AS t(cg) 34 CROSS APPLY cg.nodes('Column') AS r(c) 35 WHERE cg.value('(@Usage)', 'VARCHAR(128)') = 'INCLUDE' 36 FOR XML PATH('') 37 ) AS include_columns 38INTO msdb.dbo.Missing_Indexes 39FROM 40( 41 SELECT query_plan, objectid 42 FROM ( 43 SELECT DISTINCT plan_handle 44 FROM sys.dm_exec_query_stats WITH(NOLOCK) WHERE execution_count > 10 -- Limit the plans as necessary to reduce run times (Added). 45 ) AS qs 46 OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) tp 47 WHERE tp.query_plan.exist('//MissingIndex')=1 48) AS tab --(query_plan) 49CROSS APPLY query_plan.nodes('//StmtSimple') AS q(n) 50WHERE n.exist('QueryPlan/MissingIndexes') = 1 51 52-- Trim trailing comma from lists 53UPDATE msdb.dbo.Missing_Indexes 54SET equality_columns = LEFT(equality_columns,LEN(equality_columns)-1), 55 inequality_columns = LEFT(inequality_columns,LEN(inequality_columns)-1), 56 include_columns = LEFT(include_columns,LEN(include_columns)-1)
I'm using a second query to join the missing index information back to the query execution information, instead of trying to handle it all at once. This just allows me to keep it a bit more modular and avoid making the code even more complicated than it is.
At this point, you can query the Missing_Indexes table as you wish, but you will need the following query to join it to the stats DMV.
1SELECT DISTINCT 2 ( ps.total_worker_time / ps.execution_count ) / 1000 AS AvgMS , 3 ps.max_worker_time / 1000 AS MaxMS , 4 ps.execution_count , 5 mi.objectid , 6 mi.objectName , /*mi.sql_text,*/ 7 mi.impact , 8 mi.object_id , 9 mi.statement , 10 mi.equality_columns , 11 mi.inequality_columns , 12 mi.include_columns 13FROM [msdb].[dbo].[Missing_Indexes] mi 14 JOIN works.sys.dm_exec_procedure_stats ps ON mi.objectid = ps.object_id 15WHERE ( ( ps.total_worker_time / ps.execution_count ) / 1000 > 100 16 AND ps.max_worker_time / 1000 > 100 17 ) 18ORDER BY avgms DESC
There are obviously a number of ways you can run this second query, depending on your needs. I found it useful to remove the SQL_Text column and use a DISTINCT since there are likely to be a number of results. In my example, I limited the queries to those with an average execution time of > 100ms and also the max execution time of > 100ms. If dealing with a large amount of information, I would rather focus on the queries that take longer to run, but even fast-running queries can be tuned if they execute a large amount of times.
So by now you have a nice little list of stored procedures, and maybe some ad-hoc queries, that might need some investigation. Feel free to expand your filtering to exclude the ones with a lower impact or if you feel that the execution counts are not high enough to worry about. With this new technique, and the use of these helpful DMVs, I am much better suited to finding performance issues quickly.
As an example, a few days ago, a user emailed me with the infamous "hey the system is slow sometimes" email. Instead of letting them know that their information is less than helpful, I decided to just query the execution stats for the slowest running procedures. I quickly found the culprit -- a stored procedure that averaged 9 seconds to run on a particular screen of the application. I was then able to see that it had a missing index, which I suggested to the developers.