How to view percent done for a long executing query

If you’ve ever wondered when a long running query was going to complete there is now a way to get some visibility into it.  Starting in SQL Server 2016 SP1 there is a lightweight query profiler that can be enabled with extended events or by enabling the 7412 trace flag.  When you enable this trace flag the sys.dm_exec_query_profiles DMV will start exposing information regarding details by node in an execution plan of where rows are being processed as well as logical IOs as a query executes.

First things first

For this to work you will need to enable the lightweight query profiler with trace flag 7412.  You should enable this globally.

Gotchas

If your system is not CPU constrained enabling this should probably not have any impact – the information I’ve read is this has about a 2% overhead with TPC benchmarks.  In SQL Server 2019 this will just be enabled by default.  I did find a KB that indicates 7412 potentially causes an issue with sys.dm_exec_query_statistics_xml, but in this example we don’t use the DMF.  I opened a ticket with Microsoft the other day and they confirmed this does not affect sys.dm_exec_query_profiles (which is what we use in this post).

How this works

To estimate percent done you can compare the estimate row counts versus the actual row counts for each node in an execution plan as the query executes.  This comes with some caveats of course.  If the estimate row counts are inaccurate so will the results of this query.  This might be an indicator of an inherent problem with how a query is structured or it might be an indicator of inaccurate statistics.  Additionally a long blocking (i.e. a lock wait) can throw this off.

To see what query is executing along with the progress we can join the dmv’s sys.dm_exec_requests, sys.dm_exec_query_profiles, and sys.dm_exec_sql_text, and sys.dm_exec_text_query_plan.  The result of this is the following query:

SELECT er.start_time, datediff(ms, er.start_time, getdate()) elapsed_time_ms,
er.session_id, er.wait_type, er.wait_time,
SUBSTRING(st.text, (er.statement_start_offset/2)+1,
((CASE er.statement_end_offset
WHEN -1 THEN DATALENGTH(st.text)
ELSE er.statement_end_offset
END - er.statement_start_offset)/2) + 1) AS statement_text,
CONVERT(XML, qp.query_plan) query_plan,
CASE WHEN sum(eqp.estimate_row_count) > 0 THEN
100 * SUM(eqp.row_count)/SUM(eqp.estimate_row_count)
ELSE
NULL
END AS percent_done
FROM sys.dm_exec_requests er
INNER JOIN sys.dm_exec_query_profiles eqp ON er.session_id = eqp.session_id
AND er.request_id = eqp.request_id
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) st
CROSS APPLY sys.dm_exec_text_query_plan(er.plan_handle, er.statement_start_offset, er.statement_end_offset) qp
WHERE er.session_id > 50 and er.session_id != @@spid
GROUP BY er.start_time, er.session_id, er.wait_type, er.wait_time, st.text,
er.statement_start_offset, er.statement_end_offset, qp.query_plan
HAVING SUM(eqp.estimate_row_count) > 0

Here’s the example output of this query:

Pretty straightforward.  I included the wait_type and wait_time columns.  These come in handy if you are polling and see some long lock waits.  Long log waits might throw this approach off, but some shorter waits such at pageiolatch (IO waits) spread out through the run will to some extent factor into the percent_done estimate correctly.

Here is the output where we have some pretty inaccurate values for percent_done:

This can sometimes be a clue that there might be some tuning opportunities buried in the plan or some pretty inaccurate statistics somewhere.  In the example query I was using one query used #temp tables as opposed to @table variables.  @table variables typically have really bad row estimates so as a result it throws of the calculated percent_done value.

Conclusion

Although it is not perfect, it is definitely more visibility into what a long running query is doing.  Keep in mind the caveats I mentioned, but also make note that if this approach is wildly off it might give you some areas of where to focus your tuning efforts in an execution plan.

 

SQLGrease provides deep dive visibility into where queries are spending time not just with wait events, but by recording query plan statistics down to the busiest nodes in an execution plan.  SQLGrease has free, trial, and pay versions – all which offer full functionality.

 

 

 

One thought on “How to view percent done for a long executing query

Comments are closed.