If you’re looking for your worst performers by querying sys.dm_exec_query_stats, be aware queries that are timing out (i.e. “Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding”) do not factor into the statistics shown in sys.dm_exec_query_stats. I’ll show an example of this.
Example
For this example I will update a record which will cause another session to block. This update is wrapped in a begin transaction, but does not have a commit (we refer to these as developers with commitment problems).
begin transaction
update CustomerTest set ModifiedDate = getdate() where customerid = 1
Next I’ll clear the plan cache with DBCC FREEPROCCACHE. This makes sure we aren’t picking up any previously existing stats.
Next I’ll execute a statement that should hang (as a result of the update we performed above). This is executed on session 52.
update CustomerTest set ModifiedDate = getdate() where customerid = 1
From querying sys.dm_exec_requests for the hung session_id (52), we’re able to retrieve the sql_handle. We can use this sql_handle to query sys.dm_exec_query_stats.
As you can see there are no entries for this query/sql_handle in sys.dm_exec_query_stats. Note, the query is still executing at this point.
Now I will stop the executing query in SSMS (session 52). This has a similar effect to generating an “attention” event as a command timeout. We will check again for the sql_handle in sys.dm_exec_query_stats.
Still no entry exists. This proves that a query that times out does not get their runtime stats accounted for. Next I will perform the same test where I execute the query once successfully, and subsequently perform an execution that gets hung and I cancel.
Here’s the results after the successful execution:
Since the query did complete successfully, the runtime stats are accounted for in sys.dm_exec_query_stats. Here’s the results after we had a hung execution which we canceled:
Still no change from the prior execution. This proves that queries experiencing timeouts are not accounted for in sys.dm_exec_query_stats.
Conclusion
In real life situations, when a query goes south on a busy system to the extent that it is timing out, I do tend to see some increase in response times in dm_exec_query_stats. This generally happens as there is a brief period that the query is still completing prior to timeouts occurring. The true severity of the problem usually doesn’t fully surface as a result of what was described in this post.
Another great post.. Really, yours are uniformly excellent, and I always look forward to seeing a new one in my feed.
That said, I have a couple of quibbles with this one.
First, you say that doing the freeproccache will “makes sure we aren’t picking up any previously existing stats”. That’s not so. The next execution will indeed use stats, but that command ensures it won’t use any query plan that had been cached before the command was run.
Second, the poimt you make about this dmv not reflecting long-running requests that timeout is really valuable. Again, thanks for making it. But you don’t show how one might find THAT there have been such long-running requests, which can be helpful to readers trying to track down such issues.
Have you all done any posts about that? There are different ways people have used over the years, each with pros and cons, or whose availability may vary based on the sql server version.
If you want to see a query that timed out you can take a look at this post – https://blog.sqlgrease.com/how-to-detect-query-timeout-errors-with-extended-events/ however, this does not record runtime info (i.e. logical IO’s, cpu, etc) for the query. If you have query store enabled I would think that might give you some runtime info I need to validate that though. Your other option is third party monitoring tools (ahem SQLGrease). Thanks for kind comments.
Hi Charles, when I said stats I meant data in sys.dm_exec_query_stats (not the statistics used for generating an execution plan). If you look in sys.dm_exec_query_stats it contains cumulative runtime statistics that are up to date as of the creation_time. The reason I flushed the proc cache was to set the stats back to 0.
I’ve also found that the “execution based” DMVs can hide important things like orphaned connections with opened transactions. These processes can literally block other processes for days and never show up in the “execution based” DMVs because the aforementioned processes are not actually executing – they are waiting on a command.
I’ve found the best way to get a clear picture of system activity (without taking the sledgehammer approach of tracing or extended events) is to poll the “connection based” and “session based” DMVs and outer join the execution metrics.
Something to the effect of:
SELECT
…
FROM
sys.dm_exec_sessions as Sess (NOLOCK)
INNER JOIN sys.dm_exec_connections as Conn (NOLOCK)
ON Conn.session_id = Cast(Sess.session_id as Int)
LEFT OUTER JOIN sys.dm_exec_requests as Req (NOLOCK)
INNER JOIN sys.dm_os_tasks as Tsk (NOLOCK)
ON Req.session_id = Tsk.session_id
AND Req.request_id = Tsk.request_id
ON Sess.session_id = Req.session_id
LEFT OUTER JOIN sys.dm_exec_query_stats as execStat (NOLOCK)
ON execStat.sql_handle = Conn.most_recent_sql_handle
…