Are Connection Leaks Causing You Timeouts in SQL Server?

Do you ever have .Net applications reporting they are getting connection timeouts to SQL Server:

Message=Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.

This usually will occur after a 15 second hang.  There are three scenarios that I have seen cause this error:

  1. Connection leaks
  2. Spikes in query response times causing connection pools to fill
  3. External memory pressure on SQL Server causing it to page out

Most of the time I’ve seen these errors occur as a result of connection leaks in the application.  If you suspect you might be having this issue, you might want to run the following:

select count(1) conection_count, program_name, host_name, status, host_process_id
from sys.dm_exec_sessions where is_user_process = 1
group by program_name, host_name, status, host_process_id order by 1 desc
This will return a list of connections being used.  You can see that there are 100 sleeping connections coming from a .Net application.  The value of 100 has some significance – by default the ADO.Net driver sets a max pool size to 100 connections per process.  If you see a particular process (i.e. host_process_id 5172) with 100 total connections, and a majority of them are sleeping there’s a possibility it might have a leak – you might have a leak with even less than 100 connections though.  Keep in mind, this query returns a list of host_names that should probably be checked – it is not definitive at this point that you have a leak.

To validate for certain if you have connection leaks, you should log in to the application server(s) and launch perfmon to view the .Net Data Provider For SqlServer/NumberOfReclaimedConnections This counter is cumulative since the process started.  If you see a value greater than 0 for any process, this indicates you have a connection leak. In the chart above you can see the number of reclaimed connections is steadily climbing.  This indicates that the .Net garbage collector is closing connections rather than the application doing this in the code properly.  The result of this can be connection errors as noted at the beginning of this post, or it can be sporadic hangs (under 15 seconds) when opening connections.  The severity of this depends on how busy your application is and the frequency of the leaked connections.

 

Watch us diagnose a query timeout with SQLGrease:

See how you can get query level wait events without Query Store:

Free you say?  How can that be?