How to fix RESOURCE_SEMAPHORE waits

RESOURCE_SEMAPHORE waits occur when a query is unable to get enough memory to execute.  In this post I will describe how to see they are occurring.  I will also provide tips on ways to help reduce or eliminate them.

How do I see them occurring?

You can detect they are occurring by checking wait stats.  If they are currently occurring, you can query sys.dm_exec_query_memory_grants.  If you are on SQL 2017 or Azure SQL Database, you can use query store and look for waits in the category of “Memory”.  You can do this by querying sys.query_store_wait_stats.  You can also set up an extended event to detect them and log them to a  file target.  In this example I will take the approach of finding them with sys.dm_exec_query_memory_grants – this should work from SQL 2008 on forward.

By using the following query, you can find which sessions/queries are waiting to get memory, and which sessions/queries are consuming memory.

SELECT er.session_id,
SUBSTRING(st.text, (er.statement_start_offset/2)+1,
((CASE er.statement_end_offset
WHEN -1 THEN DATALENGTH(st.text)
ELSE er.statement_end_offset
END - er.statement_start_offset)/2) + 1) AS statement_text,
er.wait_type, er.wait_time,
mg.requested_memory_kb/1024 AS requested_memory_MB,
mg.granted_memory_kb/1024 AS granted_memory_MB,
CONVERT(XML, qp.query_plan) AS query_plan
FROM sys.dm_exec_requests er CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) st
CROSS APPLY sys.dm_exec_text_query_plan(er.plan_handle, er.statement_start_offset, er.statement_end_offset) qp
INNER JOIN sys.dm_exec_query_memory_grants mg ON er.session_id = mg.session_id
ORDER BY mg.requested_memory_kb DESC

The output of this is listed below.

The output of this query is sorted by the amount of requested_memory_MB descending.  This allows us to see which queries are attempting to consume the most memory.  You can see that sessions 67,55, and 68 are experiencing RESOURCE_SEMAPHORE waits.  This is the result of them waiting to get memory granted to them.  Sessions 62, 63, 65, 58, and 54 have the memory they requested granted to them so are not in RESOURCE_SEMAPHORE waits anymore.  The query requesting the most memory is the same query being executed on sessions 67,55,62, and 63.  This is the query that we should focus on tuning.

How do I fix RESOURCE_SEMAPHORE waits?

Typically RESOURCE_SEMAPHORE waits occur as a result of an inefficient execution plan.  There’s a lot of different things that may cause this.

Inefficient execution plans pop up as the result of a change in an existing query’s plan or they may appear for new queries that need tuning (i.e. adhoc queries, new code releases).

From looking at our example above, we are able to pull up the execution plan of the query requiring a large memory footprint.

I’ll go through some things I typically see that cause a query to consume more memory than it should if it were properly tuned.

Query Store

In the case that an existing query’s plan changed you can look for a plan regression in query store.  Unfortunately this only applies to you if you are on SQL 2016 or later (or have a good performance monitoring tool – just be sure it captures current and previous execution plans).

If you find that the execution plan changed around the time RESOURCE_SEMAPHORE waits appeared, it is likely the cause.  You might want to consider forcing the old execution plan; however, you should continue reading since we will cover some of the reasons the plan might have changed.

Fragmentation

A common thing I see happen is queries stop using perfectly good indexes at some point and start doing scans as a result of fragmentation.  If you had a previously healthy system, and with no change in workload started seeing queries performing large scans despite having proper indexes, I’d suggest checking fragmentation for tables used in the query.  High fragmentation might make the optimizer start scanning and/or taking a parallel plan when not necessary.

MAXDOP

I have seen on many occasions where a query changing execution plans from non parallel to parallel will increase the amount of memory required by a query.  If you see a query that previously took a non parallel plan, and with the appearance of RESOURCE_SEMAPHORE waits is taking a parallel plan, there is a good chance this is the cause of your waits.  Most plans that execute in parallel are doing large scans and sorts.

You might experiment with a MAXDOP hint of 1 to reduce the memory requirements of the query – this may cause the plan to seek on indexes (which generally reduces memory footprint of a query).  You can also experiment with lower MAXDOP settings than the system defined setting.  If you have a small number of queries that are problematic I would address this via MAXDOP hints on the queries rather than changing system wide MAXDOP settings.

Missing Indexes

Look for missing indexes in the execution plan.  If you have large scans occurring as a result of missing indexes, this can lead to a higher memory footprint for the query.

Our example query had a missing index.  After applying the missing index we can see the query now has a much smaller memory foot print (9 MB now vs 183 MB before).  As a result we no longer see RESOURCE_SEMAPHORE waits.

 

Plan Warnings

Check the execution plan for any plan warnings.  Plan warnings appear in the execution plan with an exclamation point.  This can be things such as implicit type conversions, missing join predicates (AKA cartesian joins), or missing statistics.  I won’t go too deep into plan warnings in this post, but you might want to read this past post regarding implicit conversions.

SELECT *

If you select unnecessary columns in a query that are doing large scans and sorts this will contribute to to the memory required by a query.  Object relational mapping tools (ORM’s) tend to do this.  I’m not saying all ORM’s are bad; however, if you have untuned queries they can exaggerate the cost of them.

Parameter Sniffing

On occasion I see execution plan changes as the result of parameter sniffing cause RESOURCE_SEMAPHORE waits.  I won’t go too deep into parameter sniffing since that’s a whole topic in itself and there’s plenty that a quick search would turn up.

Add Memory

There’s a reason I mention this last.  I prefer to tune queries than throw hardware at a problem.  In many cases throwing hardware at a problem doesn’t fix it.  Additionally it costs more and rarely gives you better performance than doing some proper tuning.

Conclusion

RESOURCE_SEMAPHORE waits can kill performance on a  previously healthy system.  I generally consider them to be more of a tuning related issue rather than a hardware issue.  By identifying your largest memory consumers and addressing them you can avoid additional hardware while improving performance and stability.

 

Watch us diagnose a query timeout with SQLGrease:

See how you can get query level wait events without Query Store:

SQLGrease can help you get to the bottom of your RESOURCE_SEMAPHORE waits quickly.  SQLGrease has specific functionality that tracks them at the point they occur.  SQLGrease will also identify the queries and collect execution plans for the highest memory consuming queries whenever RESOURCE_SEMAPHORE waits occur.

3 thoughts on “How to fix RESOURCE_SEMAPHORE waits

  1. I don’t really buy the fragmentation theory.

    The optimiser doesn’t know or care about fragmentation. The only reason why it could affect the optimisation process is indirectly as the page count for an index increases.

    But then I can’t envisage a situation where having an index with a higher page count would cause a scan to be preferred to a seek.

    Are you able to produce a demo of this?

  2. I think fragmentation may cause the query to demand more memory during the time of execution, than initially requested, and it can get into RESOURCE_SEMAPHORE wait.
    I am talking just from the theory, and willing to learn from it.

    1. Hard to say for certain, but it could be an interesting experiment to test a query that requires a memory grant and see what the difference is when a table is highly fragmented vs not fragmented at all. You can do a rebuild to get rid of the fragmentation and test that. Another option is to play with fill factor (set it extremely low) to generate a fragmentation type scenario artificially. Generally where I see excessive memory grant issues it’s related to using the wrong data types, or doing converts to a datatype such as CONVERT(VARCHAR(MAX)…

Comments are closed.