Are VARCHAR or CHAR Hurting Your Performance Due To Implicit Conversions?

A common issue I’ve seen with code developed in ADO.Net is parameterized SQL performing full scans as a result of implicit conversions (despite appropriate indexing).  This seems to occur most commonly if a database uses VARCHAR/CHAR datatypes as opposed to NVARCHAR/NCHAR datatypes.  I’ll explain later in this posting why VARCHAR and CHAR datatypes seem to be more susceptible to this.

Update: I wrote a future post where I show how to fix this with no code changes.  You can find this post here.

SQL Server will perform an implicit conversion when attempting to compare two datatypes that do not match.  I will use the Adventureworks2012 database in my examples.  For the sake of example, I will create a copy of the sales.Customer table and create some indexes on it.

USE AdventureWorks2012
CREATE TABLE [CustomerTest]
([CustomerID] [int],
[PersonID] [int] NULL,
[StoreID] [int] NULL,
[TerritoryID] [int] NULL,
[AccountNumber]  varchar(10),
[rowguid] [uniqueidentifier] ROWGUIDCOL  NOTNULL,
[ModifiedDate] [datetime] NOTNULL,
PRIMARYKEY([CustomerID]))
INSERT INTO CustomerTest(CustomerID, PersonID, StoreID, TerritoryID, AccountNumber, rowguid, ModifiedDate)
SELECT CustomerID, PersonID, StoreID, TerritoryID, AccountNumber, rowguid, ModifiedDate FROM sales.Customer
CREATE INDEX IX_Customer_AccountNumber ON CustomerTest(AccountNumber)

Now that we have our test table created (CustomerTest) we can run a query against it to show what happens when an implicit conversion does not occur.

SELECT * FROM CustomerTest WHERE AccountNumber = 'AW00000011'

 

You can see that an index seek is performed against the index on AccountNumber.  This is the desired behavior.

If we make a slight modification to the query (specify the AccountNumber to be an N type), the execution plan changes.

SELECT * FROM CustomerTest WHERE AccountNumber = N'AW00000011'

 

By changing the AccountNumber to an N type, an index scan is now occurring despite a perfectly usable index existing on this column.  This is happening because SQL Server must compare like datatypes.  NVARCHAR(10) is not the same as VARCHAR(10), as a result a conversion must occur.  You can see this when you mouseover the Index Scan in the execution plan.

I’m not explicitly setting the N type in my query, so why does this example matter?

In the above scenario we purposely caused a problem by specifying a mismatched data type in the query.  So, what does this have to do with ADO.NET?  By default, ADO.Net infers what data types to pass to parameterized SQL based on the .Net data type.  There is not a one to one correlation between .Net data types and SQL Server data types.  In .Net, VARCHAR, and NVARCHAR values both map to a .Net string data type.  When ADO.Net tries to pass a .Net string to a parameterized query, it makes the assumption that it is passing an N type.  You can see this when running a SQL profiler trace that an N type is passed for @AccountNumber:
exec sp_executesql N'SELECT * FROM CustomerTest WHERE AccountNumber = @AccountNumber',N'@AccountNumber nvarchar(10)',@AccountNumber=N'AW00000011'

As a result, you will find that indexes on VARCHAR or CHAR columns which would otherwise be seeked on are getting full scans instead.

Those damn developers

The wrong data type is a result of not explicitly setting the type.  Below is a typical example of what the .Net code (c#) that does not explicitly set the data type looks like.

SqlCommand command = new SqlCommand()
command.CommandText = "SELECT * FROM CustomerTest WHERE AccountNumber = @AccountNumber";
command.Parameters.Add(new SqlParameter("@AccountNumber","AW00000011"));

In order to fix this, the .Net code must specify the data type to be passed rather than letting ADO.Net infer the data type.  The following example shows what this would look like.

SqlCommand command = new SqlCommand()
SqlParameter parameter = new SqlParameter("@AccountNumber", System.Data.SqlDbType.VarChar);
parameter.Value = "AW00000011";
command.Parameters.Add(parameter);

When running a SQL profiler trace you will now see the proper data type being passed:

exec sp_executesql N'SELECT * FROM CustomerTest WHERE AccountNumber = @AccountNumber',N'@AccountNumber varchar(10)',@AccountNumber='AW00000011'

Where else might this be an issue?

Implicit conversions are not limited to NVARCHAR and CHAR.  These can occur between a variety of different data types.  Below is a table that illustrates this:

Input Data Type Table Data Type Implicit Conversion
NVARCHAR VARCHAR YES
NCHAR CHAR YES
VARCHAR NVARCHAR NO
CHAR NCHAR NO
INT,BIGINT NVARCHAR YES
INT,BIGINT NCHAR YES
INT,BIGINT CHAR YES
INT,BIGINT VARCHAR YES
NVARCHAR INT,BIGINT NO
NCHAR INT,BIGINT NO
VARCHAR INT,BIGINT NO
CHAR INT,BIGINT NO

What kind of impact can this have for me?

As most answers in regards to SQL Server performance, “It depends”.  I’ve seen a database which held customer information for a $30 billion retailer go from 70% cpu usage down to under 5% as a result of fixing their implicit type conversions.  At another client which performed medical billing for over 1000 hospitals, I saw a drop from 70% cpu usage to 40%.  I’ve also seen on multiple occasions where fixing this issue resolved deadlock issues.

Watch us diagnose a query timeout with SQLGrease:

See how you can get query level wait events without Query Store:

 

One thought on “Are VARCHAR or CHAR Hurting Your Performance Due To Implicit Conversions?

Comments are closed.