Parameter Sniffing & Stored Procedures Execution Plan
This post is an attempt to explain what parameter sniffing is all about and how it affects the performance of a stored procedure. I would be walking through an example to demonstrate what the effect of parameter sniffing is. I will be showing the query execution plan generated for SQL Server 2005. To my knowledge, there is no change, in how this process works, between the two versions and you should be able to relate it to SQL Server 2000 as well.
According to the white paper, Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005 published in the Microsoft Site:
"Parameter sniffing" refers to a process whereby SQL Server's execution environment "sniffs" the current parameter values during compilation or recompilation, and passes it along to the query optimizer so that they can be used to generate potentially faster query execution plans. The word "current" refers to the parameter values present in the statement call that caused a compilation or a recompilation.
Before I go ahead with the example, you need to understand that the query execution plan generated by the query optimizer depends on a lot of factors and parameter sniffing is just one of them. So the execution plans I show here might not be the execution plan you get if you run the same query on your server.
Lets consider the Orders table in the Northwind database.
Say, I create a stored procedure with the definition as:
create procedure GetOrderForCustomers(@CustID varchar(20))
as
begin
select * from orders
where customerid = @CustID
end
Remember, that the query execution plan is not generated when you create the procedure. It gets created and cached the first time you run it.
First lets look at the distribution of the number of orders for each customer
Running the following query in the Northwind database:
select top 100 percent customerid,count(*) as OrderCount from orders
group by customerid order by count(*)
We get the number of orders placed by each customer (Only partial result shown)

The first and the last customer are the ones we are interested in:
CENTC has OrderCount (min) = 1
SAVEA has OrderCount (max) = 31
Lets say, I want to find all the orders for CustomerID = 'SAVEA' by calling the stored procedure that we created before.
exec GetOrderForCustomers 'SAVEA'
Since this is the first time the stored procedure is called, it will create an optimized query execution plan and execute it.
Now, the stored proc will return me 31 rows. Let's look at the query execution plan.
Two values in the Clustered Index Scan information are of interest to us.
- Actual Number of Rows 31
- Estimated Number of Rows 31.0747
How did the optimizer correctly estimate the actual number of rows?
It's because of what we call "parameter sniffing". The optimizer created the plan knowing the fact that it was going to get the information for the customerID 'SAVEA' and hence retrieve 31 rows.
Then how did the optimizer know that 'SAVEA' has 31 orders?
SQL Server internally maintains the statistics and distribution of the values in the columns used for filtering. Which, inturn, is nothing but the information in the result of this query (which we used above)
select top 100 percent customerid,count(*) as OrderCount
from orders group by customerid
order by count(*)
Then, what is the problem with parameter sniffing, if its helping the query optimizer do better optimization?
Check out the query execution plan, again, for a different input (CustomerID = 'CENTC'):
exec GetOrderForCustomers 'CENTC'
Did you notice that the estimated number of rows is still 31 when the actual number of rows is 1? This plan was optimized for retriving 31 rows (in the first run) and the plan stays in the cache for reuse till server is restarted or the procedure is recompiled or if the proc cache is removed because of any other reason. And so, we need to understand that this plan may or may not work with the same efficiency for retrieving 1 row.
Now, lets see what is the efficient plan (in my server) for retrieving the orders for 'CENTC'. For that I need to clear the current execution plan. Let's drop and recreate the stored procedure so that the plan cache is removed.
drop proc GetOrderForCustomers
go
create procedure GetOrderForCustomers(@CustID varchar(20))
as
begin
select * from orders
where customerid = @CustID
end
Now that the plan cache is cleared, a fresh query execution plan will be generated for the input I give in the following procedure call.
exec GetOrderForCustomers 'CENTC'
Let's look at the query execution plan now.
Now, do you notice that the plan has changed and the estimates also have changed
- Actual Number of Rows 1
- Estimated Number of Rows 1.00241
If you call the procedure again with 'SAVEA' as the input, you will see that the plan will be the same and estimated number of rows will still be 1 though the actual rowcount will be 31.
So, either way it all depends on which input I give the first time I call the stored procedure.
But, It's impossible to keep track of when the plan was created and what was the input it used. In that case, we have an option here to disable the dependency on the input parameter.
We can create local variables and assign the input parameter to the local variables and use the local variables in the query. And in that case, since we didn't use the procedural parameter directly in the query, it will generate a generic query execution plan.
The same stored procedure can be written this way to avoid parameter sniffing:
create procedure GetOrderForCustomersWithoutPS(@CustID varchar(20))
as
begin
declare @LocCustID varchar(20)
set @LocCustID = @CustID
select * from orders
where customerid = @LocCustID
end
Now, when you run the stored procedure with the input as 'SAVEA'
exec GetOrderForCustomersWithoutPS 'SAVEA'
This is the query execution plan that I get.
Here you see that the actual number of rows is 31 whereas the estimated number of rows is not 31, but its close to 9. Its clear in this case that the query optimizer did not use the parameter value for generating the query execution plan. Then, from where did it get the value 9?
You may guess, just like I did. We saw the distribution of the number of orders for each customer. Now, lets find the average number of orders per customer.This query finds the average of ordercount of all customers we got from the previous query:
select avg(OrderCount) from
(select top 100 percent customerid,count(*) as OrderCount
from orders
group by customerid
order by count(*)) b
Or it can be simply written this way:
select 1.0*count(*)/count(distinct customerid) from orders
You will see that the output is:
Well, this is the estimated rowcount that we saw in the execution plan.If you look at it, for the current data, 67 out of 89 distinct customers have ordercount between 4 (9-5) and 14 (9+5) orders. So, execution plan generated should work good for the majority of the customers. Hence, disabling parameter sniffing is a good choice in this case.
A few points to note:
- Parameter sniffing can be enabled or disabled at the stored procedure level.
- We need not use local variables (to disable parameter sniffing) if the amount of data you will retrieve from the table is evenly distributed for all values of the filtered column (example, search by primary key or unique key)
- Parameter sniffing can be disabled (by using local variables) if we see a bell curve distribution in the number of rows retrieved for the filtered column.


21 comments:
Very interesting article.
One wonders though why this doesn't happen more often, as most queries tend to return resultsets of vastly different sizes. Yet the parameter sniffing problem seems to only occur in certain types of (complex) queries.
In any event, thanks for the solid info which allowed us to rapidly solve a performance problem in one of our production system.
Henri
Very nice write-up - thanks :)
Thank-you very much!
This fixed a performance problem in a recursive function, where the plan seemed to get progressively worse the deeper the recursion went, and RECOMPILE didn't work.
Phew!
Thanks, this was really driving me to distraction.
I simply couldn't work out why a stored procedure would take an hour and a half and the same code in a batch completed in a minute.
It really made that much difference.
Keep up the good work.
Oh~my god !!
you are my god because you slove my problem.
A fews days ago I helped my friend resolve a query of similar nature. In a batch the query took about a minute. When he turned it into a stored procedure it run for about 9hours and going...I was able to re-write the query using a derived table instead of the normal join and it run fine.
Here is the nature of his query: It the takes a string of comma separated values. Parses the strings and puts into a temporary table. Then uses the temporary table to join with some other tables.
Your post makes a lot of sense!
Thanks, Kuyi.
Ultimate
Really good information with full of details.
thank you very much...
keep going with good articles.
Fantastic little trick that sped up my stored procedure from 20 minutes to 2 seconds. Thanks!
Now, I'm a bit confused as to when it's best to use Local variables and when not to.
I've always followed the recommendation from Microsoft to avoid the use of Local variables to achieve high performance stored procedures based on the following article:
http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx
and this one:
http://www.sqlprof.com/blogs/sqlserver/archive/2008/03/29/avoid-using-local-variables-in-your-select-statements.aspx
Any comments suggestions would be welcome.
Thanks,
Phil
Phil,
If you are looking for a straight yes or no.... :)
What the technet poster suggests is that using an OPTION RECOMPILE is better than using a local variable, which is fine as long as coming up with a new execution plan with the current input parameters doesn't add too much of an overhead.
RECOMPILE option is definitely something you can consider which I have missed in my post.
But I still stand by my suggestions for the specific cases I have stated.
-Omni
Great article. Crazy how something you wrote so long ago is still so relevant. I'm smarter for having read it. I will now subscribe to your blog.
You are the man!
This is very helpful. Thanks a million.
This is nice information... thanks a lot!
NICE, thank you very much, I got a store procedure, that takes 14:17 minutes, and with the change, it now takes 30 seconds!!!
Very educative piece...A developer at work pointed me to this blog and I'm smarter now because I read it. The little things never cease to matter...Good job on this.
Awesome ! Thank you for a great article and information.
Saved my proverbial "arse"
Thank you
That'd too good ..
Clear explanation ... Even i amnew to SQL.. i can understand ... please Continue ... writing such articles .
Thanks for detailing "parameter sniffing". It help me debug a slow stored procedure, which had an XML input parameter.
http://www.oaktreesw.net
Very helpfull article!
PingBack:
http://madebysql.blogspot.com/2009/07/execution-time-difference-between-sp.html
Post a Comment