Sunday 19 December 2010

Count(*) Confusion

There still seems to be an amount of confusion regarding the use of COUNT(1) and COUNT(*). Since SQL Server 2000, query optimizer has produced identical query plans for both of these queries, but many people still seem to think that COUNT(*) is more expensive.
To settle the debate once and for all, I issued the following query against the Adventureworks database...
 
SELECT COUNT(*) FROM Person.Contact
...and it unsurprisingly produced the following query plan...
                      ...<DefinedValue>
                        <ColumnReference Column="Expr1004" />
                        <ScalarOperator ScalarString="Count(*)">
                          <Aggregate AggType="countstar" Distinct="0" />
                        </ScalarOperator>
                      </DefinedValue>...
                   
...I then issue this query against Adventureworks
SELECT COUNT(1) FROM Person.Contact
And this is the plan it produced...
                  ... <DefinedValue>
                        <ColumnReference Column="Expr1004" />
                        <ScalarOperator ScalarString="Count(*)">
                          <Aggregate AggType="countstar" Distinct="0" />
                        </ScalarOperator>
                      </DefinedValue>...
...Now, the sceptics amongst you, are probably thinking, "Oh yes, very good, but surly optimizer just re-used the original plan, because it was good enough..." and that would be a logical assumption, so to test this, I ran the following...
 
DBCC FREEPROCCACHE
...This command flushes the query cache, without the need to re-start the Instance. So at this point, SQL Server was not holding any cached plans, and hence it would HAVE TO create a new plan for the next query, and would obviously choose the plan that it found optimal. So now I ran the query again...


SELECT COUNT(1) FROM Person.Contact

 
And once again, the following plan was used...

                      ...<DefinedValue>
                        <ColumnReference Column="Expr1004" />
                        <ScalarOperator ScalarString="Count(*)">
                          <Aggregate AggType="countstar" Distinct="0" />
                        </ScalarOperator>
                      </DefinedValue>...

...The reason for this happening, is because if you use a constant, as a value in the expression of a COUNT(Exp), optimizer is intelligent enough to know that you are actually doing a COUNT(*), and uses the same plan...So despite popular belief, there will never be any difference in cost between a COUNT(1) and a COUNT(*).

No comments:

Post a Comment