Firstly, the example given here is an extremely poor one because Using IN creates a temp table for every item in the IN list. If each of these subqueries only returns 10 records, this query will create 1000 temp tables. Very inefficient. SQL is not designed to handle IN lists very well, but it IS designed to handle joins. That's why relational databases were created!
Secondly, whenever I see a subquery, all it tells me is that the person doing the coding doesn't know what they're doing. I have NEVER seen a subquery that couldn't be written cleaner and more efficiently with the proper use of joins.
Discussion on:
View:
Show:
I've never heard of the temp table being created for every record returned in a subquery...so Id love to see some documentation that says anything like it. It does create a virtual table to use though. Also, SQL IS designed to handle subqueries just fine...in fact, sometimes it is more effecient to use a subquery than use a join...it is rare, but I've seen it happen. The main hinderance is that when you perform a subquery using the IN operator, a distinct list of values is returned, and DISTINCT does incur costs. Subqueries can be usually written as JOINs, but it is a very bold, and incorrect, statement to say that someone who uses a subquery doesn't know what they are doing.
I didn't say it created a temp table for every record returned by a sub query, I said it creates a tempt table for every element in an IN clause - regardless of where the elements come from. In other words, it's not the sub-query that creates the temp tables, it's the parent query as it's parsing through the IN elements.
Worked hand-in-hand with a Microsoft SQL 7/2000 Implementation team a few years ago while writing County and State property tax collection software. That's word for word what the lead guy said. At the time we were using IN with a list-string, but (according to him), where the IN elements come from doesn't matter. Better to create a temp table and then do a join than use an IN clause with more than a few entries. I would assume that hasn't changed much with the upgrade to 2005 since most of the upgrade involves object granularity and XML implementation, not core engine functionality.
...maybe you were confused as to what he was talking about. When you do use an IN() statement with literals, (or subqueries for that matter), the values returned are used by the optimizer as one large OR statement...you can look at the argument text from the execution plan. I've not found any evidence where a ton of temp tables are created for a simple IN statement, so I really doubt that it occurs.
Could you please post an example of how you might change the given example to improve it using joins?
Do you have benchmark tests to prove your point(s)? If so, please post those results as well.
Thanks.
Do you have benchmark tests to prove your point(s)? If so, please post those results as well.
Thanks.
SELECT
DISTINCT c.LastName,
c.FirstName
FROM ( ( ( Person.Contact c JOIN
HumanResources.Employee e ON
e.ContactID = c.ContactID
) JOIN
Sales.SalesOrderHeader s ON
s.SalesPersonID = e.EmployeeID
) JOIN
Sales.SalesOrderDetail d ON
d.SalesOrderID = s.SalesOrderID
) JOIN
Production.Product p ON
p.ProductID = d.ProductID
WHERE p.ProductNumber LIKE 'FW%'
Both I/O Cost and CPU Cost are lower for this version by about 50%
DISTINCT c.LastName,
c.FirstName
FROM ( ( ( Person.Contact c JOIN
HumanResources.Employee e ON
e.ContactID = c.ContactID
) JOIN
Sales.SalesOrderHeader s ON
s.SalesPersonID = e.EmployeeID
) JOIN
Sales.SalesOrderDetail d ON
d.SalesOrderID = s.SalesOrderID
) JOIN
Production.Product p ON
p.ProductID = d.ProductID
WHERE p.ProductNumber LIKE 'FW%'
Both I/O Cost and CPU Cost are lower for this version by about 50%
I tested the query several times in my PC using the default Adventurework exmaple DB.
The result is the author's query run faster and less I/O and CPU then yours.
I think the IN subquery is not necessary bad.
The result is the author's query run faster and less I/O and CPU then yours.
I think the IN subquery is not necessary bad.
pun intended. I agree with the article author. Of course it depends on the data, number of records, table indexes. In many cases I've reduced query processing time significantly with IN. If you're on a production box, you will have enough meat and magma inside so SQL Svr will clean up after you're done. If you have to worry about temp tables cleanup, if they are indeed created, perhaps you need a little more muscle. Also it is nice to have that flexibility if you need it.
I'd have to say that I am in agreement here. The answer to most T-SQL and SQL Server questions that I come across or pose would tend to be 'it depends'.
I've certainly used the nested-query approach but I'm also aware of the overheads involved in parsing very large datasets that can be returned by sub-queries. So, if I think that a problem warrants a more relational solution, then I'll use JOINS. Common sense I guess in many instances.
In closing I'd have to say to Jeff_D_Programmer that explicitly criticizing someone's programming skills in a forum is not considered to be good netiquette. By all means offer constructive advice but saying that 'all it tells me is that the person doing the coding doesn't know what they're doing' is not very constructive.
YaHozna.
I've certainly used the nested-query approach but I'm also aware of the overheads involved in parsing very large datasets that can be returned by sub-queries. So, if I think that a problem warrants a more relational solution, then I'll use JOINS. Common sense I guess in many instances.
In closing I'd have to say to Jeff_D_Programmer that explicitly criticizing someone's programming skills in a forum is not considered to be good netiquette. By all means offer constructive advice but saying that 'all it tells me is that the person doing the coding doesn't know what they're doing' is not very constructive.
YaHozna.
Jeff_D_Programmer does mention that he knows SQL Server 03 in his member profile, so consider the source.
boo hiss rubbish
move that bit in the IN clause to a JOIN for heavens sake
move that bit in the IN clause to a JOIN for heavens sake
- Keyboard Shortcuts:
- Prev
- Next
- Toggle









































