Data Management

Speed up SQL Server database queries with these tips

If you've worked with SQL Server databases for any amount of time, it's practically a given that you've run into slow running queries. Sometimes the reason for the slow response time is due to the load on the system, and other times it is because the query is not written to perform as efficiently as possible (I find this reason much more common). Learn how you may be able to improve performance by knowing what to look for in your query designs.

If you've worked with SQL Server databases for any amount of time, it's practically a given that you've run into slow running queries. Sometimes the reason for the slow response time is due to the load on the system, and other times it is because the query is not written to perform as efficiently as possible (I find this reason much more common). Learn how you may be able to improve performance by knowing what to look for in your query designs.

Operate on small sets

The fewer rows you are dealing with, the quicker you will find what you want. If the query optimizer sees that the number of rows in the table is sufficiently small, no indexes on that table will be used. The optimizer will realize that it is faster to scan the rows in the table to satisfy the request rather than traverse an index structure to find the rows.

Limit the columns returned

When returning data in your applications, the less data that is returned, the faster the information is transmitted over the network -- this goes for the amount of rows returned, as well as the number of columns returned. This is why I am against using the SELECT * statement, especially in a production environment. In my experience with database administration and development, I have seen very few times that have warranted using a SELECT * statement. The reason for this is twofold. It doesn't make sense to bring back columns that you are not going to be using. The second reason (which I feel is more important) is that using SELECT * can break existing code. Consider the following example.

I have an INSERT statement in my production environment. I use a SELECT * statement as a data source in my INSERT statement. This isn't a big deal because my SourceTable has the same number of columns in it as the DestinationTable.

SELECT INTO DestinationTable

(Fld1, Fld2, Fld3)

SELECT *

FROM SourceTable. 

A business situation arises in which I need to add a field to my SourceTable table.

ALTER TABLE SourceTable
ADD Fld4 INT

Adding this new field will break my INSERT statement, which will cause problems in my production environment.

Searching for rows

The manner in which rows are searched for in a database table will always be one of the more vital implementations in your database environment. The SQL Server query optimizer will operate much more efficiently for some WHERE statements as compared to other WHERE statements based upon how the statement is written even if the outcome of the statements is the same.

The following example uses the IN() statement to specify a series of values being searched for. For this example, assume the OrderID column as a NonClustered index.

SELECT * FROM ProductSales
WHERE OrderID IN(4494, 4495, 4496)

This statement is exaactly the same as using an OR operator to specify the three values being searched for. Either statement will cause SQL Server not to use the index on the field and to cycle through the rows in the table searching for the values. Since the values used in the example are contiguous, I can use the BETWEEN operator instead. This will allow the query optimizer to effectively use the index.

SELECT * FROM ProductSales
WHERE OrderID BETWEEN 4494 AND 4496

In general, most types of exclusion statements in your WHERE clause will cause SQL Server to not be able to use an index. The following are some additional examples:

<>,  !, OR, NOT IN, NOT EXISTS

The manner in which LIKE statements are used also makes a difference. If you are able to specify the beginning character(s) of the statement you are searching for, you will have a better chance of your statement using an index. If you specify a wildcard(%) before any type of search string, the optimizer will be unable to use an index.

SELECT * FROM Customers WHERE LastName LIKE '%TR%'

Date searches

Date searches are sometimes a little tricky to perform in the database; for instance, I have seen numerous situations where date functions are used for date searches. In the following example, the sales records are being retrieved from the SalesHistorry table for August 15, 2005:

SELECT SaleID

FROM SalesHistory

WHERE

     MONTH(SaleDate) = 8 AND

     YEAR(SaleDate) = 2005 AND

     DAY(SaleDate) = 15

The functions in the WHERE clause will cause SQL Server to perform the function on every row being searched, which means the index will not be used. This is why I discourage the returns values from functions to be used as criteria in queries. The following code shows how you can rewrite the statement so that an index is used, and the results are returned in a much quicker fashion.

SELECT SaleID

FROM SalesHistory

WHERE

     SaleDate >= '8/15/2005' AND

     SaleDate < ‘8/16/2005'

Next time

In my next column, I plan to look at some of the more advanced techniques that you can use to optimize queries.

Tim Chapman is a SQL Server database administrator who works for a bank in Louisville, KY, and has more than 7 years of IT experience. He is also Microsoft certified in SQL Server 2000 and SQL Server 2005. If you would like to contact Tim, please e-mail him at chapman.tim@gmail.com.

-----------------------------------------------------------------------------------------

Get SQL tips in your inbox

TechRepublic's free SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system. Automatically subscribe today!

About

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.

44 comments
tesfayenator
tesfayenator

Retrieve the name of the employee whose firstname ends with ???E???.Don???t use the ???%??? wild character. if you want to contact me use this email tesfayenator@gmail.com

jimmy_ho
jimmy_ho

In your article, you wrote: SELECT INTO DestinationTable (Fld1, Fld2, Fld3) SELECT * FROM SourceTable. Did you mean: INSERT INTO DestinationTable (Fld1, Fld2, Fld3) SELECT * FROM SourceTable. If that was only a typo, please also correct the following one as well: This statement is exaactly the same as ... Other than these typos, your article is very useful and practical for programmers, even for old-timers like me who have over 20 years of experience in xBASE programming since the era of dBASE II on Apple CP/M.

jimmy_ho
jimmy_ho

In your article, you wrote: SELECT INTO DestinationTable (Fld1, Fld2, Fld3) SELECT * FROM SourceTable. Did you mean: INSERT INTO DestinationTable (Fld1, Fld2, Fld3) SELECT * FROM SourceTable. If that was a typo, please also correct the following typo as well. This is a very practical piece of article and I hope this to be 100% perfect. Thank you for your effort in writing this up.

zivalich
zivalich

Tim, I like the article and gives me some ideas. I have a question about the the last example. Would using between be as efficient or more than using the greater than and less than operators ? Example : SaleDate between '8/15/2005' AND ???8/16/2005'

neal_work
neal_work

I often have to query a large database remotely, and the Date is a crucial part. However, the dates are expressed in Uk format, so the queries with day numbers less than 13 can be ambiguous e.g. is WHERE SaleDate >= '11/8/2005' AND SaleDate < ???12/8/2005' Looking at 11th August, or half of November and December? I have often looked around for guidance on what formats can be used for date literals in SQL, but no luck so far. Can anyone guide me to a comprehensive list, or suggest a 'universal' format (like "2005/08/11")?

shel.cruz
shel.cruz

very informative, keep it up!

lwd
lwd

Thanks, Tim. I hate to say this but in my years of searching for more detailed information about indexing, this is one of the few best nuggets I've found on the subject. I've done lots (and lots and lots....) of comparison tests on my own and have reviewed lots of execution plans and have slowly come to find what does and does not work for me but your article spelled out some specifics that would have been nice to know a few years ago. I've looked at many books trying to find this kind of information on indexes and usually they have some tripe like, "duh, you should used indexes...." No kidding? Thanks for the advice.... Of course, maybe my problem is that I've looked at the wrong books. But seriously, this little article does actually help me. Thanks. Doug

akind
akind

Great article. Thanks for explaining the various ways in which an index can be ignored in a query. I just want to clarify something that looked to me like a typo. You wrote: "In my experience with database administration and development, I have seen very few times that have warranted using a SELECT * statement. The reason for this is twofold. It doesn't make sense to bring back rows that you are not going to be using." Did you mean to say "It doesn't make sense to bring back COLUMNS that you are not going to be using."? Thanks again.

BOUND4DOOM
BOUND4DOOM

Quesiton on using the Select * method. Say you have a stored proc so it is sitting on the server. I have tested this multiple times and multiple ways and not seen any performance difference. What is your experience with something like this. IF EXISTS(Select * From YourTable Where Something = Something) Begin DoSomething End Now I have tried just returning a signle column and as long as I am using the Where clause properly this doesn't seem to matter. Now like you though I never use Select * anywhere else.

cheryl.a.dixon1
cheryl.a.dixon1

He did a great job of telling you what to do and what not to do (if it can be avoided) for the best query optimization. I really liked the simple examples. Thanks!

GunnyMike
GunnyMike

Thanks for pointing out what is not so obvious. I didn't realize that the IN clause causes SQL to not use the index... that is very interesting. We are a small web design company and most of the databases we design and host are on the small side. However, we do have a few that are getting quite large and this is good to know. I know we have used the IN clause quite a bit. What are your thoughts on this... (for large databases) Step 1. Create a temp table out of the list of Id's that make up the IN clause Step 2. Use an inner join on the temp table to bring back the results. Thanks, Michael Riley Webfodder, Inc.

awf
awf

That datetime fields are stored as a single number, if I remember right. The "index" your article refers to. SQL server just displays in date/time format. It doesn't store it that way. I've been doing the >'startdate' AND < 'enddate' for years. I have read a lot of contradictory information concerning speed when using JOINS vs. "WHERE x = y"ing everything in the WHERE clause (if that technique has a name, I've never learned it). All of my joins are on well-indexed values. So which is faster, all things being equal (pardon syntax issues): SELECT field FROM tblA inner join tblB on field WHERE field > 10 or SELECT field FROM tblA, tblB WHERE field > 10 AND tblA.idx = tblB.idx BTW, nice to see a Louisvillian here...live in etown and work in Jeff myself (when not telecommuting).

thisisfutile
thisisfutile

Wow, another great article Tim. I'm probably at the perfect point in my IT career to be reading this because it all "sank in" and now it's brought out some questions: 1) An exclusion operator (for example: ) might cause the query to not use an index...is that assuming the column has multiple values. What if a column contains a constraint of 'Y' or 'N' and I do a search WHERE col = Y and then another search WHERE col N. I'd assume my speed results would be the same, but does the presence of the "not equal to" operator make it take longer? 2) Using functions in the WHERE clause means they execute for every row? Wow, that changes everything. I just assumed that during a query, the WHERE clause would execute any functions once and hold the result until the end of the query. The example you used with dates and date functions is a good one because nearly every query I write, if I need to check against a date, I use a self-created date conversion function to ensure that if it's a datetime field, I'm stripping the time. I shall rethink this in the future (and probably restructure some existing functions and procedures). With all this talk about speed, I'd really like a summary of the optimizer functions in Query Analyzer. You may already have plans for this, but could you briefly discuss the "Show Execution Plan", "Show Server Trace" and "Show Client Stats" options in one of your future posts. I"ve looked at them once or twice but just don't have the time to dig into their true meanings. Even just a brief summary of how to compare two query results would be great!

sarki32
sarki32

I have all my lyrics in a database. It's a huge database witch is working slow, because of the large amount of rows >150.000 in db. So that's why i made static pages for my site, even if i take more space from HDD. But now i want to implement a search engine based on my lyrics database. My question is how to make a php search engine witch will run fast enough for my site Many thanks Message was edited by: beth.blakely@...

chapman.tim
chapman.tim

The two are actually about the same in terms of index usage and performance.

jeffrob
jeffrob

I too was wondering if 'between' is less efficient.

Tony Hopkinson
Tony Hopkinson

YYYY-MM-DD with literal dates in SQL Server queries. Saves a lot of heartache. Oposite problem to yourself, usually caused by some twit in the UK with their date time set to US. I've got tripped up a few times with auto conversions using the local environment, now I try to avoid it as all costs using explicit conversions where ever necessary.

unclebiguns
unclebiguns

Neal, To the best of my knowledge, SQL Server interprets the date based on the Language setting. So in your example, the dates are August 11 to August 12 2005. You could issue the Set DateFormat command as part of your query to force it to be any supported format. For example, Set DateFormat MDY Select x from y where SaleDate = '11/8/2005' now considers this date to be november 8 2005. From SQL Server 2005 BOL: "The advantage in using the ISO 8601 format is that it is an international standard. Also, datetime values that are specified by using this format are unambiguous. Also, this format is not affected by the SET DATEFORMAT or SET LANGUAGE settings." You can find this by looking for Using Date and Time Data.

Ivy Clark
Ivy Clark

Yup, 'Use indexes', duh. And now we know it doesn't always work and why! Thanks for sharing! Cheers!

chapman.tim
chapman.tim

Misread it the first time...it is supposed to say COLUMNS, not rows. I'll have it corrected.

chapman.tim
chapman.tim

In the article I am looking at, it reads "The reason for this is twofold. It doesn?t make sense to bring back rows that you are not going to be using. "

chapman.tim
chapman.tim

It probably doesn't matter there because it is inside an EXISTS() clause..which means that it will return true for the first row it returns...ie. It doesn't really return any columns. Your statement is the same as IF EXISTS(Select 1 From YourTable Where Something = Something). I never use a SELECT * to return data, or INSERT data.

chapman.tim
chapman.tim

Its difficult to say without looking at the code. Its always good to test different methods for the same query to see what turns out to be faster, etc. A temp table might be a good place to start for your IN() statements. It may/may not make if faster...it might be worth investigating.

Gast?n Nusimovich
Gast?n Nusimovich

When trying to solve an query, its better to test more than one approach and compare performance measures (like comparing the results of execution plans of all the alternatives). If you compile you script into a stored procedure, recompile the SP every now and then, to allow for changes in the dynamic execution context of the database. In relation to alternatives to the IN clause, you can try both the temp table approach and a subquery approach, and then compare performance metrics of the approaches.

chapman.tim
chapman.tim

It does store the date and time together for a DATETIME data type...the first 4 bytes are the number of days before or after the base date (Jan 1, 1900), and the 2nd 4 bytes store the time of day as milliseconds after midnight. As for your question regarding the two join statements: they are exactly the same. SQL Server is smart enough to convert them to be the same thing. In actuality, I believe that the JOIN statement is more or less syntactic sugar...it works the exact same as using a WHERE clause. Where do you work in Jeffersonville? Drop me an email.

chapman.tim
chapman.tim

Thanks, Im glad the article helped. Here are some answers to your questions: 1. Most likely, an index isn't really going to help you on a column that isn't highly selective (such as only having values of Y or N). You will likely incur a table scan either way. On other fields that are highly selective, any time you use a construct that excludes data, the index will likely not be used. The reason is that it will have to search every row to make sure the value in that row isn't the value. For instance, it is much easier to search for all values from 1 to 100 that are evenly divisible by 10. It is much harder to find the ones that are not evenly divisible by 10. 2. All the documentation from MS, and all of the experiences I have had show that functions in the where clause will execute for every row. The reason is that you can pass table values into the functions if necessary. If you have one constant value assigned by a function, store it in a variable and use that variable in the WHERE clause. In my next article I plan on focusing on more advanced query tuning. That article will include STATISTICS IO. I haven't decided yet, if I will include execution plans in that article. I may save that for a future article.

ravishakya
ravishakya

I believe you could use FullText indexing and search feature offered by most of the RDBMS.

jeffrob
jeffrob

Now I can use 'between' without wondering whether I'm unecessarily having a negative impact.

neal_work
neal_work

That YY-MM-DD format should suit me fine. Thanks

jeffrob
jeffrob

Great tip. Thanks! I'm relatively new to SQL Server (a year or so), although I've worked for several years on Oracle databases, so this whole discussion thread has yielded several gold nuggets. One thing that I've found to cause a lot of misunderstanding for people querying the database in our company is the fact that if only a date is specified, the time defaults to 12:00 A.M. (Midnight). Of course, meaning that if I just use '2007-09-14' I won't get any data except records with a timestamp of 2007-09-14 00:00:00.000 and not from midnight on, which is what they actually wanted. Most don't know this and I often get a complaint that data isn't correct when the problem was the exclusion of the time in the date format. To get the full day's data they would need to use something like: between '2007-09-14 00:00:00' and '2007-09-14 23:59:59' Is there any way around that? We have hundreds of people using various databases and I have no idea who they all are, so educating the populace is not a real option.

akind
akind

The syntax of a SELECT statement is SELECT column_name(s) FROM table_name so if you say SELECT * you are bringing back all the COLUMNS, not all the ROWS.

alaniane
alaniane

on SELECT *. I always want to know exactly what I am bringing back without having to go to the datatable each time. The only time I use SELECT * is when I am debugging a query and the temp table has numerous columns. I don't want to have to type out all of the columns in the select list for something that I am going to eventually remove from the proc or script. It was interesting knowing that indexes are not used when there is an IN or exclusion operator. I will have to re-check some of the procs to see if they can be tweaked by changing WHERE clause.

BOUND4DOOM
BOUND4DOOM

I kind of figured that was what it was doing, or at least hoped, however I could never find any docs on what it was really doing. But since I could never get a perf increase or decrease I could only assume, and this thread seemed like the perfect place and oportunity to ask that question. I would imagine depending on the database there is an option to override the behavior but I have never ran into that as well. A lot of programming languages are set up this way. For Example in C# say int he example below Both Method returned true if(DoMethod1() || DoMethod2()) {} In that example Method 1 would fire and eveluate to true so method 2 would never fire. if(DoMethod1() | DoMethod2()) {} Would actually still force both methods to fire then evaluate to true. I just never knew if SQL did that or not.

rapp000
rapp000

i used to write IF EXISTS(Select * From TableName where condition = criteria) to know wether i should insert or to update a row in a table, but then i just put the update into a stored procedure like this: update TableName set Field1 = @Field1 where ID = @ID IF @@ROWCOUNT = 0 INSERT INTO TableName(Field1) VALUES (@Field1)

r937
r937

try doing an outer join without using JOIN syntax ;o) JOIN syntax is more than sugar -- it can help you think properly, and is certainly easier to understand when you have to pick up and modify someone else's query

t3knotic
t3knotic

I had a similar problem. I started with a database with a few thousand rows, and it is now currently at over 20million. I want a full text search on that table. I am using MySQL, and installed a 3rd party index engine called Sphinx. It is free, open source. It queries all 20million rows in less then 1/10th of a second. Let me know if I can help you in any way

jeffrob
jeffrob

Thanks for the tip about using ISO format. We are an international company and the the SET LANGUAGE and SET DATE FORMAT is a consideration.

jeffrob
jeffrob

Thanks, Tim. I'll try to make sure everybody gets the word. May God bless you in your work.

aksagar
aksagar

You can use ISO format. Or I should say you should use ISO format because it is the safest format. It is not effected by SET DATE FORMAT or SET LANGUAGE settings e.g. >= '20070914' AND < '20070915' OR >= convert(datetime, '9/14/2007', 112) AND < convert(datetime, '9/15/2007', 112)

chapman.tim
chapman.tim

You could try >= '9/14/2007 AND < '9/15/2007'

chapman.tim
chapman.tim

Sure, below is how you do outer joins without using the JOIN syntax. Although, I am going to be honest, I don't know how to do a full outer join in this style, but it can probably be done. When I said "syntactic sugar", I meant that it does make it easier to program joins with them...which I imagine it is one of the reasons it became an ANSI standard. I only use JOIN operators...but in the end, they ar e the same as using a WHERE clause. When you said "JOIN syntax is more than sugar -- it can help you think properly, and is certainly easier to understand when you have to pick up and modify someone else's query "....that is exactly what syntactic sugar is....it makes life easier. :) (LEFT JOIN) SELECT Fld1, Fld2 FROM Table1, Table2 WHERE Table1.Fld1 *= Table2.Fld1 (RIGHT JOIN) (LEFT JOIN) SELECT Fld1, Fld2 FROM Table1, Table2 WHERE Table1.Fld1 =* Table2.Fld1