Here come some tips that you came make use of when it comes to analyze the performance of a specific query, one of the best methods is to view the query execution plan.
Click here to view the page
I have download the same page and maintained it in a doc for my future reference, you can download the same document from here.
Nimish Tandon.
Tips, tricks, tuning and facts that can help to increase the overall performance of your application. The posts will be more targeted towards SQL Server 2000, SQL Server 2005, ASP.Net 1.0, ASP.Net 2.0 and ASP.Net 3.5
Saturday, 20 December 2008
SQL Server Query Execution Plan Analysis
Labels:
Execution Plan Analysis,
Query,
sql server
Sunday, 7 December 2008
SQL Server 2000 procedure non stop execution issue and tips
Recently we had an odd issue with one of our SQL Server 2000 stored procedure. We applied few changes in our database, the changes includes creating new procedures, dropping and recreating the same procedures, deleting and creating new indexes and deleting records from our database. All these updates were bundled in a Hermes package and then executed on our database server.
For those who are not aware of what Hermes package is, I can write another post to explain it in detail. But as of what I can say is that its executing all queries in one go by putting them in a file and can be triggered from a command prompt.(well command prompt is one of the options) and then you get a log file which gives you complete details of the execution.
Now the issue was after running the Hermes package, one of our procedure started having an issue, it was like whenever we execute the procedure even with the parameters which will return Zero results, the procedure just kept on running. There were no Physical I/O made(checked through Enterprise manager -> Current Activity) but the procedure used to keep on running, consuming more and more CPU time. This was the problem, whenever a user tries to run a report which uses this procedure, a new database activity was triggered which actually never ends and when these number of connections were left open for a good amount of time the server used to stop responding or a long delay can be seen in processing each request.
It was quit strange as the rest of the procedures, even the new one's were running fine.
After brain storming with our DBA's we found that the table from where we were deleting the records were being used in our procedure. Now i didn't mentioned it that the procedure only contains a Select query with multiple joins. No update, no delete, just a select.
The table was containing approx 52 million records, and we were deleting approx 47million records from it. I thought that if we are deleting many records from tables then its good if we recreated our indexes as it helps for new index pages to be created and hence will boost the performance. Though we included the script of recreating the indexes in our package it was not helping, our procedure were still having the same issue. Then we found out this UPDATE STATISTICS.
Updating statistics of the tables helped us to resolve the issue. To know more about it i will recommend you to go through the link above.
I am an asp.net programmer and I am reponsible more to optimize my code, as DBA's are there to do their job. But I strongly suggest that we (techie's) should have knowledge about database too, as this give us a complete control on the overall working of our application.
There are few things which I think we need to take care whenever we play around with the database. Please make sure that you give it a look, may be you find something that can help you too.
Indexing
1. Periodically, run the Database Engine Tuning Advisor against current Profiler traces to identify potentially missing indexes.
2. As a rule of thumb, every table should have at least a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases — such as an identity column, or some other column where the value is increasing — and is unique. In many cases, the primary key is the ideal column for a clustered index.
3. Since you can only create one clustered index per table, take extra time to carefully consider how it will be used. Consider the type of queries that will be used against the table, and make an educated guess as to which query (the most common one run against the table, perhaps) is the most critical, and if this query will benefit most from having a clustered index.
4. If a column in a table is not at least 95% unique, then most likely the query optimizer will not use a non-clustered index based on that column. Because of this, you generally don't want to add non-clustered indexes to columns that aren't at least 95% unique.
5. Keep the "width" of your indexes as narrow as possible. This reduces the size of the index and reduces the number of disk I/O reads required to read the index, boosting performance.
6. If possible, avoid adding a clustered index to a GUID column (uniqueidentifier data type).
7. Indexes should be considered on all columns that are frequently accessed by the JOIN, WHERE, ORDER BY, GROUP BY, TOP, and DISTINCT clauses.
8. Don't automatically add indexes on a table because it seems like the right thing to do. Only add indexes if you know that they will be used by queries run against the table.
9. When creating indexes, try to make them unique indexes if at all possible. SQL Server can often search through a unique index faster than a non-unique index because in a unique index, each row is unique, and once the needed record is found, SQL Server doesn't have to look any further.
10. If you perform regular joins between two or more tables in your queries, performance will be optimized if each of the joined columns have appropriate indexes.
11. Don’t automatically accept the default value of 100 for the fill factor for your indexes. It may or may not best meet your needs. A high fill factor is good for seldom changed data, but highly modified data needs a lower fill factor to reduce page splitting.
12. Don't over index your OLTP tables, as every index you add increases the time it takes to perform INSERTS, UPDATES, and DELETES. There is a fine line between having the ideal number of indexes (for SELECTs) and the ideal number to minimize the overhead that occurs with indexes during data modifications.
13. If you know that your application will be performing the same query over and over on the same table, consider creating a non-clustered covering index, or a non-clustered index with included columns, on the table. A covering index, which is a form of a composite index, includes all of the columns referenced in SELECT, JOIN, and WHERE clauses of a query. Because of this, the index contains the data you are looking for and SQL Server doesn't have to look up the actual data in the table, reducing logical and/or physical I/O, and boosting performance. A non-clustered index with included columns is similar to a covering index, except that the extra columns you add to cover the query are not part of the index.
Queries and Stored Procedures
1. Keep transactions as short as possible. This reduces locking and increases application concurrently, which helps to boost performance.
2. Avoid using query hints unless you know exactly what you are doing, and you have verified that the hint actually boosts performance.
3. Encapsulate all transactions within stored procedures, including both the BEGIN TRANSACTION and COMMIT TRANSACTION statements in the procedure.
4. Whenever a client application needs to send Transact-SQL to SQL Server, send it in the form of a stored procedure instead of a script or embedded Transact-SQL. Stored procedures offer many benefits, including:
a. Reduced network traffic and latency, boosting application performance.
b. Stored procedure execution plans can be reused, staying cached in SQL Server's memory, reducing server overhead. This is also mostly true for Transact-SQL code sent to SQL Server outside of a stored procedure.
c. Client execution requests are more efficient. For example, if an application needs to INSERT a large binary value into an image data column not using a stored procedure, it must convert the binary value to a character string (which doubles its size), and send it to SQL Server. When SQL Server receives it, it then must convert the character value back to the binary format. This is a lot of wasted overhead. A stored procedure eliminates this issue as parameter values stay in the binary format all the way from the application to SQL Server, reducing overhead and boosting performance.
d. Stored procedures help promote code reuse. While this does not directly boost an application's performance, it can boost the productivity of developers by reducing the amount of code required, along with reducing debugging time.
e. Stored procedures can encapsulate logic. You can change stored procedure code without affecting clients (assuming you keep the parameters the same and don't remove any result sets columns). This saves developer time.
f. Stored procedures provide better security for your data.
5. SET NOCOUNT ON at the beginning of each stored procedure you write.
6. Before you are done with your stored procedure code, review it for any unused code, parameters, or variables that you may have forgotten to remove while you were making changes, and remove them.
7. For best performance, all objects that are called within the same stored procedure should all be owned by the same object owner or schema, and should also be referred to in the format of object_owner.object_name or schema_owner.object_name.
8. One way to help ensure that a stored procedures query plans are reused from execution to execution of the same stored procedure is to ensure that any user connections information, SET options, database options, or SQL Server configuration options don't change from execution to execution of the same stored procedure. If they do change, then SQL Server may consider the same stored procedures to be different, and not be able to reuse the current query plan stored in cache.
Transact-SQL
1. Don't be afraid to make liberal use of in-line and block comments in your Transact-SQL code, they will not affect the performance of your application and they will enhance your productivity when you or others come back to the code and try to modify it.
2. If possible, avoid using SQL Server cursors. They generally use a lot of SQL Server resources and reduce the performance and scalability of your applications. If you need to perform row-by-row operations, try to find another method to perform the task.
3. When using the UNION statement, keep in mind that, by default, it performs the equivalent of a SELECT DISTINCT on the final result set. In other words, UNION takes the results of two like recordsets, combines them, and then performs a SELECT DISTINCT in order to eliminate any duplicate rows. This process occurs even if there are no duplicate records in the final recordset. If you know that there are duplicate records, and this presents a problem for your application, then by all means use the UNION statement to eliminate the duplicate rows.
4. Carefully evaluate whether your SELECT query needs the DISTINCT clause or not. Some developers automatically add this clause to every one of their SELECT statements, even when it is not necessary.
5. In your queries, don't return column data you don't need. For example, you should not use SELECT * to return all the columns from a table if you don't need all the data from every column. In addition, using SELECT * may prevent the use of covering indexes, further potentially hurting query performance.
6. Always include a WHERE clause in your SELECT statement to narrow the number of rows returned. Only return those rows you need.
7. If your application allows users to run queries, but you are unable in your application to easily prevent users from returning hundreds, even thousands of unnecessary rows of data, consider using the TOP operator within the SELECT statement. This way, you can limit how many rows are returned, even if the user doesn't enter any criteria to help reduce the number or rows returned to the client.
8. Try to avoid WHERE clauses that are non-sargable. If a WHERE clause is sargable, this means that it can take advantage of an index (assuming one is available) to speed completion of the query. If a WHERE clause is non-sargable, this means that the WHERE clause (or at least part of it) cannot take advantage of an index, instead performing a table/index scan, which may cause the query's performance to suffer. Non-sargable search arguments in the WHERE clause, such as "IS NULL", "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE", and "LIKE '%500'" generally prevents (but not always) the query optimizer from using an index to perform a search. In addition, expressions that include a function on a column, expressions that have the same column on both sides of the operator, or comparisons against a column (not a constant), are not sargable. 9. If you are using the SQL Server 2008 debugger, only run it against a test database, never a production database.
For those who are not aware of what Hermes package is, I can write another post to explain it in detail. But as of what I can say is that its executing all queries in one go by putting them in a file and can be triggered from a command prompt.(well command prompt is one of the options) and then you get a log file which gives you complete details of the execution.
Now the issue was after running the Hermes package, one of our procedure started having an issue, it was like whenever we execute the procedure even with the parameters which will return Zero results, the procedure just kept on running. There were no Physical I/O made(checked through Enterprise manager -> Current Activity) but the procedure used to keep on running, consuming more and more CPU time. This was the problem, whenever a user tries to run a report which uses this procedure, a new database activity was triggered which actually never ends and when these number of connections were left open for a good amount of time the server used to stop responding or a long delay can be seen in processing each request.
It was quit strange as the rest of the procedures, even the new one's were running fine.
After brain storming with our DBA's we found that the table from where we were deleting the records were being used in our procedure. Now i didn't mentioned it that the procedure only contains a Select query with multiple joins. No update, no delete, just a select.
The table was containing approx 52 million records, and we were deleting approx 47million records from it. I thought that if we are deleting many records from tables then its good if we recreated our indexes as it helps for new index pages to be created and hence will boost the performance. Though we included the script of recreating the indexes in our package it was not helping, our procedure were still having the same issue. Then we found out this UPDATE STATISTICS.
Updating statistics of the tables helped us to resolve the issue. To know more about it i will recommend you to go through the link above.
I am an asp.net programmer and I am reponsible more to optimize my code, as DBA's are there to do their job. But I strongly suggest that we (techie's) should have knowledge about database too, as this give us a complete control on the overall working of our application.
There are few things which I think we need to take care whenever we play around with the database. Please make sure that you give it a look, may be you find something that can help you too.
Indexing
1. Periodically, run the Database Engine Tuning Advisor against current Profiler traces to identify potentially missing indexes.
2. As a rule of thumb, every table should have at least a clustered index. Generally, but not always, the clustered index should be on a column that monotonically increases — such as an identity column, or some other column where the value is increasing — and is unique. In many cases, the primary key is the ideal column for a clustered index.
3. Since you can only create one clustered index per table, take extra time to carefully consider how it will be used. Consider the type of queries that will be used against the table, and make an educated guess as to which query (the most common one run against the table, perhaps) is the most critical, and if this query will benefit most from having a clustered index.
4. If a column in a table is not at least 95% unique, then most likely the query optimizer will not use a non-clustered index based on that column. Because of this, you generally don't want to add non-clustered indexes to columns that aren't at least 95% unique.
5. Keep the "width" of your indexes as narrow as possible. This reduces the size of the index and reduces the number of disk I/O reads required to read the index, boosting performance.
6. If possible, avoid adding a clustered index to a GUID column (uniqueidentifier data type).
7. Indexes should be considered on all columns that are frequently accessed by the JOIN, WHERE, ORDER BY, GROUP BY, TOP, and DISTINCT clauses.
8. Don't automatically add indexes on a table because it seems like the right thing to do. Only add indexes if you know that they will be used by queries run against the table.
9. When creating indexes, try to make them unique indexes if at all possible. SQL Server can often search through a unique index faster than a non-unique index because in a unique index, each row is unique, and once the needed record is found, SQL Server doesn't have to look any further.
10. If you perform regular joins between two or more tables in your queries, performance will be optimized if each of the joined columns have appropriate indexes.
11. Don’t automatically accept the default value of 100 for the fill factor for your indexes. It may or may not best meet your needs. A high fill factor is good for seldom changed data, but highly modified data needs a lower fill factor to reduce page splitting.
12. Don't over index your OLTP tables, as every index you add increases the time it takes to perform INSERTS, UPDATES, and DELETES. There is a fine line between having the ideal number of indexes (for SELECTs) and the ideal number to minimize the overhead that occurs with indexes during data modifications.
13. If you know that your application will be performing the same query over and over on the same table, consider creating a non-clustered covering index, or a non-clustered index with included columns, on the table. A covering index, which is a form of a composite index, includes all of the columns referenced in SELECT, JOIN, and WHERE clauses of a query. Because of this, the index contains the data you are looking for and SQL Server doesn't have to look up the actual data in the table, reducing logical and/or physical I/O, and boosting performance. A non-clustered index with included columns is similar to a covering index, except that the extra columns you add to cover the query are not part of the index.
Queries and Stored Procedures
1. Keep transactions as short as possible. This reduces locking and increases application concurrently, which helps to boost performance.
2. Avoid using query hints unless you know exactly what you are doing, and you have verified that the hint actually boosts performance.
3. Encapsulate all transactions within stored procedures, including both the BEGIN TRANSACTION and COMMIT TRANSACTION statements in the procedure.
4. Whenever a client application needs to send Transact-SQL to SQL Server, send it in the form of a stored procedure instead of a script or embedded Transact-SQL. Stored procedures offer many benefits, including:
a. Reduced network traffic and latency, boosting application performance.
b. Stored procedure execution plans can be reused, staying cached in SQL Server's memory, reducing server overhead. This is also mostly true for Transact-SQL code sent to SQL Server outside of a stored procedure.
c. Client execution requests are more efficient. For example, if an application needs to INSERT a large binary value into an image data column not using a stored procedure, it must convert the binary value to a character string (which doubles its size), and send it to SQL Server. When SQL Server receives it, it then must convert the character value back to the binary format. This is a lot of wasted overhead. A stored procedure eliminates this issue as parameter values stay in the binary format all the way from the application to SQL Server, reducing overhead and boosting performance.
d. Stored procedures help promote code reuse. While this does not directly boost an application's performance, it can boost the productivity of developers by reducing the amount of code required, along with reducing debugging time.
e. Stored procedures can encapsulate logic. You can change stored procedure code without affecting clients (assuming you keep the parameters the same and don't remove any result sets columns). This saves developer time.
f. Stored procedures provide better security for your data.
5. SET NOCOUNT ON at the beginning of each stored procedure you write.
6. Before you are done with your stored procedure code, review it for any unused code, parameters, or variables that you may have forgotten to remove while you were making changes, and remove them.
7. For best performance, all objects that are called within the same stored procedure should all be owned by the same object owner or schema, and should also be referred to in the format of object_owner.object_name or schema_owner.object_name.
8. One way to help ensure that a stored procedures query plans are reused from execution to execution of the same stored procedure is to ensure that any user connections information, SET options, database options, or SQL Server configuration options don't change from execution to execution of the same stored procedure. If they do change, then SQL Server may consider the same stored procedures to be different, and not be able to reuse the current query plan stored in cache.
Transact-SQL
1. Don't be afraid to make liberal use of in-line and block comments in your Transact-SQL code, they will not affect the performance of your application and they will enhance your productivity when you or others come back to the code and try to modify it.
2. If possible, avoid using SQL Server cursors. They generally use a lot of SQL Server resources and reduce the performance and scalability of your applications. If you need to perform row-by-row operations, try to find another method to perform the task.
3. When using the UNION statement, keep in mind that, by default, it performs the equivalent of a SELECT DISTINCT on the final result set. In other words, UNION takes the results of two like recordsets, combines them, and then performs a SELECT DISTINCT in order to eliminate any duplicate rows. This process occurs even if there are no duplicate records in the final recordset. If you know that there are duplicate records, and this presents a problem for your application, then by all means use the UNION statement to eliminate the duplicate rows.
4. Carefully evaluate whether your SELECT query needs the DISTINCT clause or not. Some developers automatically add this clause to every one of their SELECT statements, even when it is not necessary.
5. In your queries, don't return column data you don't need. For example, you should not use SELECT * to return all the columns from a table if you don't need all the data from every column. In addition, using SELECT * may prevent the use of covering indexes, further potentially hurting query performance.
6. Always include a WHERE clause in your SELECT statement to narrow the number of rows returned. Only return those rows you need.
7. If your application allows users to run queries, but you are unable in your application to easily prevent users from returning hundreds, even thousands of unnecessary rows of data, consider using the TOP operator within the SELECT statement. This way, you can limit how many rows are returned, even if the user doesn't enter any criteria to help reduce the number or rows returned to the client.
8. Try to avoid WHERE clauses that are non-sargable. If a WHERE clause is sargable, this means that it can take advantage of an index (assuming one is available) to speed completion of the query. If a WHERE clause is non-sargable, this means that the WHERE clause (or at least part of it) cannot take advantage of an index, instead performing a table/index scan, which may cause the query's performance to suffer. Non-sargable search arguments in the WHERE clause, such as "IS NULL", "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE", and "LIKE '%500'" generally prevents (but not always) the query optimizer from using an index to perform a search. In addition, expressions that include a function on a column, expressions that have the same column on both sides of the operator, or comparisons against a column (not a constant), are not sargable. 9. If you are using the SQL Server 2008 debugger, only run it against a test database, never a production database.
Saturday, 6 December 2008
Application Performance Freak - My first post ever...
Hi, This is my first blog post. I have been in IT for around 5 years now and working on SQL Server and ASP.Net C#. I am quite a performance freak, so i always try to use the best possible solution available, which can enhances my application performance. It helps me to be a bit different from my colleagues and make sure that my users should not be bothered about applications performance.
Google can be said as a best friend to any IT guy. You have hell lot of things available on net, people are here to help, give you suggestions share their experience and all of these are quite useful to understand and implement in your projects and other works, and google helps us to find all this...that's cool innit.
My main objective behind this blog is post tips, tricks, case studies or any thing else that helps me or that might be helpful to other to implement it.
My source of information will be my learning's, useful articles, different blog post or website which can help us in any way to improve our overall applications performance.
If you think you know any source of info that can help us do let me know.
Thanks,
Nimish Tandon.
Google can be said as a best friend to any IT guy. You have hell lot of things available on net, people are here to help, give you suggestions share their experience and all of these are quite useful to understand and implement in your projects and other works, and google helps us to find all this...that's cool innit.
My main objective behind this blog is post tips, tricks, case studies or any thing else that helps me or that might be helpful to other to implement it.
My source of information will be my learning's, useful articles, different blog post or website which can help us in any way to improve our overall applications performance.
If you think you know any source of info that can help us do let me know.
Thanks,
Nimish Tandon.
Labels:
application,
asp.net,
C#,
performance,
sql server,
storedprocedure
Subscribe to:
Posts (Atom)