Cluster computing

Tuesday, January 21, 2014

The following are some of the tips to improve SQL Server performance based on what has worked before:
1)Normalize logical database design:
Reasonable normalization of the logical database design yields best performance. In a normalized database, there's usually a greater number of narrow tables while there's fewer and wider tables in a de-normalized database. When there are many joins to make in a normalized database, it can hurt performance. However, the optimizer is efficient at selecting joins when indexes are available.
Note that normalization brings in fewer and narrower indexes, fewer NULLs and less redundant data, allows more clustered indexes and facilitates sorting and index creation. It also requires less locks because the data locked is lesser. Typically four-way or greater joins is indicative of when there has been too much normalization.
2) Use Effective Index design
Indexes are usually left out of logical database design and is usually associated with physical database design. Indexes can be dropped, added and changed without affecting the database schema. The optimizer chooses the best index. By properly designing the index and the optimizer selecting the best index, we can improve performance. We do this with:
a) Examine the where clause of the SQL queries, because this is the primary focus of the optimizer
Pick out the slow queries and for each column in the where clause, consider a possible candidate for index.
b) use narrow indexes which are more effective than multi-column compound indexes with fewer index levels. Narrower indexes provide better performance over a wide range of queries.
c) use clustered indexes
Update and Delete operations are often accelerated by clustered indexes because they involve a lot of reading. Queries that return a range of values are often improved with a clustered index where as a non-clustered index helps with fewer queries.
d) examine column uniqueness - finding the number of unique values in a column. If they are a handful, there should not be an index. If it is far less than the number of rows, consider a clustered index. If there are more, there can be a non-clustered index.
e) examine data distribution in indexed columns:
A long running query occurs because a column with few unique values is indexed, or a join on such a column is performed. When the data is not distributed properly, it increases the page I/O, typically one page I/O per non-clustered index at which point, it becomes efficient to scan the entire table. Distribution can be found with say group by and having count(*) > 1
The availability of an index helps an optimizer in its job to make a decision whether to use indexed retrieval or not for improved performance.
3) Use efficient query design
Some types of queries are inherently resource intensive such as
Large result sets
IN, NOT IN, and OR queries
highly non-unique where clauses.
!= comparision operators
certain column functions, such as SUM
expressions or data conversions in where clause
local variables in where clause
complex views with Group by
To mitigate the resource usage, consider restricting the result set. For example, we can add a predicate saying where zip = '98052'

Cluster computing

Tuesday, January 21, 2014

No comments:

Post a Comment