Monday, May 4, 2009

Clustered Indexes && NON Clustered Indexes

Clustered Indexes
A clustered index stores the actual data rows at the leaf level of the index. Returning to the example above, that would mean that the entire row of data associated with the primary key value of 123 would be stored in that leaf node. An important characteristic of the clustered index is that the indexed values are sorted in either ascending or descending order. As a result, there can be only one clustered index on a table or view. In addition, data in a table is sorted only if a clustered index has been defined on a table.

Note: A table that has a clustered index is referred to as a clustered table. A table that has no clustered index is referred to as a heap.

Non-clustered Indexes
Unlike a clustered indexed, the leaf nodes of a non-clustered index contain only the values from the indexed columns and row locators that point to the actual data rows, rather than contain the data rows themselves. This means that the query engine must take an additional step in order to locate the actual data.

A row locator’s structure depends on whether it points to a clustered table or to a heap. If referencing a clustered table, the row locator points to the clustered index, using the value from the clustered index to navigate to the correct data row. If referencing a heap, the row locator points to the actual data row.

Non-clustered indexes cannot be sorted like clustered indexes; however, you can create more than one non-clustered index per table or view. SQL Server 2005 supports up to 249 non-clustered indexes, and SQL Server 2008 support up to 999. This certainly doesn’t mean you should create that many indexes. Indexes can both help and hinder performance, as I explain later in the article.

In addition to being able to create multiple non-clustered indexes on a table or view, you can also add included columns to your index. This means that you can store at the leaf level not only the values from the indexed column, but also the values from non-indexed columns. This strategy allows you to get around some of the limitations on indexes. For example, you can include non-indexed columns in order to exceed the size limit of indexed columns (900 bytes in most cases).

Index Types
In addition to an index being clustered or non-clustered, it can be configured in other ways:

Composite index: An index that contains more than one column. In both SQL Server 2005 and 2008, you can include up to 16 columns in an index, as long as the index doesn’t exceed the 900-byte limit. Both clustered and non-clustered indexes can be composite indexes.
Unique Index: An index that ensures the uniqueness of each value in the indexed column. If the index is a composite, the uniqueness is enforced across the columns as a whole, not on the individual columns. For example, if you were to create an index on the FirstName and LastName columns in a table, the names together must be unique, but the individual names can be duplicated.
A unique index is automatically created when you define a primary key or unique constraint:


Primary key: When you define a primary key constraint on one or more columns, SQL Server automatically creates a unique, clustered index if a clustered index does not already exist on the table or view. However, you can override the default behavior and define a unique, non-clustered index on the primary key.
Unique: When you define a unique constraint, SQL Server automatically creates a unique, non-clustered index. You can specify that a unique clustered index be created if a clustered index does not already exist on the table.
Covering index: A type of index that includes all the columns that are needed to process a particular query. For example, your query might retrieve the FirstName and LastName columns from a table, based on a value in the ContactID column. You can create a covering index that includes all three columns.
Microsoft SQL Server supports two types of indexes:

Clustered indexes define the physical sorting of a database table’s rows in the storage media. For this reason, each database table may have only one clustered index. If a PRIMARY KEY constraint is created for a database table and no clustered index currently exists for that table, SQL Server automatically creates a clustered index on the primary key.
Non-clustered indexes are created outside of the database table and contain a sorted list of references to the table itself. SQL Server 2000 supports a maximum of 249 non-clustered indexes per table. However, it’s important to keep in mind that non-clustered indexes slow down the data modification and insertion process, so indexes should be kept to a minimum
One of the hardest tasks facing database administrators is the selection of appropriate columns for non-clustered indexes. You should consider creating non-clustered indexes on any columns that are frequently referenced in the WHERE clauses of SQL statements. Other good candidates are columns referenced by JOIN and GROUP BY operations.

You may wish to also consider creating non-clustered indexes that cover all of the columns used by certain frequently issued queries. These queries are referred to as “covered queries” and experience excellent performance gains.

Advantages:

Indexes allow you to speed query performance on commonly used columns and improve the overall processing speed of your database.

Disadvantages:

Although indexes can reduce access time significantly, they can also have adverse effects on performance. Before you create indexes, consider the effects of multiple indexes on disk space and processing time:

· Each index requires storage or disk space. The exact amount depends on the size of the table and the size and number of columns in the index.

· Each INSERT or DELETE operation performed on a table requires additional updating of each index on that table. This is also true for each UPDATE operation that changes the value of an index key.

· The LOAD utility rebuilds or appends to any existing indexes.

The index free space MODIFIED BY parameter can be specified on the LOAD command to override the index PCTFREE used when the index was created.

· Each index potentially adds an alternative access path for a query for the optimizer to consider, which increases the compilation time.

Choose indexes carefully to address the needs of the application program.

Difference B/W Clustered and Non-Clustered Index:

There are clustered and non-clustered indexes. A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages.

A non-clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a non-clustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.

Optimization tips

Every index increases the time in takes to perform INSERTS, UPDATES and DELETES, so the number of indexes should not be very much. Try to use maximum 4-5 indexes on one table, not more. If you have read-only table, then the number of indexes may be increased.
Keep your indexes as narrow as possible. This reduces the size of the index and reduces the number of reads required to read the index.
Try to create indexes on columns that have integer values rather than character values.
If you create a composite (multi-column) index, the order of the columns in the key is very important. Try to order the columns in the key as to enhance selectivity, with the most selective columns to the leftmost of the key.
If you want to join several tables, try to create surrogate integer keys for this purpose and create indexes on their columns.
Create surrogate integer primary key (identity for example) if your table will not have many insert operations.
Clustered indexes are more preferable than non-clustered, if you need to select by a range of values or you need to sort results set with GROUP BY or ORDER BY.
If your application will be performing the same query over and over on the same table, consider creating a covering index on the table.
You can use the SQL Server Profiler Create Trace Wizard with “Identify Scans of Large Tables” trace to determine which tables in your database may need indexes. This trace will show which tables are being scanned by queries instead of using an index.
You can use the sp_MSforeachtable undocumented stored procedure to rebuild all indexes in your database. Try to schedule it to execute during CPU idle time and slow production periods.
sp_MSforeachtable @command1=”print ‘?’ DBCC DBREINDEX (’?')”

No comments: