Maintain indexes in a SQL Server 2005 database
In this tutorial you will learn about maintaining Indexes in a SQL Server 2005 database – Dealing with Fragmented indexes, Reorganizing an Index, Rebuilding an Index, Disabling Non-clustered Indexes to Reduce Disk Space During Rebuild Operations, Rebuilding Large Indexes, Setting Index Options, Page Splits and Performance Considerations, Max Degree of Parallelism, Online Index Operations, Partition Index Operations, Statistical Information, Asynchronous Statistics Updates, Disabling Automatic Statistics, Statistics after Upgrading a Database to SQL Server 2005, Bulk copy options and Index operation logging.
The DBA has to ensure optimal performance of the database. One of the key elements in database maintenance is to ensure that minimum disk I/O is performed in queries. Making good indexes is a means to this end. The activities a database administrator performs to maintain his indexes are:
- Reorganizing and rebuilding indexes. This process involves defragmenting of indexes and optimization of disk space.
- Using the Fill factor option to fine tune the index data storage and performance.
- Performing index operations online by using the LINE index option to provide user access to data during index operations
- Configuring parallel Index operations by using the max degree of parallelism option.
- Using the Query optimizer to develop optimal query plans based on statistics.
- Improving the performance of bulk copy operations on tables with clustered and non clustered indexes.
- Selecting suitable Recovery model of the database for index operations and listing the types of logging available.
Dealing with Fragmented indexes
Though the SQL Server 2005 data base engine automatically maintains indexes whatever the operations made on the database, the modifications can cause a degradation of the index over time. This is turn will degrade query performance. To help the DBA over come the problems of fragmented indexes, SQL Server 2005 provides an option of reorganizing or rebuilding the index. This option can be used on whole indexes or partitioned indexes.
Fragmentation of an index can be recognized by a process of analysis using the sys.dm_db_index_physical_stats function. This function detects fragmentation of a particular index or all indexes in the database. For partitioned indexes, the information is provided for each partition. The results of this operation are displayed in the AvgFragmentation column. The AvgFragmentSize column describes the average number of pages in one fragment in an index. Once the fragmentation size and value is known it has to be evaluated against the parameters for correction provided with SQL Server 2005. If the AvgFragmentation value is <=30% the corrective would be to ALTER INDEX REORGANIZE else the corrective would be to ALTER INDEX REBUILD WITH (ONLINE=ON). The first option is always done online while the second option can be used online or offline. The syntax would be as under:
USE Exforsys;
GO
SELECT IndexName, AvgFragmentation FROM sys.dm_db_index_physical_stats(N’Employeetransact.EmployeeID’, DEFAULT, DEFAULT, N’Detailed’);
GO
Reorganizing an Index
In SQL Server 2005 the ALTER INDEX REORGANIZE statement has replaced the DBCC INDEXDEFRAG statement. A single partition of a partitioned index can be reorganized using this statement.
When an index is reorganized the leaf level of the clustered and non clustered indexes on tables and views are reorganized and reordered to match the logical order—i.e. left to right of the leaf nodes. The index is organized within the allocated pages and if they span more than one file they are reorganized one at a time. No pages are migrated between files. Moreover, pages are compacted and empty pages created as a consequence are removed and the disk space released. The compaction is determined by the fill factor value in sys.indexes catalog view. Large object data types contained in clustered index or underlying tables will also be compacted by default if the LOB clause is present..
The good news is that the reorganize process is economical on the system resources and is automatically performed online. There are no long term blocking locks which jam up the works!
DBAs are advised to reorganize the index when it is minimally fragmented. Heavily fragmented indexes will require rebuilding.
Rebuilding an Index
When an index is rebuilt, it is dropped and a new one is created. In the process fragmentation is removed and disk space is reclaimed. The fill factor setting is used to reorder the pages after compacting in a sequential order. Performance is improved and number of page reads is reduced. The following methods are used to drop and rebuild the index.
1. ALTER INDEX with the REBUILD clause.
2. CREATE INDEX with the DROP_EXISTING clause.
Each of these functions have their own advantages and disadvantages.
Disabling Non-clustered Indexes to Reduce Disk Space During Rebuild Operations
When a rebuilding operation is performed it is a best practice to disable the non-clustered indexes. Diabling the non-clustered index implies that the data rows are deleted but the definition is retained in metadata. The index is enabled after it is rebuilt.
Rebuilding Large Indexes
Indexes which have more than 128 extents must be rebuilt in two phases—the logical and the physical. The logical phase the allocation units of the index are marked for deallocation, the data rows are copied and sorted before they are moved to new allocation units in the rebuilt index. The physical phase involves dropping of the allocation units marked for deallocation in short transaction without locks.
Setting Index Options
When reorganizing an index, the index options cannot be specified as a rule. However, the ALTER INDEX REBUILD and CREATE INDEX WITH DROP_EXISTING allows users set the options such as PAD_INDEX, FILLFACTOR, SORT_IN_TEMPDB, IGONORE_DUP_KEY and STATISTICS_NORECOMPUT etc. Additionally the ALTER INDEX statement allows the specification of ALLOW_PAGE_LOCKS and ALLOW_ROW_LOCKS
Page Splits and Performance Considerations
A page split is a process by which a new row added to a table is some rows in the table are moved to a new page to make room for the new records. This happens during reorganization at the risk of being resource intensive and causing fragmentation. To reduce the risk the fill factor has to be correctly selected. Else the index will have to be frequently rebuilt.
Max Degree of Parallelism
When several processors are used to perform the scan and sort operations on indexes, the number of processors to be used is to be specified. This can be specified in the configuration option max degree of parallelism and also by the current workload. This option limits the number of processors that can be used in parallel. This option is available only for the Enterprise version of the edition.
When the DBA wants to manually configure the number of processors that can be used to run the index statement the MAXDOP index statement is used. This limits the number of processors to be used during an index operation of the query optimizer. This option overrides the max degree of parallelism option for the query. The MAXDOP index option cannot be specified for the ALTER INDEX REORGANIZE statement.
Online Index Operations
Concurrent index operations can be performed during Online index operations. The MAXDOP operation can be used ot control the number of processors dedicated ot online index operations.
Partition Index Operations
Partition index operations can be memory resource intensive if the query optimizer applies degrees of parallelism.
Statistical Information
Statistical information can be created about the distribution of values in a column. This is used by the query optimizer to determine the optimal query plan by estimating the cost of using an index to evaluate a query. These values are sorted by the database engine on which the statistics is being built and a histogram is created for a maximum of 200 values separated by intervals. Additional information is introduced on statistics created on char, varchar, varchar(max), nchar, nvarchar(max), text and ntext columns. This is known as string summary. The string summary helps the query optimizer estimate the selectivity of query predicates on string patterns. This makes for accurate estimates of result set sizes and frequently better query plans.
When the query optimizer is configured to automatically store statistical information about indexed columns, statistics on the columns are automatically generated without indexes that are used in a predicate.
Asynchronous Statistics Updates
The AUTO_UPDATE_STATISTICS_ASYNC option can be used to ensure that the query optimizer is prevented from returning a result set while it waits for the out of date statistics to be updated and compiled. The out of date statistics are put on queue for updating by a worker thread in a background process and the query and the concurrent queries compile immediately. This option is to be set at database level and determines the update method all statistics at the database.
Disabling Automatic Statistics
Automatic statistics can be disabled for a particular column or index by using the sp_autostats system stored procedure or the STATISTICS_NORECOMPUTE clause of the CREATE INDEX statement. There are other clauses also which can be used with the update statement etc to prevent automatic generation of statistics. However, the statistics can be updated manually using the sp_createstats system stored procedure.
Statistics after Upgrading a Database to SQL Server 2005
When the user upgrades the version of SQL server the statistics of the earlier version is treated as out of date. On first use, the statistics will have to be updated using the AUTO_UPDATE_STATISTICS database option.
Bulk copy options and Index operation logging.
Bulk copy options are useful to copy data into a table without a non clustered index. Logging the index operations minimally during this process makes it more efficient and reduces the possibility of the index operation filling the log. However the option depends on whether the table is indexed or not and whether the table is empty or not. If the table is empty, both data and index pages are minimally logged. If the table has no clustered index, data pages are always minimally logged. If the table is empty, the index pages are minimally logged. In non empty tables index pages are fully logged.
In the next section we shall look at the use of the Database Tuning Advisor.