Disk
Space Considerations
Additional
temporary disk space is required for online operations. If a clustered index is
created, rebuilt, or dropped online, a temporary non-clustered index is created
to map old bookmarks to new bookmarks.
If the
SORT_IN_TEMPDB option is set to ON, this temporary index is created in tempdb.
If SORT_IN_TEMPDB is set to OFF, the same filegroup or partition scheme as the
target index is used. The temporary mapping index contains one record for each
row in the table, and its content is the union of the old and new bookmark
columns, including uniqueifiers and record identifiers, and including
only a single copy of any column used in both bookmarks.
Online
index operations use row versioning to isolate the index operation from the
effects of modifications made by other transactions. This avoids the need for
requesting share locks on rows that have been read. Concurrent user update and
delete operations during online index operations require space for version
records in tempdb.
Performance
Considerations
Online
index operations are typically slower than equivalent offline index operations
due to the extra steps required to allow online operations. Heavy update
activity may further impede the online index operation as well. Online index
operations are fully logged, potentially leading to decreased performance
compared to bulk-logged operations.
Because
both the source and target structures are maintained during the online index
operation, the resource usage for insert, update, and delete transactions is
increased, potentially up to double the amount of usage. This could cause a
decrease in performance and greater resource usage, especially CPU time, during
the index operation.
Online
index operations may use multiple processors on a multiprocessor system with
SQL Server Enterprise Edition. You can use the MAXDOP index option to
control the number of processors dedicated to the online index operation. In
this way, you can balance the resources that are used by index operation with
those of the concurrent users.
Transaction
Log Considerations
Online
index operations require a large amount of transaction space. The space
required depends on the size of the index. In addition to the index operations,
concurrent online operations also consume log space. From a performance
perspective, it is important to consider the extra transaction log space usage
and pre-size the log file accordingly to avoid expansion during index or user
operations.
For
PeopleSoft applications that have high uptime requirements, online index
operations are recommended. However, they should be scheduled during times of
low user or batch activity to avoid performance degradation.
Statistics
Statistics
are details about the uniqueness (or density) of the data values, including a
histogram consisting of an even sampling of the values for the index key (or
the first column of the key for a composite index) based on the current data.
It also includes the number of pages in the table or index. SQL Server uses a
cost-based optimizer, which means that if the statistics are not relatively
current, it is misleading to the optimizer and can result in poor execution
plans.
AUTO_CREATE_STATISTICS
and AUTO_UPDATE_STATISTICS Options
For
PeopleSoft applications it is recommended that you enable the AUTO_CREATE_ STATISTICS
and AUTO_UPDATE_STATISTICS database options at the database level. When these
options are enabled, SQL Server automatically updates statistics when the query
optimizer determines that they are out of date. The AUTO_UPDATE_STATISTICS
option will automatically update the statistics for a table when a specific
change threshold has been reached. The sysindexes.rowmodctr column
maintains a running total of all relevant modifications to a table. This
counter is updated each time any of the following events occurs:
• A row is
inserted into the table.
• A row is
deleted from the table.
• An
indexed column is updated.
Every
database is created with the database options AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS
set to TRUE. For PeopleSoft applications, it is recommended that you leave
these options set unless there is a compelling reason to disable them. If the
options must be disabled in exceptional cases, they should only be disabled at
a table level so that they do not affect the entire database.
Disabling
AUTO_UPDATE_STATISTICS at the Table Level
If you do
not want statistics to be automatically updated during the normal operational
hours for a specific table, you can disable the option at the table or index
level. You then take the responsibility of maintaining statistics for that
table or index by explicitly updating statistics. At a table level, the
AUTO_UPDATE_STATISTICS option can be disabled using either the sp_autostats stored
procedure or the UPDATE STATISTICS statement with the WITH NORECOMPUTE option.
Use the sp_autostats
procedure to indicate that statistics should or should not be updated
automatically for a table.
For
example, to disable automatic updating of statistics for all the indexes on the
table PS_BO:
sp_autostats
PS_BO, 'OFF'
For
example, to disable automatic updating of statistics for a specific index on
the table PS_BO:
sp_autostats
PS_BO, 'OFF', PSABO
Alternatively,
use the UPDATE STATISTICS statement with the WITH NORECOMPUTE option. This
indicates that statistics should not be automatically recomputed in the future.
Running UPDATE STATISTICS again without the WITH NORECOMPUTE option enables
automatic updates again. For example:
UPDATE
STATISTICS PS_BO WITH NORECOMPUTE
Note. Setting the AUTO_UPDATE_STATISTICS
database option to FALSE overrides any individual table settings.
User-Created
Statistics
If a
particular column in a table is not a leading column (the first column) in any
indexes of that table, histograms will not be available on that column by
default. If the AUTO_CREATE_STATISTICS database option is set to ON for the
database or table, SQL Server may create statistics and histogram for that
column as needed. Users can also explicitly create statistics on a table
column. This creates a histogram on the first supplied column and associated
density groups (collections) over the supplied column or set of columns, as the
following example demonstrates:
CREATE
STATISTICS BO_FIRST_NAME ON PS_BO_NAME (FIRST_NAME)
Statistics
are used by the query optimizer to estimate the selectivity of expressions, and
thus the size of intermediate and final query results. Good statistics allow
the optimizer to accurately assess the cost of different query plans and choose
a high-quality plan.
User-created
statistics are required for very few advanced performance tuning scenarios. The
statistics created by SQL Server are usually sufficient for the optimizer to
produce efficient execution plans.
Updating
Statistics
Statistics
can be manually updated any time using the UPDATE STATISTICS statement. This
statement updates information about the distribution of key values for one or
more statistics groups (collections) in the specified table.
The
following example updates statistics for all the indexes on PS_BO table, using
a default sampling:
UPDATE
STATISTICS PS_BO
Usually a
default sampling is good enough. However, there were few occasions during
tuning and benchmarking of a PeopleSoft application that the SQL Server
optimizer failed to produce the best execution plan for some SQL statements.
Further
testing showed that updating statistics with FULLSCAN improved the situation.
The
following example updates statistics for all the indexes on PS_BO table, using
a specific sampling:
UPDATE
STATISTICS PS_BO WITH SAMPLE 40 PERCENT
The
following example updates statistics for all the indexes on PS_BO table, using
all the rows:
UPDATE
STATISTICS PS_BO WITH FULLSCAN
Note. Use UPDATE STATISTICS with FULLSCAN
as an exceptional situation, if you believe the optimizer is not picking up the
best execution plan based on the existing statistics.
Statistics
can also be updated on all user-defined tables in the current database, using
the sp_updatestats stored procedure. This may take a very long time to
complete. For example:
USE PSFTDB
EXEC
sp_updatestats
Viewing
Statistics
The DBCC
SHOW_STATISTICS command reports the distribution statistics for a specified
indexed or non-indexed column.
The report
contains the following useful information:
• Date and
time that statistics were collected.
• Number of
rows in the table.
• Rows that
were sampled.
• Number of
steps in the histogram.
• Density
for non-frequent column values.
• Average
key length.
• All
density.
•
Distribution histogram.
The
following is an example of the DBCC SHOW_STATISTICS command and its output: