To generate an
HTML or text AWR report:
1. Run the
awrrpt.sql script from within SQL*Plus and logged in with a user having the DBA
role:
@$ORACLE_HOME/rdbms/admin/awrrpt.sql
2. Specify
whether an HTML or a text report is wanted.
Enter
value for report_type: html
3. Specify the
number of previous days’ snapshots that are to be listed.
Enter
value for num_days: 2
4. A list
displaying the snapshot ID and time the snapshot was generated will appear.
Enter the beginning and
ending snapshot
ID for the AWR report.
Enter
value for begin_snap :
Enter
value for end_snap:
5. Enter a
output report name or accept the default name:
Enter
value for report_name:
Using the report name awrrpt_1_150_160
6. The AWR report will be generated.
If you need to diagnose a
specific issue, a snapshot can be manually created just before and then again
just after
executing a questionable program. Typically, this is not necessary.
This example shows how to
manually create a snapshot via SQL*Plus:
BEGIN
DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT
();
END;
/
Analyzing the AWR Report
The response
time analysis consists of finding out where time is being spent in a database.
The
response time of
a given transaction in the database always consists of the following two parts:
Response Time = Service Time +
Wait Time
Where Service
Time is CPU consumption by the database and Wait time is the sum of all the
Wait events in the
database.
The most
important part of the AWR report is the Top 5 Timed Events section. With this
list, you can quickly identify
the main areas
to focus on. When CPU usage is much more significant than Wait Time, it is less
likely that
investigating
Wait Events will produce significant savings in response time. Therefore, you
should compare the time
taken by the top five timed events and direct the tuning
effort to the biggest consumers.
SQL
STATISTICS
• SQL Ordered by Elapsed Time: Includes SQL statements that took
significant execution time during processing.
• SQL Ordered by CPU Time: Includes SQL statements that consumed
significant CPU time during processing.
• SQL Ordered by Gets: These SQLs performed a high number of logical reads
while retrieving data.
• SQL Ordered by Reads: These SQLs performed a high number of physical
disk reads while retrieving data.
• SQL Ordered by Parse Calls: These SQLs experienced a high number of
reparsing operations.
• SQL Ordered by Sharable Memory: Includes SQL statement cursors that
consumed a large amount of SGA shared pool memory.
• SQL Ordered by Version Count: These SQLs have a large number of versions
in shared pool for some reason.
Common wait events
The most common I/O related
wait event is “db file sequential read.” It occurs on a single block read for
index data or table data accessed through an index. If this wait event is high,
then tune it as follows:
Find the Top SQL statements
with Physical Reads in the SQL ordered by Reads section. Generate the explain
plan of the SQL statements.
·
If Index Range scans are involved, more blocks than
necessary could be being visited if the index is unselective. By creating a
more selective index, we can access the same table data by visiting fewer index
blocks (and doing fewer physical I/Os).
·
If indexes are fragmented, again we have to visit
more blocks because there is less index data per block. In this case,
rebuilding the index will compact its contents into fewer blocks.
·
If the index being used has a large Clustering
Factor, then more table data blocks have to be visited in order to get the rows
in each Index block. By rebuilding the table with its rows sorted by the
particular index columns we can reduce the Clustering Factor and hence the
number of table data blocks that we have to visit for each index block. For
example, if the table has columns A, B, C & D and the index is on B, D then
we can export the table data in order by B,D, and reload the table.
If there is no particular
SQL statement with a bad execution plan, then one of the following may be happening:
I/Os on particular datafiles
may be being serviced slower due to excessive activity on their disks. In this
case, looking at the File I/O Statistics section of the AWR report will help
find such hot disks and spread out the I/O by manually moving datafiles to
other storage or by making use of Striping, RAID and other technologies to
automatically perform I/O load balancing for us.
If there is no
SQL with suboptimal execution plans and I/O is evenly spread out with similar
response times from all disks
then a larger Buffer Cache may help. In Oracle Database 11g, the Automatic
Memory Management (AMM) feature is introduced to automatically determine the
size of automatic tuning of PGA and SGA with the use of two new parameters
named MEMORY_MAX_TARGET and MEMORY_TARGET.
Another common
I/O related wait event is “db file scattered read”. It occurs when multiblock
reads from disk are performed into
non-contiguous buffers in the Buffer Cache. Such reads are issued for up to the
number of blocks specified by
this parameter DB_FILE_MULTIBLOCK_READ_COUNT at a time. These typically happen
for Full Table Scans and
for Fast Full Index scans. If this wait event is high then the Top SQL
statements with Physical Reads in the SQL
ordered by Reads section can be investigated to see if their execution plans
contain Full Table or Fast Full Index scans. In cases where such multiblock
scans are necessary, it is possible to tune the size of multiblock I/Os
issued by Oracle by setting the instance parameter DB_FILE_MULTIBLOCK_READ_COUNT to be as below:
DB_BLOCK_SIZE x
DB_FILE_MULTIBLOCK_READ_COUNT = max_io_size of system
Starting with
Oracle10g Release 2, the initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT
is now
automatically
tuned to use a default value of 128 when this parameter is not set explicitly.
This value is platformdependent and is 1MB for
most platforms. Because the parameter is expressed in blocks, it will be set to
a value that is equal to
the maximum I/O size that can be performed efficiently divided by the standard
block size.
Another common
I/O related wait event is “control file parallel write'” It occurs when Oracle
is writing physical blocks to all control files and is waiting for the I/Os to
complete. The details of this wait are reported in the Background Wait Events
section. If system wide waits for this wait event are significant then this
either indicates numerous writes to the control file (too many control files
copies), or slow performance of writes to the control files.
Possible
solutions can be:
Reduce the number of control file
copies to the minimum that ensures that not all copies can be lost at the same time.
Enable asynchronous I/O or move the
control files to faster disks.
Another popular
wait event is “log file sync”. It occurs when a user session issues a
COMMIT and is waiting for LGWR to finish
flushing all redo from the log buffer to disk.
To understand
what is delaying the log file sync, there is a need to examine some other wait
events such as “LGWR wait for
redo copy”, “log file parallel write”, “log file single write” and the redo
statistics.
Some general
tuning tips for this wait event are:
• Move all the log members to high speed
disks
• Move log members to low I/O activities
disk controllers
Starting with
Oracle 10gR2, Oracle introduces an Asynchronous Commit. In Oracle 11g,
asynchronous commit is controlled by
the two initialization parameters COMMIT_LOGGING and COMMIT_WAIT to change the commit
behavior. When Oracle transaction issues a commit, three things happen:
1. Oracle
assigns a system change number (SCN) to the committed transaction
2. The log
writer (LGWR) writes the redo information for the transaction from the redo log
buffer to the redo log files
on disk along with its SCN.
3. Oracle
releases all locks held by the transaction and marks it as complete.
The default
behavior of the Commit is to use IMMEDIATE for COMMIT_LOGGING and WAIT for COMMIT_WAIT.
o Immediate vs. Batch: The IMMEDIATE option tells LGWR to write the redo information
of the committing transaction
immediately to disk. If you want LGWR to buffer redo information until it’s
convenient to write then you can
specify option BATCH.
o Wait vs. NoWait: The WAIT option instructs LGWR to wait until the redo
information is successfully
written to
disk before marking the Commit as completed. Most waits on log file sync event
are caused by this wait. If
you don’t want LGWR to wait for the writing of the redo records and mark the
Commit as completed then you can specify option NOWAIT.
As a general rule, systems
where CPU time is dominant usually need less tuning than the ones where wait
time is dominant. On the other hand, heavy CPU usage could be caused by poor
Sql access paths or badly written SQL so one should not neglect it. In Oracle
11gR2, we saw “Merge Cartesian Join” or “Hash Join” which can be the root cause
of the high CPU usage.
In addition, the proportion
of CPU time to WAIT time always tends to decrease as the load on the system increases.
A steep increase in wait time is a sign of contention and needs to be addressed
for good scalability. A snapshot of the database workload taken periodically
throughout the day is necessary to detect such performance issues.