My Basic Knowledge: Generating an HTML or Text Oracle AWR Report

To generate an HTML or text AWR report:

1. Run the awrrpt.sql script from within SQL*Plus and logged in with a user having the DBA role:

@$ORACLE_HOME/rdbms/admin/awrrpt.sql

2. Specify whether an HTML or a text report is wanted.

Enter value for report_type: html

3. Specify the number of previous days’ snapshots that are to be listed.

Enter value for num_days: 2

4. A list displaying the snapshot ID and time the snapshot was generated will appear. Enter the beginning and

ending snapshot ID for the AWR report.

Enter value for begin_snap :

Enter value for end_snap:

5. Enter a output report name or accept the default name:

Enter value for report_name:

Using the report name awrrpt_1_150_160

6. The AWR report will be generated.

If you need to diagnose a specific issue, a snapshot can be manually created just before and then again just after

executing a questionable program. Typically, this is not necessary.

This example shows how to manually create a snapshot via SQL*Plus:

BEGIN

DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

END;

Analyzing the AWR Report

The response time analysis consists of finding out where time is being spent in a database. The

response time of a given transaction in the database always consists of the following two parts:

Response Time = Service Time + Wait Time

Where Service Time is CPU consumption by the database and Wait time is the sum of all the Wait events in the

database.

The most important part of the AWR report is the Top 5 Timed Events section. With this list, you can quickly identify

the main areas to focus on. When CPU usage is much more significant than Wait Time, it is less likely that

investigating Wait Events will produce significant savings in response time. Therefore, you should compare the time

taken by the top five timed events and direct the tuning effort to the biggest consumers.

SQL STATISTICS

• SQL Ordered by Elapsed Time: Includes SQL statements that took significant execution time during processing.

• SQL Ordered by CPU Time: Includes SQL statements that consumed significant CPU time during processing.

• SQL Ordered by Gets: These SQLs performed a high number of logical reads while retrieving data.

• SQL Ordered by Reads: These SQLs performed a high number of physical disk reads while retrieving data.

• SQL Ordered by Parse Calls: These SQLs experienced a high number of reparsing operations.

• SQL Ordered by Sharable Memory: Includes SQL statement cursors that consumed a large amount of SGA shared pool memory.

• SQL Ordered by Version Count: These SQLs have a large number of versions in shared pool for some reason.

Common wait events

The most common I/O related wait event is “db file sequential read.” It occurs on a single block read for index data or table data accessed through an index. If this wait event is high, then tune it as follows:

Find the Top SQL statements with Physical Reads in the SQL ordered by Reads section. Generate the explain plan of the SQL statements.

· If Index Range scans are involved, more blocks than necessary could be being visited if the index is unselective. By creating a more selective index, we can access the same table data by visiting fewer index blocks (and doing fewer physical I/Os).

· If indexes are fragmented, again we have to visit more blocks because there is less index data per block. In this case, rebuilding the index will compact its contents into fewer blocks.

· If the index being used has a large Clustering Factor, then more table data blocks have to be visited in order to get the rows in each Index block. By rebuilding the table with its rows sorted by the particular index columns we can reduce the Clustering Factor and hence the number of table data blocks that we have to visit for each index block. For example, if the table has columns A, B, C & D and the index is on B, D then we can export the table data in order by B,D, and reload the table.

If there is no particular SQL statement with a bad execution plan, then one of the following may be happening:

I/Os on particular datafiles may be being serviced slower due to excessive activity on their disks. In this case, looking at the File I/O Statistics section of the AWR report will help find such hot disks and spread out the I/O by manually moving datafiles to other storage or by making use of Striping, RAID and other technologies to automatically perform I/O load balancing for us.

If there is no SQL with suboptimal execution plans and I/O is evenly spread out with similar response times from all disks then a larger Buffer Cache may help. In Oracle Database 11g, the Automatic Memory Management (AMM) feature is introduced to automatically determine the size of automatic tuning of PGA and SGA with the use of two new parameters named MEMORY_MAX_TARGET and MEMORY_TARGET.

Another common I/O related wait event is “db file scattered read”. It occurs when multiblock reads from disk are performed into non-contiguous buffers in the Buffer Cache. Such reads are issued for up to the number of blocks specified by this parameter DB_FILE_MULTIBLOCK_READ_COUNT at a time. These typically happen for Full Table Scans and for Fast Full Index scans. If this wait event is high then the Top SQL statements with Physical Reads in the SQL ordered by Reads section can be investigated to see if their execution plans contain Full Table or Fast Full Index scans. In cases where such multiblock scans are necessary, it is possible to tune the size of multiblock I/Os issued by Oracle by setting the instance parameter DB_FILE_MULTIBLOCK_READ_COUNT to be as below:

DB_BLOCK_SIZE x DB_FILE_MULTIBLOCK_READ_COUNT = max_io_size of system

Starting with Oracle10g Release 2, the initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT is now

automatically tuned to use a default value of 128 when this parameter is not set explicitly. This value is platformdependent and is 1MB for most platforms. Because the parameter is expressed in blocks, it will be set to a value that is equal to the maximum I/O size that can be performed efficiently divided by the standard block size.

Another common I/O related wait event is “control file parallel write'” It occurs when Oracle is writing physical blocks to all control files and is waiting for the I/Os to complete. The details of this wait are reported in the Background Wait Events section. If system wide waits for this wait event are significant then this either indicates numerous writes to the control file (too many control files copies), or slow performance of writes to the control files.

Possible solutions can be:

Reduce the number of control file copies to the minimum that ensures that not all copies can be lost at the same time.

Enable asynchronous I/O or move the control files to faster disks.

Another popular wait event is “log file sync”. It occurs when a user session issues a COMMIT and is waiting for LGWR to finish flushing all redo from the log buffer to disk.

To understand what is delaying the log file sync, there is a need to examine some other wait events such as “LGWR wait for redo copy”, “log file parallel write”, “log file single write” and the redo statistics.

Some general tuning tips for this wait event are:

• Move all the log members to high speed disks

• Move log members to low I/O activities disk controllers

Starting with Oracle 10gR2, Oracle introduces an Asynchronous Commit. In Oracle 11g, asynchronous commit is controlled by the two initialization parameters COMMIT_LOGGING and COMMIT_WAIT to change the commit behavior. When Oracle transaction issues a commit, three things happen:

1. Oracle assigns a system change number (SCN) to the committed transaction

2. The log writer (LGWR) writes the redo information for the transaction from the redo log buffer to the redo log files on disk along with its SCN.

3. Oracle releases all locks held by the transaction and marks it as complete.

The default behavior of the Commit is to use IMMEDIATE for COMMIT_LOGGING and WAIT for COMMIT_WAIT.

o Immediate vs. Batch: The IMMEDIATE option tells LGWR to write the redo information of the committing transaction immediately to disk. If you want LGWR to buffer redo information until it’s convenient to write then you can specify option BATCH.

o Wait vs. NoWait: The WAIT option instructs LGWR to wait until the redo information is successfully

written to disk before marking the Commit as completed. Most waits on log file sync event are caused by this wait. If you don’t want LGWR to wait for the writing of the redo records and mark the Commit as completed then you can specify option NOWAIT.

As a general rule, systems where CPU time is dominant usually need less tuning than the ones where wait time is dominant. On the other hand, heavy CPU usage could be caused by poor Sql access paths or badly written SQL so one should not neglect it. In Oracle 11gR2, we saw “Merge Cartesian Join” or “Hash Join” which can be the root cause of the high CPU usage.

In addition, the proportion of CPU time to WAIT time always tends to decrease as the load on the system increases. A steep increase in wait time is a sign of contention and needs to be addressed for good scalability. A snapshot of the database workload taken periodically throughout the day is necessary to detect such performance issues.