Programmer Notes | Comprehensive Analysis of Oracle Waiting Event Classification, Discovery and Optimization

I. Origin of Waiting Events

Everyone may be a little strange, why wait for the event, first talked about the index system. In fact, it is the development of the indicator system that leads to the introduction of waiting events. To sum up, Oracle’s index system has roughly gone through the following three stages:

  • With hit rate as the main reference index

Taking various hit rates as the main basis for optimizing entries, the common ones are “library cache hit radio” and so on. However, this method has great disadvantages. A system with a hit rate of 99% is not necessarily better than a 95% system. In the old Oracle version, this method was often used, such as 8i, 9i, etc.

  • Taking waiting events as the main reference index

Based on various waiting events as the basis for optimizing entries, the common ones are “db file sequential read” and so on. Can be more intuitive understanding, in a period of time, the database mainly experienced those waiting. These “bottlenecks” are often the starting points for our optimization. In versions 10g and 11g, it is widely used.

  • Taking the time model as the main reference index

Taking the overall consumption of various resources as the basis for optimizing the entrance. The consumption of the database over a period of time can be understood from an overall perspective. It is more general than waiting for events. Common such as “DB Time”. Oracle is continuously strengthening its work in this area.

From the above three stages, it can be seen that waiting for the introduction of the event is precisely to solve the many drawbacks with hit rate as the index. Compared with the later time model, waiting for events to observe Oracle’s behavior in a more intuitive and fine-grained way is often an important entrance to optimization. However, the time model focuses more on the overall and systematic understanding of the running status of the database. The two have different emphases.

II. Classification of Waiting Events

Let’s start with the classification of waiting events and understand waiting events. Judging from the large classification,Waiting events can be divided into two parts: idle and non-idle. In non-idle waiting events, it can be further divided into fine categories.

The following methods can be used to observe the number and approximate classification of waiting events included in the system (the following statements are run in 11g environment).

The wait event with WAIT_CLASS “Idle” is idle, and the others are all non-idle wait events.

1. Distinguish between idle and non-idle waiting events

Idle wait event, refers to Oracle is waiting for some kind of work, such as after logging in with sqlplus, but did not issue any further commands, at this time the session is in SQL*Net message from/to client waiting for the event state, waiting for the user to issue commands, any diagnosis and optimization of the database, generally do not need to pay too much attention to this part of the event.

Non-idle wait events, specifically for Oracle activities, refers to the database tasks or applications in the process of waiting, these waiting events should be concerned and studied when adjusting the database.

2. Wait for event classification instructions

  • Management class -Administrative

Such waiting events are caused by DBA’s administrative commands that require users to be in a waiting state (e.g., rebuilding indexes).

  • Application class -Application

This type of wait event is caused by the code of the user application (for example, lock wait).

  • Cluster class -Cluster

Such wait events are related to resources that actually apply cluster RAC (for example, gc cr block busy wait events).

  • Submit confirmation class -Commit

This type of wait event contains only one wait event-after executing a commit command, waiting for a redo log write acknowledgement (that is, log file sync).

  • Concurrency class -Concurrency

Such wait events are caused by internal database resources (such as latches).

  • Configuration class -Configuration

Such waiting events are caused by improper configuration of the database or instance (for example, the redo log file size is too small, the size of the shared pool, etc.).

  • Idle class -Idle

This type of wait event means that the session is inactive and waiting for work (for example, sql * net messages from client).

  • Network class -Network

Some waiting events related to the network environment (such as sql* net more data to dblink).

  • Other classes -Other

Such waiting events are usually rare (e.g. wait for EMON to spawn).

  • Scheduling class -Scheduler

Such waiting events are related to resource management (e.g. resmgr: become active’).

  • System I/O class -System I/O

The passage of such waiting events is caused by I/O operations of the background process (such as DBWR waiting for -db file paralle write).

  • User I/O class -User I/O

Such waiting events are usually caused by user I/O operations (such as db file sequential read).

III. Understanding Waiting Events

Each waiting event indicates an active state of the database. From the above query, we can see that there are many waiting events built into the system, and each waiting event can be known through the data dictionary V$EVENT_NAME. The following is a description of the most common waiting event.

This waiting event “db file sequential read”, literally translated as “data file sequential read”, is a waiting event belonging to the “User I/O” class. It is usually a read operation related to a single data block. In most cases, reading an index block or reading a data block through an index will record this wait. This event indicates a large amount of waiting on a single data block, and the high value is usually due to poor join order between tables or the use of non-selective indexes. DB_CACHE_SIZE can determine the frequency of this event by associating this wait with other known problems in statspack reports (such as inefficient sql), by checking to ensure that index scanning is necessary, and by ensuring that the connection order of multi-table connections is adjusted.

The wait event contains three parameters, namely:

  • File#: represents the absolute file number of the file oracle will read
  • Block#: the starting block number of data to be read from this file
  • Blocks: number of blocks read. Usually 1, indicating a single block read.

Through the above parameters, the associated data dictionary can determine the objects that have waiting events (i.e. hot objects have been found). Then, according to different situations, targeted solutions are carried out.

The more you know about waiting events, the deeper you can understand the operation mechanism of the database, thus improving the overall optimization capability. Later, I will introduce some common waiting events.

Four, observe the waiting events

Some views are built into the system. Through these views, the occurrence of waiting events at the overall (system level) and local (session level) level and the classification statistics of various types of events can be known. The following is a description of some major views.

1、v$event_name

Waiting events supported by the system can view information such as the category of waiting events and the meaning of parameters.

2、v$system_wait_class

displays the instance-wide time totals for each registered wait class.

Statistics of waiting event category (system level). From this view, it can be seen from a global perspective that the system has more waiting for such operations.

3、v$system_event

Statistics of waiting events (system level). To expand, it provides a summary of each waiting event since the instance was started. It is often used to obtain historical images of system waiting information. However, by obtaining the increment of waiting items through two snapshot, the waiting items of the system during this period can be determined.

The main fields include:

  • TOTAL_WAITS

The total number of waits for this wait event since the database was started.

  • TIME_WAITED

The total wait time (in hundredths of a second) for this wait event. This data represents the sum of the waiting events in all sessions (including sessions that have ended and are maintaining a connected state) since the database was started.

  • AVERAGE_WAIT

The average wait time (in hundredths of a second) for this wait event.

time_waited/total_waits

  • TOTAL_TIMEOUTS

The total wait timeout for this wait event.

SQL-View Top Level Events by Waiting Time

4、v$session_event

Similar to v$system\_event, it records the cumulative value of each waiting event in a session’s life cycle. Compared with the former, session\_id information has been added. This information is also accumulated into v$system_event at the same time. Note that when a session is reestablished, the statistics will be set to 0.

5、v$session_wait、v$session

Resource or event information that the active session is waiting for. This view was merged with the v$session view in 10g. This is a key view to find performance bottlenecks. It provides what session is currently waiting for in the database under any circumstances. When the system has performance problems, this view can be used as a starting point to indicate the direction to explore the problem.

It should be noted that when waiting no longer exists, the history of waiting that occurred earlier in the session will also disappear, making post-diagnosis very difficult. V$SESSION\_EVENT provides accumulated but not very detailed data. Historical information can be obtained through the historical view v$session_wait_history.

The main fields include:

  • EVENT

The event the session is currently waiting for, or the last waiting event.

  • WAIT_TIME

The time (in hundredths of a second) a session waits for an event.

Value > 0: last wait time (unit: 10ms), currently not in wait state.

Value =0: session is waiting for the current event.

Value =-1: The last waiting time is less than 1 statistical unit and is not currently in the waiting state.

Value =-2: time statistics status is not set to available and is not currently in waiting status.

  • STATE

Wait status (provides explanations for wait_time and second_in_wait fields)

1) waiting:

SESSION is waiting for this event.

2) waited unknown time:

Time information cannot be obtained because timed_statistics value is set to false. It means there was a wait, but the time was short.

3)wait short time:

Indicates that a wait has occurred, but there is no record because the time is very short and does not exceed one time unit.

4)waited knnow time:

If session waits and then gets the required resources, it will enter this state from waiting.

  • WAIT_TIME/SECOND_IN_WAIT

The Wait_time and Second_in_wait field values are related to state.

1)state=waiting

Wait_time is useless, and the second_in_wait value is the actual wait time (in seconds).

2)state=wait unknow time

Wait_time and second_in_wait are useless.

3)state=wait short time

Wait_time and second_in_wait are useless.

4)state=waiting known time

The wait_time value is the actual waiting time (in seconds), and the second_in_wait value is useless.

6、v$session_wait_history

Record the history of the last n waiting events of the session, namely v$session_wait. The default is 10 records, which can be modified.

7、v$event_histogram

This view records the histogram distribution of waiting events so that you can have a better understanding of the specific distribution of a waiting event. What is recorded in the v$session\_event or v$system_event view is cumulative information and average value of waiting, and the time consumed by individual waiting cannot be known.

The following is a summary of the relationship between session waiting events and views:

  • Only one wait event occurs at a time in a session. If you see other waiting events, it only means that waiting has taken place on the next time slice. There is only one wait at a time.
  • The wait\_time and second\_in\_wait fields in v$session\_wait are in seconds, while the time_waited and average_wait fields in v$session_event are in hundredths of a second.
  • After the waiting event of v$session\_wait ends, the statistics of v$session_event will change.
  • The statistical information of v$session_wait is of little significance because the information changes in real time.
  • When the waiting event in v$session\_wait ends, the value of the second _ in _ wait field in v$session_wait is copied to the time \ _ wait field in v$session\_event, and the average_time field in v$session_event view is also modified.

V. Common Waiting Events

Oracle has a lot of waiting events, and different versions also have some differences. Some common waiting events are explained below. I hope it can help us in our daily work.

1、buffer busy waits

Cause of occurrence:

When a session reads data blocks from disk to memory, it needs to find free memory space in memory to store these data blocks. This wait occurs when there is no free space in memory. In addition, there is another case where the session needs to construct a pre-image of the data block at a certain time when doing consistent reading. At this time, it is necessary to apply for memory blocks to store these newly constructed data blocks. If such memory blocks cannot be found in memory, this waiting event will also occur.

Parameter meaning:

  • File#

The id number of the file where the data block is waiting to be accessed.

  • Blocks

Block number waiting to be accessed

  • Id

Before 10g, this value indicates the reason for waiting for an event; After 10g, it indicates the category of waiting events.

Optimization direction: The optimization direction is different depending on the type of waiting event.

  • data block

The general optimization direction is to optimize SQL and reduce logical reading and physical reading. Or reduce the size of stored data in a single block.

  • Data segment header

The general optimization direction is to add FREELISTS and FREELIST GROUPS. Ensure that the gap between FCTFREE and PCTUSED is not too small, thus minimizing FREELIST’s block loops.

  • Revocation block

The general optimization direction is the application program, and the data object is used for peak shifting.

  • Revoke paragraph header

If the database system manages UNDO segments, there is generally no need for intervention. If it is managed by itself, the number of transactions per rollback segment can be reduced.

2、buffer latch

Cause of occurrence:

The storage location of data blocks in memory is recorded in a Hash list. When a session needs to access a data block, it first searches the Hash list, obtains the address of the data block from the list, and then accesses the required data block through the address. oracle will use a latch to protect the integrity of the list. When a session needs to access this list, it needs to obtain a latch. Only in this way can it be guaranteed that this list will not change during the browsing of this session. If the list is too long, the session will take too long to search this list, leaving other sessions waiting. The same data block is frequently accessed, which is what we usually call the hot block problem.

Parameter meaning:

  • latch addr

The virtual address of latch requested by the session in SGA.

  • chain#

The index value in the bufferchainhash list. When the value of this parameter is equal to 0xffffff, it indicates that the current session is waiting for an lrullatch.

Optimization direction:

The optimization direction that can be considered is to create more buffer chains by using multiple buffer pool or to increase the number of latch by using the parameter db_block_lru_latches, so that more sessions can obtain latch. The two methods can be used simultaneously.

3、db file sequential read

Cause of occurrence:

It is usually a read operation related to a single data block. In most cases, reading an index block or reading a data block through an index will record this wait. It may indicate that the tables are not connected in a good order or indexed without selection. For systems with a large number of transactions and good adjustments, this value is mostly normal, but in some cases, it may indicate problems in the system. This wait statistic should be linked to known problems in performance reports, such as inefficient SQL. Check index scans to ensure that each scan is necessary and check the join order of multi-table joins.

DB_CACHE_SIZE is also the determining factor for the frequency of these waits. Problem Hash-area connections should appear in PGA memory, but they also consume a lot of memory, resulting in a lot of waiting when reading sequentially. They may also take the form of direct path read/write waits.

Parameter meaning:

  • file#

Represents the absolute file number of the file oracle is reading

  • block#

The starting block number to read from this file

  • blocks

Number of block read. Usually 1, indicating a single block read.

Optimization direction:

This waiting event does not necessarily mean that there must be a problem. If it can be determined that there is a problem, the following optimization ideas can be followed.

  • Modify the application to avoid sql with large IO or reduce its frequency.
  • Increase data buffer to improve hit rate.
  • Better disk subsystem is adopted to reduce response time of single IO and prevent physical bottleneck.

4、db file scattered read

Cause of occurrence:

This is a waiting event caused by a user operation. When a user issues a SQL operation that requires reading multiple data blocks for each I/O, this waiting event will occur. The two most common cases are full table scanning and index fast scanning. Scattered in this name may lead many people to think that it reads data blocks in a scattered way. In fact, on the contrary, when such waiting events occur, SQL reads data blocks sequentially, such as FTS or IFFS. In fact, scattered here refers to how the read data blocks are stored in memory. After they are read into the memory, they are stored in the memory in a decentralized manner, rather than continuously.

Parameter meaning:

  • file#

Represents the absolute file number of the file oracle is reading.

  • block#

The starting block number to read from this file.

  • blocks

Number of block read.

Optimization direction:

This situation usually shows the wait associated with a full table scan. When full table scans are limited to memory, they rarely enter consecutive buffers but are scattered throughout the buffer memory. If this number is large, it means that the table cannot find indexes or only a limited number of indexes can be found. Although full table scanning may be more effective than index scanning under certain conditions, if such waiting occurs, it is better to check whether these full table scans are necessary.

5、direct path read

Cause of occurrence:

This waiting event occurs when the session reads the data block directly into the PGA instead of SGA. The read data is usually private data of the session, so it is not necessary to put it into SGA as shared data, because it is meaningless to do so. These data usually come from temporary segments, such as SQL sorting data in a session, data generated in the middle of parallel execution, and sorting data generated by Hash join and Merge join. Because these data are only meaningful to SQL operations in the current session, they do not need to be put into SGA. When a direct path read wait event occurs, it means that there is a large amount of temporary data generated on the disk, such as sorting, parallel execution and other operations, or it means that there is insufficient free space in the PGA.

In 11g, a full table scan may use the direct path read method to bypass buffer cache. such a full table scan is a physical read. In 10g, they are all read through gc buffer, so there is no direct path read problem.

Parameter meaning:

  • file#

document number

  • first block#

Start block number read

  • block count

Number of physical blocks read continuously starting from first block

Optimization direction:

With this waiting event, several situations need to be distinguished. One direction is to increase the sorting area and other means, and the other direction is to reduce the amount of IO read or judge whether it is more efficient to read through the buffer area.

6、direct path write

Cause of occurrence:

This occurs when oracle writes data directly from PGA to data files or temporary files, which can bypass SGA. Most common in disk sorting. In this case, the data file with the most frequent operation (if it is sorted, it is likely to be temporary file) should be found to spread the load.

Parameter meaning:

  • file#

document number

  • first block#

Start block number read

  • block count

The number of physical blocks written consecutively starting from the first block.

Optimization Direction: Reduce IO Write Size.

7、library cache lock

Cause of occurrence:

This waiting event occurs when different users compete for resources in the shared pool due to concurrent operations on the same database object. For example, when a user is doing DDL operations on a table, if other users want to access the table, a library cache lock wait event will occur, which cannot continue until DDL operations are completed.

Parameter meaning:

  • Handle address

The address of the loaded object.

  • Lock address

The address of the lock.

  • Mode

The data fragment of the loaded object.

  • Namespace

The name of the namespace of the loaded object in the v$db_object_cache view.

Optimization direction: The optimization direction is to view locked objects and reduce contention.

8、library cache pin

Cause of occurrence:

This wait event, like library cache lock, is a wait event caused by concurrent operations occurring in the shared pool. Generally speaking, if oracle wants to recompile some objects such as pl/sql or views, it needs to pin these objects into the shared pool. If this object is held by another object at this time, a librarycoache pin wait will be generated.

Parameter meaning:

  • Handle address

The address of the loaded object.

  • Lock address

The address of the lock.

  • Mode

The data fragment of the loaded object.

  • Namespace

The name of the namespace of the loaded object in the v$db_object_cache view.

Optimization direction: The optimization direction is to view locked objects and reduce contention.

9、log file sync

Cause of occurrence:

This is a waiting event caused by user session behavior. When a session issues a commit command, the LGWR process writes the redo log generated by the transaction from the log buffer to the disk to ensure that the information submitted by the user is safely recorded in the database. After the session issues a commit instruction, it needs to wait for LGWR to successfully write the redo generated by this transaction to the disk before proceeding with subsequent operations. This waiting event is called log file sync. When there are a large number of log file sync waiting events in the system, it should be checked whether there are users in the database doing frequent submitting operations. This wait event usually occurs on OLTP systems. There are many small transactions in OLTP system. If these transactions are committed frequently, they may cause a large number of log file sync waiting events.

Optimization direction:

  • Improve LGWR performance and use fast disks as much as possible
  • Use bulk submission
  • Appropriate use of options such as nologging/unrecoverable

10、SQL*Net message from client

Cause of occurrence:

Indicates that the foreground server process is waiting for the customer to respond. This waiting event is caused by waiting for the response of the user process. It does not indicate that there is anything abnormal in the database. This wait time will often occur if the network fails.

11、SQL*Net message to client

Cause of occurrence:

This waiting event occurs when the server sends a message to the client. When the server sends a message to the client to wait, the possible reason is that the client is too busy to receive the message from the server in time, or the network problem causes the message not to be sent from the server to the client.

Author: feng han

Source:Yixin Institute of Technology(http://college.creditease.cn/)