Yixin-Operation and Maintenance-Database |SQL Optimization: An Article Clarifies the Correct Use Posture of Oracle Hint

  Database, oracle

I. Hint Overview

1. Why is Hint introduced?

Hint is a unique function in Oracle database, and is often used in many DBA optimizations. Then why would Oracle consider introducing an optimizer? Cost-based optimizer is very smart. In most cases, it will choose the right optimizer to reduce the burden of DBA.

However, sometimes it is also clever and mistakenly chosen a very poor execution plan, making the execution of a certain statement extremely slow. At this time, DBA is required to intervene artificially and tell the optimizer to use the specified access path or connection type to generate the execution plan, so as to make the statement run efficiently. Hint is a mechanism provided by Oracle to tell the optimizer to generate an execution plan in the way it is told.

2. Don’t rely too much on Hint

In case of poor SQL execution plan, priority should be given to statistical information and other issues, instead of directly adding Hint. If the statistical information is correct, it should be considered whether the physical structure is reasonable, that is, there is no suitable index. Hint is only considered when SQL cannot be executed according to the optimized execution plan.

After all, using Hint requires the application system to modify the code. Hint can only solve the problem of one SQL, and due to changes in data distribution or other reasons (such as index renaming), SQL will have another performance problem.

3. The disadvantages of Hint

  • Hint is a relatively “violent” solution, not very elegant. Requires developers to manually modify the code.
  • Hint will not adapt to the new changes. For example, significant changes have taken place in data structure and data scale, but the statement using Hint is to sense the changes and produce a better execution plan.
  • Hint may have some differences or even be abandoned with the changes of database version. At this point, the statement itself is imperceptible and must be tested and corrected manually.

4. Relationship between Hint and Annotation

Hint is an extended function provided by Oracle in order not to destroy the compatibility of SQL statements with other database engines. Oracle decided to add the prompt as a special comment. Its particularity is that the prompt must be followed by DELETE, INSERT, UPDATE or MERGE keywords.

In other words, hints cannot be added anywhere in SQL statements like normal comments. And the first character after the comment delimiter must be a plus sign. It will be explained in detail later in the usage section.

5. Hint function

Hint provides rich functions and can flexibly adjust the execution process of statements. Through Hint, we can adjust:

  • Optimizer type
  • Optimizer optimization objective
  • Data Reading Method (Access Path)
  • Query conversion type
  • The order of association between tables
  • Type of association between tables
  • Parallel characteristic
  • Other characteristics

Hint usage

1. Grammar

1) Keyword Description

  • DELETE, INSERT, SELECT, and UPDATE are keywords that identify the beginning of a statement block. Comments containing prompts can only appear after these keywords, otherwise the prompts are invalid.
  • The “+”sign indicates that the comment is a prompt, and the plus sign must immediately follow the “/*”, with no space in the middle.
  • Hint is one of the specific prompts described below. If there are multiple prompts, each prompt needs to be separated by one or more spaces.
  • Text is another explanatory text describing hint

2) Error in Prompt

The syntax error in the prompt will not be reported as an error. If the parser cannot parse it, it will be treated as a normal annotation. This is also a point that is easy to cause confusion. Is the Hint actually effective? Some measures can be adopted to check the validity of the prompt. It should be noted that those hints with correct syntax but incorrect references will not be reported.

  • explain plan + dbms_xplan

Use the note option in dbms_xplan output.

  • 10132 event

In 10g, the output document generated by this event has a portion of content at the end that is dedicated to hints. Through it, two aspects can be checked: first, each hint used will be listed. If there is one missing, it means that the prompt has not been recognized. The second is to check whether there is some information indicating that there is a prompt error (if there is an error, err value will be greater than 0).

3) Objects in Prompt

SELECT /+ INDEX(table_name index_name)/ …

  • Table_name must be written, and if the alias of the table is used in the query, the alias of the table should be used in hint instead of the table name.
  • Index_name may not need to be written, Oracle will select an index according to the statistics.
  • If the index name or table name is written incorrectly, this hint will be ignored.

If the specified object is a view, you need to specify it this way. /*+hint view.table …*/, where table is the table in view.

A very common mistake, when using prompts, the most common mistake is related to the alias of the table. The correct rule is that when tables are used in prompts, aliases should be used instead of table names as long as the tables have aliases.

2. Scope of Prompt

  • Query block

Initialization parameter hints work for the entire SQL statement, while other hints only work for query blocks. Prompts that only work on a single query block must be specified within the query block it controls.

  • Exception-Global Prompt

You can use dot numbers to refer to objects contained in other query blocks (assuming these blocks are named). The syntax of global prompt can support more than two levels of references. Objects must be separated by dot numbers.

  • Named query block

Since subqueries in the where clause are unnamed, their objects cannot be referenced by global prompts. In order to solve this problem, 10g uses another method to solve-named query block. The query optimizer can generate a query block name for each query, and can also manually name each query block using the prompt qb_name. Most hints can be specified by parameters to be valid in that query block.

* Use @ to refer to a query block in the prompt.

3. Prompt Data Dictionary

Oracle has provided a data dictionary-V $ SQL _ HINT in the 11g release. Through this data dictionary, you can see the appearance version of prompts, summary data version, SQL features and opposite prompts, etc.

  • INVERSE

This hint is the opposite of hint.

  • VERSION

Representing the officially released version of this hint.

Hint classification

1. Related to optimizer

When you are not satisfied with the basic execution plan made by the optimizer for a statement, the best way is to convert the optimizer’s mode through prompts and observe the converted results to see if the desired level has been reached. If a very good execution plan can be obtained only by changing the optimizer’s mode, there is no need to use more complicated prompts.

  • OPT_PARAM

The function of this prompt is to make us specify a system parameter value in a statement.

  • ALL_ROWS

In order to realize the overall optimization of query statements, the optimizer is guided to make an execution plan with the least cost. This prompt will enable the optimizer to choose a path that can retrieve all query rows as quickly as possible, at the cost of being slow in retrieving one row of data.

  • FIRST_ROWS

In order to obtain the best response time, guide the optimizer to make an execution plan with the lowest cost. This prompt will enable the optimizer to choose the path that can retrieve the data in the first row (or specified row) of the query as quickly as possible, at the cost of slow retrieval of many rows. The number of rows optimized by FIRST_ROWS, with a default value of 1, is between 10 and 1000. This new method using FIRST_ROWS(n) is completely cost-based. It is sensitive to N. If the value of N is very small, CBO will generate a plan containing nested loops and index searches. If n is large, CBO generates a plan consisting of hash joins and full table scans (similar to ALL_ROWS).

  • CHOOSE

Whether to use RBO or CBO is decided according to the presence or absence of statistical information of tables used in SQL. In CHOOSE mode, if the statistical information of the table can be referenced, it will be executed according to ALL_ROWS mode. Unless all tables in the query have not been analyzed, the choose hint uses cost-based optimization for the entire query. If one table in a multi-table join has been analyzed, the entire query will be optimized based on cost.

  • RULE

The rule-based optimizer is used to realize optimal execution, that is, the optimizer is guided to determine the execution order of the indexes or operators used in the query criteria according to the priority rule to formulate the execution plan. This prompt forces oracle to use a predefined set of rules first, rather than counting data; At the same time, the prompt will also prevent the statement from using other prompts except DRIVING_SITE and ORDERED (both prompts can be used regardless of whether rule-based optimization is performed or not).

2, and access path related

  • FULL

Tell the optimizer to access the data by full table scanning. This prompt scans only the specified table, not all the tables in the query. FULL hints can improve performance. This is mainly because it changes the drive table in the query, not because of full table scanning. You must also use the FULL prompt when using some other prompts. Only when the entire table is accessed can the table be CACHEd using cache hints. Some hints in parallel groups must also use full table scanning.

  • CLUSTER

The bootstrap optimizer reads data from the index table by scanning the clustered index.

  • HASH

Boot the optimizer to read data from the table by hash scanning.

  • INDEX

Tells the optimizer to access data by indexing the specified table. When accessing data will result in incomplete result sets, the optimizer will ignore this Hint.

  • NO_INDEX

Tells the optimizer that indexes are not allowed on the specified table. This prompt prevents the optimizer from using the specified index. You can disable indexes in many queries before deleting unnecessary indexes. If NO_INDEX is used but no index is specified, a full table scan is performed. If NO_INDEX and conflicting hints (such as INDEX) are used for an index at the same time, both hints will be ignored.

  • INDEX_ASC

When reading data from a table using an index, the bootstrap optimizer scans the index column values of the index specified in the hint using the range in ascending order.

  • INDEX_COMBINE

Tell the optimizer to force bitmap index selection. This hint causes the optimizer to merge multiple bitmap INDEXes on the table instead of selecting the best one (this is the purpose of the index hint). You can also use INDEX_combine to specify a single index (this hint takes precedence over the index hint for a specified bitmap index). For B-tree indexes, the AND_EQUAL prompt can be used instead of this prompt.

  • INDEX_JOIN

Index association, when there are indexes on the columns referenced in the predicate, the data can be accessed through index association. This hint can merge different indexes of the same table, so that only these indexes need to be accessed, thus saving the time to query back to the table. However, the prompt can only be used in a cost-based optimizer. This hint not only allows access to only the index on the table, which can scan fewer code blocks, but it is also 5 times faster than using the index and scanning the entire table through rowid.

  • INDEX_DESC

When reading data from a table using an index, the bootstrap optimizer scans the index column values of the index specified in the hint in descending order using the range.

  • INDEX_FFS

Tell the optimizer to access the data in the way of INDEX FFS(index fast full scan). The INDEX_FFS prompt performs a fast global scan of the index. This prompt only accesses the index, not the corresponding table. This prompt is only used when all the information that the query needs to retrieve is on the index. Especially when the table has many columns, using this hint can greatly improve performance.

  • INDEX_SS

Force index skip scan to access the index. When some predicate conditions are not in the first column of the federated index in a federated index (or when the predicate is not in the first column of the federated index), the index skip scan can be used to access the index to obtain data. This method is more efficient than full table scanning when the unique value of the first column of the federated index is very small.

3. Related to query transformation

  • USE_CONCAT

The query sentences connected with multiple OR or IN operators are decomposed into multiple single query sentences, and the optimized query paths are selected for each single query sentence, and then these optimized query paths are combined together to realize the optimization purpose of the whole query sentence. This prompt can only be used if the driver query criteria includes OR.

  • NO_EXPAND

Guide the optimizer not to make a combined execution plan for conditions that use the OR operator (or the IN operator). This is exactly the opposite of USE_CONCAT.

  • REWRITE

In order to improve the execution speed, materialized views can be created in advance when the objects connected to the tables are tables with large amount of data or the results processed by statistical functions need to be obtained. When a user requests to query a query statement, the optimizer will choose a more effective method to read data from the table or from materialized views. This execution method is called query rewriting. Use REWRITE prompts to guide the optimizer to execute in this way.

  • MERGE

In order to read data from views or nested views in an optimal way, the base table data used by views is directly read by transforming query statements. This process is called view merging. The specific types of use are also different in different situations. This prompt is mainly used when views are not merged. Especially for more complex views or nested views (such as views using GROUP BY or DISTINC), using this prompt can sometimes achieve very good results.

  • UNNEST

Prompts the optimizer to convert the subquery into a connection. That is, guide the optimizer to merge the subquery and the main query and convert them to connection types.

  • NO_UNNEST

Guide the optimizer to make the subquery FILTER with the peripheral query after the subquery can be executed independently.

  • PUSH_PRED

Use this prompt to push query criteria other than views or nested views into the view.

  • NO_PUSH_PRED

Use this prompt to ensure that query criteria other than views or nested views are not pushed inside the view.

  • PUSH_SUBQ

Use this hint to guide the optimizer to make an execution plan for subqueries that cannot be merged. After a sub-query that cannot be merged is executed preferentially, the execution result of the sub-query will play the role of a provider that reduces the query scope of the main query data. Usually, subqueries play the role of inspectors when subquery merging cannot be performed, so subqueries are usually executed last. If a subquery that cannot be merged has fewer result rows, or if the subquery can reduce the query scope of the main query, the prompt can be used to guide the optimizer to execute the subquery in front to the greatest extent to improve the execution speed. However, if the subquery is executing a remote table or part of the join results of a sort merge join, the prompt will have no effect.

  • NO_PUSH_SUBQ

Use this hint to guide the optimizer to put subqueries that cannot be merged into the final execution. When subqueries cannot reduce the query scope of the main query or the execution of subqueries is expensive, putting such subqueries in the final execution can improve the overall execution efficiency to some extent. In other words, use other query criteria as much as possible to minimize the scope of the query before executing the subquery.

4, and table connection order related

These tips can adjust the order of table joins. These hints are not the only ones that can be used to adjust the order of table joins. In nested loop joins, hints can also be used to guide the optimizer to use indexes created by driving query criteria. However, this method is more effective only when the order of the indexes and tables used are adjusted at the same time. In general, these tips are mainly used when performing multi-table join and join order between tables is relatively chaotic, and are also used when ordering merge join or hash join to guide the optimizer to preferentially execute tables with relatively small amount of data.

  • LEADING

In a query associated with multiple tables, this Hint specifies which table is the driving table, that is, it tells the optimizer to access the data on that table first. The bootstrap optimizer uses the table specified by LEADING as the first table in the table join order. This prompt is not related to the order of the tables described in FROM, nor is it different FROM the ORDERED prompt that is used to adjust the join order of the tables, and the order of the tables described in FROM does not need to be adjusted when using this prompt. When the prompt is used simultaneously with the ORDERED prompt, the prompt is ignored.

This prompt is similar to the ORDERED prompt, which allows you to specify the table that drives the query, and then the optimizer determines the next table to access. If you use this prompt to specify multiple tables, you can ignore this prompt.

  • ORDERED

The bootstrap optimizer performs joins in the order of the tables described in FROM. If used with the LEADING hint, the LEADING hint will be ignored. Since ORDERED can only adjust the order of table joining and cannot change the way of table joining, USE_NL, USE_MERGE prompts are often used together with ORDERED prompts in order to change the way of table joining.

5. Related to Table Join Operation

  • USE_NL

Use this hint to guide the optimizer to perform table joins in nested loop joins. It just points out the way the tables are joined and has no effect on the order of joining the tables.

  • USE_MERGE

Directs the optimizer to perform the connection according to the sort merge connection method. If necessary, it is recommended to use this prompt with the ORDERED prompt. Hints are usually used to obtain the best throughput for queries. Assuming that the two tables are joined together, the row sets returned from each table will be sorted and then merged (i.e. merged sort) to form the final result set. Since each row is sorted before merging, the retrieval of all rows in a given query will be the fastest. If you need to return to the first row as quickly as possible, you should use the USE_NL prompt.

  • USE_HASH

This prompt guides the optimizer to perform the connection according to the hash connection method. When performing hash join, if hash join can be realized in memory because the table on one side is relatively small, then very good execution speed can be obtained. Because in most cases the optimizer will determine Build Input and Prove Input through analysis of statistical information, it is not recommended to use the ORDERED prompt to arbitrarily change the join order of tables. However, this prompt can be used when the optimizer fails to make a correct judgment or when there is no statistical information like the result set obtained from nested views.

6, and parallel related

  • PARALLEL

Specifies the parallelism of SQL execution, which will override the parallelism set by the table itself. If this value is default, CBO uses the system parameter. This prompt is used when reading large amounts of data from a table and performing DML operations to specify parallel operations for SQL.

In general, you need to specify the number of parallel threads to be used in this prompt. If the number of parallelism is not specified in the prompt, the optimizer will automatically calculate using the value specified by the PARALLEL_THREADS_PER_CPU parameter. If PARALLEL is specified when defining the table, the optimizer will select parallel operations according to the specified level of parallelism even if the prompt is not used when parallel operations can be used.

However, if you want to use parallel operations in DML operations such as DELETE, INSERT, UPDATE, MERGE, you must set ALTER SESSION ENABLE PARALLEL DML in the session. The parallelism level set in a session can also be referenced in internal GROUP BY or sort operations. If a restriction element appears in parallel operation, the prompt will be ignored.

  • NOPARALLEL/NO_PARALLEL

Parallelism is prohibited in SQL statements. In some versions, NO_PARALLEL hint is used instead of NOPARALLEL hint.

  • PQ_DISTRIBUTE

In order to improve the execution speed of parallel connection, this prompt is used to define which method is used to allocate data rows of each connection table between master and slave processes (e.g. producer process and consumer process).

  • PARALLEL_INDEX

This hint is used to scan the index range of the partitioned index in parallel operation, and the number of processes can be specified.

7. Other Relevant

  • APPEND

Let the database load the data into the warehouse in a direct load mode. This prompt does not check whether there is currently any block space required for insertion, instead it directly adds data to the new block. This wastes space, but improves insertion performance. It should be noted that the data will be stored at a location above the HWM.

  • APPEND_VALUES

In 11.2, Oracle added APPEND_VALUES hint so that INSERT INTO VALUES statement can also be inserted using direct path.

  • CACHE

After the full table scan, the data block will remain at the most active end of the LRU list. If the CACHE attribute of the table is set, its function is the same as HINT. This prompt caches all table scans in memory. If the table is large, it will take up a lot of memory. Therefore, it is suitable for smaller tables frequently accessed by users.

  • NOCACHE

The boot optimizer caches the data blocks obtained by full table scanning in the last position of the LRU list, so that these data blocks in the database instance cache can be preferentially cleared. This is the optimizer’s default method of managing data blocks in Buffer Cache (only for full table scans).

  • QB_NAME

The prompt is used to name the query statement block, and the name of the query statement block can be directly used in other query statement blocks.

  • DRIVING_SITE

This hint is useful in distributed database operations. Specifies that the table is where the connection is processed. The amount of information processed through the network can be limited. In addition, local views of remote tables can be established to limit rows retrieved from remote sites. The local view should have a where clause so that the view can restrict the rows returned from the remote database before sending them back to the local database.

  • DYNAMIC_SAMPLING

Prompts the level of dynamic sampling during SQL execution. This level is 0~10, which will override the system default dynamic sampling level. The higher the grade, the higher the accuracy of the obtained statistical information. The function of this prompt is to ensure that the dynamic sampling principle is applied to a single SQL.

  • AND_EQUAL

This hint causes the optimizer to merge multiple INDEXes on the table instead of selecting the best of them (this is the purpose of the index hint). This hint is different from the previous INDEX_JOIN hint, so the specified merge index needs to access the table later, while the INDEX_JOIN hint only needs to access the index. If you find that you need this prompt frequently, you may need to delete these individual indexes and use a combined index instead. You need to include all index columns in the query criteria, and then get the rowid list obtained in each index. Then merge join these objects, filter out the same rowid, and then go to the table to obtain data or directly obtain data from the index. In 10g, and_equal has been abandoned and can only take effect through hint.

  • CARDINALITY

The optimizer is provided with a prediction base value for the whole or part of a certain query statement, and an execution plan is made for the query statement by referring to the base value. If the name of the table is not specified in the prompt, the base value will be treated as the number of final result rows obtained from the query statement.

IV. Hint Use Examples

The following is an example to illustrate the use of prompts and under what circumstances prompts will be ignored.

1, build a table

2. Use INDEX Tips

* In some cases, if CBO thinks Hint will lead to wrong results, Hint will ignore it. In this example, because the ID field may be empty and the index is empty, the use of the index by count(*) will result in wrong results, so full table scanning is used, ignoring Hint.

3. use INDEX prompt (non-empty field)

*ID field cannot be empty, so COUNT can be processed by index scanning, Hint is effective.

Author: feng han

Source:Yixin Institute of Technology(http://college.creditease.cn/)