The slow SQL database monitoring function of UAVStack and its implementation.

  Database, Monitoring

Author: Wang Linlin
Source: UAVStack intelligent operation and maintenance
Source:Yixin Institute of Technology
Technology Salon 001 |AI Middle Station: An Agile Intelligent Business Support Scheme | Yixin Technology Salon Live Online at 8pm on March 28, Click to Sign Up

UAVStack is a full-scale monitoring and application operation and maintenance platform. UAV.Monitor has monitoring functions, including basic monitoring, application/service performance monitoring, log monitoring, business monitoring, etc. In application monitoring, UAV can draw pictures according to application examples. The application instance component can paint logs, services, clients, etc. Client-based portraits are divided into Http, Dubbo, MQ, Kafka, JDBC, Redis, MongoDB, etc.

I. background

As a programmer or operation and maintenance personnel who has worked for many years, I believe you must have encountered the following situations:

Scenario 1: There is an abnormal situation in the system, and the operation and maintenance personnel failed to find it at the first time. Instead, the business party feedback the system crash and the page cannot open during the use process. Check the system log and find that the connection database has been reported abnormal.

Scene 2: After the new function has been on-line and running stably for a period of time, the response of the user feedback page becomes slower and slower, and it takes a long time to open a page. After troubleshooting, it was found that a slow SQL affected the whole function experience.

To this end, UAVStack has developed a database monitoring function. At first, the database monitoring function only collected indexes of data sources and database connection pools, and real-time database connection pool information and operation counts can be viewed through client portraits. Recently, UAVStack has unlocked a new function-slow SQL monitoring, which makes the database monitoring function more perfect.

Today, Xiao Bian will introduce to you the concrete implementation of database monitoring. The following keywords appearing in the article are all replaced by abbreviations:

Middleware Enhancement Framework: English MonitorFramework, short for MOF

Health management service: English HealthManager, abbreviated as HM

MonitorAgent: English monitor agent, MA for short

II. Key Technology &UAV Self-Research Framework

MOF Agent injection mechanism: mofagent injection mechanism is based on Java agent and java assistant technology. Java agent is responsible for intercepting and converting byte code streams. Javaassist is used for analysis and modification in the conversion process. Cut points are injected at key positions in the application server’s life cycle to provide a basis for MOF framework initialization, application portrait information and real-time monitoring data information capture.

Intercept framework framework: rewrite bytecode in the specific life cycle of application startup, implant specific logic processing code, i.e. portrait data collection, the collected data includes service portrait and client portrait; The client portrait contains common open source components such as Http, Dubbo, MQ, Kafka, JDBC, Redis, MongoDB, etc. The third-party services invoked in the system are all listed as objects of the client. For example, the interfaces that invoke the third-party system in the system belong to the category of the client.

CaptureFramework Framework: rewrite bytecode in a specific life cycle through the InterceptFramework framework and implant specific logic codes. In the embedded logic codes, data can be collected and stored through the capabilities of the Capture Framework portrait Monitor capture system. The specific implementation is to use doCapture to capture data at a specific capture point, use doPreStore to capture some data before storing the data structure, process the captured data with special data, call the specific Supporter through UAVServer after obtaining the processed data, and finally realize data landing.

Iii. components

The implementation of slow SQL monitoring is divided into four components:

Dynamic start-stop of slow SQL: the monitoring start-stop of slow SQL depends on MOF’s Global Filter mechanism. During application initialization, UAV rewrites the applied Filter and provides an interface for issuing instructions to MOF. As long as the calling interface passes in the specified parameters, the dynamic start-stop of slow SQL monitoring can be realized. There is no direct call to MOF by other services in the UAV system, which is completed through MA. You can understand MA as the medium between the service requester and MOF.

Slow SQL data acquisition: Rely on the InterceptFramework framework to rewrite bytecode and implant specific logic in specific life cycle, and at the same time use MOF’s CaptureFramework framework to capture data and generate capture results. MA will collect the generated files regularly and package them into a fixed data structure to send to MQ.

Data storage: create an independent feature in HM for data processing, consume data pushed by MA to MQ, complete data cleaning and then store to ES. Due to the special agreement of the data collection results, the data obtained from MQ cannot be directly converted into corresponding results, and the corresponding analysis processing is required for storage (due to the fact that the collected data contains many fields and may contain special characters, the analysis of the data will be affected, and the correct analysis of the data can be realized only when there are rules and constraints when generating the data results). Feature of database monitoring also provides relevant interfaces for querying and counting slow SQL operations.

Page display: the operation page can automatically start and stop database monitoring and set the time threshold of slow SQL. The setting of start-stop and time threshold depends on MA sending instructions to MOF. SQL statistics, tracking and other information displayed on the page are obtained through HM’s interface.

IV. Function Display

Currently, the functions of database monitoring include SQL classification statistics, database connection pool monitoring, slow SQL time-consuming distribution statistics, slow SQL statistics, slow SQL tracking and call chain/log correlation functions.

SQL classification statistics:

Data source: OpenTSDB (through portrait collection index)

Classification statistics for insert, delete, update, query and batch operations

According to the time distribution, the access situation of the database is displayed; according to the time distribution, the access situation of the database is displayed, and the total access count (accumulated value) of the selected time period is displayed.

You can customize the time condition to query historical data.

图片描述

Database connection pool monitoring:

Data source: OpenTSDB (through portrait collection index)

, can view the connection pool total connections, active connections, idle connections change curve
图片描述

Slow SQL Time-consuming Distribution Statistics:

Data source: ES

Slow SQL statistics can be displayed according to classification

For the time-consuming distribution statistics of slow SQL, the maximum number of queries is 100.

According to the time distribution, the slow SQL access situation of the database is displayed, and the slow SQL access time, SQL and time consumption at the current time point are displayed.

You can query historical data according to settings.

图片描述

Slow SQL statistics:

Data source: ES

For all types of SQL

According to the time distribution, slow SQL statistics of a certain period of time in the database are displayed.

You can query historical data according to settings.

图片描述

Slow SQL trace:

Data source: ES

The query criteria are: keyword, slow SQL tracking, and time range

Query the SQL trace list according to the search criteria, and the list shows the following contents: SQL statement, total execution times, total execution time, average execution time, operation-details can be viewed.

图片描述
 

Slow SQL Trace-Details View:

Data source: ES

Slow SQL details: click on a slow SQL statistic to view details: including start execution time, execution duration, participation, execution result, and number of affected items

图片描述

Slow SQL Trace-Call Chain Association:

Light call chain shall be opened in application monitoring.

Click on the detailed execution time of a line to jump to the call chain page to view the details of the call chain (the relevant call chain is highlighted)

图片描述

Slow SQL Trace-Log Association:

In application monitoring, log collection needs to be started.

, click on a line of detailed call chain content log association, can view the corresponding log information, the relevant log lines highlighted

图片描述

V. summary

Database monitoring cannot be ignored. Good database monitoring can help optimize the system and carry out real-time early warning. Through the database connection pool monitoring introduced in this paper, operation and maintenance personnel can pay attention to the status of database connection pool at any time, effectively preventing the system from being unable to connect to the database when the number of connection pool active connections is full. The slow SQL monitoring function can dynamically display the SQL situation of a system, help optimize SQL statements and make the system more stable.