Rules Analysis of Themis

  Database, mysql

Introduction:

Themis is a database auditing product developed by DBA team of Yixin Company, which can help DBA and developers to quickly discover database quality problems and improve work efficiency.

This platform can audit Oracle and MySQL databases in multiple dimensions (object structure, SQL text, execution plan and execution characteristics) to evaluate the design quality of object structure and SQL operation efficiency. It can help DBA and developers to quickly discover positioning problems. And provide part of the auxiliary diagnosis ability, improve the optimization work efficiency. All operations can be carried out through the WEB interface, which is simple and convenient. In addition, in order to better meet personalized needs, the platform also provides expansion capabilities, and users can expand according to their needs.

Open source address:https://github.com/CreditEaseDBA

Click to viewThemis deployment strategy

I. Rule Analysis

Rule analysis is divided into four parts: object class rule analysis, text class rule analysis, execution plan class rule analysis and statistical information class rule analysis. Each module can be performed manually or automatically.

1.1 Analysis of Object Class Rules

Manually Resolve oracle Object Class Information

Json file

{
    "module": "analysis",
    "type": "OBJ",
    "db_server": "127.0.0.1",
    "db_port": 1521,
    "username": "schema",
    "db_type": "O",
    "rule_type": "OBJ",
    "rule_status": "ON",
    "create_user": "system",
    "task_ip": "127.0.0.1",
    "task_port": 1521
}

Configure db_server, db_port, username, create_user, task_ip options. Leave the others as default. username is the name of the target object to be audited.

python command.py -m analysis_o_obj -c data/analysis_o_obj.json

Use the above command to start collecting obj data

Manually parse mysql object class data

Json file

{
    "module": "mysql",
    "type": "OBJ",
    "db_server": "127.0.0.1",
    "db_port": 3306,
    "username": "schema",
    "db_type": "mysql",
    "rule_type": "OBJ",
    "rule_status": "ON",
    "create_user": "mysqluser",
    "task_ip": "127.0.0.1",
    "task_port": 3306
}

Configure db_server, db_port, username, create_user, task_ip, db_port options, and leave the others as default.

Run command:

python command.py -m analysis_m_obj -c data/analysis_m_obj.json

Oracle and mysql object class rules do not need to depend on the collected data. They are directly connected to the database for query. As some libraries are large and may take a long time, it is recommended to carry out them in the low peak period of business.

1.2 Text Class Rule Analysis

Resolve oracle Text Class Rules Manually

Json file

{
    "module": "analysis",
    "type": "TEXT",
    "username": "schema",
    "create_user": "SYSTEM",
    "db_type": "O",
    "sid": "cedb",
    "rule_type": "TEXT",
    "rule_status": "ON",
    "hostname": "127.0.0.1",
    "task_ip": "127.0.0.1",
    "task_port": 1521,
    "startdate": "2017-02-23",
    "stopdate": "2017-02-23"
}

Configure sid, username, create_user, task_ip, hostname, startdate, stopdate options. Since data is collected on a daily basis, only startdate and stopdate are supported to be consistent for the time being, hostname and task_ip can be consistent, and the others can remain the default.

Rule parsing can be performed by executing the following command:

python command.py -m analysis_o_plan -c data/analysis_o_plan.json

Manually parse mysql text class rules

Json file

    "module": "analysis",
    "type": "TEXT",
    "hostname_max": "127.0.0.1:3306",
    "username": "schema",
    "create_user": "mysqluser",
    "db_type": "mysql",
    "rule_type": "TEXT",
    "rule_status": "ON",
    "task_ip": "127.0.0.1",
    "task_port": 3306,
    "startdate": "2017-02-21 00:00:00",
    "stopdate": "2017-02-22 23:59:00"
}

Configure the username, create_user, taskip, taskport, hostname, hostname_max, startdate, stopdate options. hostname and taskip can remain the same, while others can remain the default.

Rule resolution can be performed by running the following command:

python command.py -m analysis_m_text -c data/analysis_m_text.json

Username in the above two steps is the object to be audited.

1.3 Analysis of Implementation Plan Class Rules

Oracle plan type rule resolution

Json file

 {
        "module": "analysis",
        "type": "SQLPLAN",
        "capture_date": "2017-02-23",
        "username": "schema",
        "create_user": "SYSTEM",
        "sid": "cedb",
        "db_type": "O",
        "rule_type": "SQLPLAN",
        "rule_status": "ON",
        "task_ip": "127.0.0.1",
        "task_port": 1521
 }

Mainly configure capture _ date, username, create _ user, sid, db _ type, rule_type, task _ IP, task _ port parameters. type is divided into four types: SQLPLAN,SQLSTAT,TEXT,OBJ. the type of rule _ type is the same as SQLPLAN. Only one represents the type of module and the other represents the type of rule. db_type is divided into “O” and “mysql”, representing oracle and mysql respectively. capture_date is the capture date of the data that we owe a flat configuration.

python command.py -m analysis -c data/analysis_o_plan.json

Run the above command to generate the analysis result.

Mysql plan Rule Analysis

Json file

  {
        "module": "analysis",
        "type": "SQLPLAN",
        "hostname_max": "127.0.0.1:3306",
        "db_server": "127.0.0.1",
        "db_port": 3306,
        "username": "schema",
        "db_type": "mysql",
        "rule_status": "ON",
        "create_user": "mysqluser",
        "task_ip": "127.0.0.1",
        "rule_type": "SQLPLAN",
        "task_port": 3306,
        "startdate": "2017-02-21 00:00:00",
        "stopdate": "2017-02-22 23:59:00"
    }

Type has the same meaning as oracle above. hostname_max is mysql’s ip: port number. each hostname_max represents a mysql instance. startdate and stopdate need to add hours, minutes and seconds, which is different from oracle.

python command.py -m analysis -c data/analysis_m_plan.json

Then run the above command to parse mysql’s plan rules.

1.4 Perform Feature Class Rule Analysis

Oracle stat type rule resolution

Json file

{
    "module": "analysis",
    "type": "SQLSTAT",
    "capture_date": "2017-02-23",
    "username": "schema",
    "create_user": "SYSTEM",
    "sid": "cedb",
    "db_type": "O",
    "rule_type": "SQLSTAT",
    "rule_status": "ON",
    "task_ip": "127.0.0.1",
    "task_port": 1521
}

Configure sid, username, create_user, task_ip, capture_date options, and leave others as default.

Run command:

python command.py -m analysis_o_stat -c data/analysis_o_stat.json

Carry out data collection.

Mysql stat Type Rule Resolution

Json

{
    "module": "analysis",
    "type": "SQLSTAT",
    "hostname_max": "127.0.0.1:3306",
    "db_server": "127.0.0.1",
    "db_port": 3306,
    "username": "schema",
    "db_type": "mysql",
    "rule_status": "ON",
    "create_user": "mysqluser",
    "task_ip": "127.0.0.1",
    "rule_type": "SQLSTAT",
    "task_port": 3306,
    "startdate": "2017-02-21 00:00:00",
    "stopdate": "2017-02-22 23:59:00"
}

Configure username, create_user, task_ip, task_port, hostname, hostname_max, startdate, stopdate options. hostname and task_ip can remain the same, while others can remain the default.

Run command:

python command.py -m analysis_m_text -c data/analysis_m_text.json

Carry out data collection.

1.5 Automatic Rule Analysis

The manual rule parsing described above can be tested or used under some special circumstances. In most cases, we will use automatic rule parsing.

Automatic rule parsing is done by using Celly. For the use of Celly, please refer tohttp://docs.celeryproject.org ….

Here are some common commands about celery:

开启规则解析
celery -A task_other worker -E -Q sqlreview_analysis -l info
开启任务导出
celery -A task_exports worker -E -l info
开启obj信息抓取
celery -A task_capture worker -E -Q sqlreview_obj -l debug -B -n celery-capture-obj
开启flower
celery flower --address=0.0.0.0 --broker=redis://:password@127.0.0.1:6379/
开启plan、stat、text抓取
celery -A task_capture worker -E -Q sqlreview_other -l info -B -n celery-capture-other

Finally, we will add rule analysis to supervisor management, then generate tasks through the web interface, then use celery to schedule, and view the execution status of tasks through flower.

Please refer to the supervisor’s configuration for specific usage.

II. Description of Built-in Rules

The core of the platform is the rules. Rules are the definition and implementation of a set of filtering conditions. The richness of the rule set represents the capability of the platform. The platform also provides extensibility, allowing users to define their own rules. Judging from the classification, the rules can be roughly divided into several categories.

2.1 Rule Classification

  • According to the database type, rules can be divided into Oracle and MySQL. Not all rules distinguish between databases, and rules for text classes do not.
  • According to complexity, rules can be divided into simple rules and complex rules. The simplicity and complexity here actually refer to the implementation of rule auditing. Simple rules are a set of query statements that can be described as mongodb or relational databases. However, complex rules need to be implemented externally through the program body.
  • From the perspective of auditing objects, rules can be divided into object classes, text classes, execution plan classes and execution feature classes.

2.2 Rule Parameters

Rules can contain parameters. For example, one of the execution plan rules is a large table scan. Here, the definition of large tables needs to be defined by parameters, which can be specified by physical size.

2.3 Rule Weight and Threshold

  • The weight, representing violation of rules, is deducted several points at a time. It can be adjusted according to its own situation.
  • Threshold value, which represents the upper limit of deduction for violation of rules. The main purpose here is to avoid violating too many single rules and ignoring other rules.

Rule weights and deduction points will eventually accumulate into a total deduction point, and the platform will convert according to the percentile system. In this way, can play a quantitative role.

2.4 Rule _ Object Class (Oracle Section)

2.5 Rule _ Object Class (MySQL Section)

2.6 Rule _ Execution Plan Class (Oracle Section)

2.7 Rules _ Execution Plan Class (MySQL Section)

2.8 Rule _ Execution Feature Class (Oracle Section)

2.9 Rule _ Execution Feature Class (MySQL Section)

2.10 Rule _ Text Class

III. Joining in Development

There are problems can be directly inhttps://github.com/CreditEase …Put forward.

This article is selected from:wikihttps://tuteng.gitbooks.io/th ….
Due to the length, please click on the link to view the original text.

Source:Yixin Institute of Technology