Open source | Why use the ns4_gear_idgen ID generator?


Introduction: Yixin officially open-sourced the nextsystem4 (hereinafter “NS4”) series of modules on March 29, 2019. The NS4 series is a distributed business system solution built in response to the problems of the existing payment system, which was cumbersome, tightly coupled and costly to maintain. The NS4 framework supports building complex processes and business flows, and business service nodes can be chained together and deployed in a distributed manner. It is simple and lightweight, and it runs without a container (it does not depend on Tomcat, Jetty or similar). Its design principle is to separate business from logic: with simple configuration and business implementations, developers can build a business system with complex logic, high performance and stable behavior.

The NS4 series includes four open source modules: ns4_frame, the distributed service framework (for details, see: Open Source | ns4_frame Distributed Service Framework Development Guide); ns4_gear_idgen, the ID generator component (also a demo of the NS4 framework); ns4_gear_watchdog, the monitoring component (service guarding, application performance monitoring, data collection and automatic alerting); and ns4_chatbot, the communication component.

Among them, ns4_gear_idgen (the ID generator) is built on the ns4_frame framework. It supports distributed deployment and generates globally unique IDs whose length, prefix, suffix, step size and number base can all be configured freely to suit each business. The NS4 framework can also be tried out through ns4_gear_idgen. This article focuses on the advantages of the ns4_gear_idgen scheme.

Project open source address: https://github.com/newsettle/ …

I. Introduction

In complex systems, a meaningful, ordered serial number is often needed as a globally unique ID to identify large volumes of data (such as orders and accounts).

II. Brief Introduction of Industry Schemes

2.1 Timestamp scheme

Use the current time in milliseconds or microseconds as the ID, for example System.currentTimeMillis().

Advantages

  • IDs are generated locally with no remote call, so latency is low.
  • The generated IDs are trend-increasing.
  • The ID is an integer, so queries are efficient once it is indexed.

Disadvantages

  • Under high concurrency, duplicate IDs are generated, because requests that arrive within the same millisecond receive the same value.
  • It cannot be made highly available; there is a single point of failure.
  • It is not flexible enough to isolate IDs between different services.
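The duplicate-ID problem is easy to reproduce. The following is a minimal sketch, not code from ns4_gear_idgen: the class and method names are made up for illustration. It requests many IDs back to back from a generator that simply returns the current millisecond and counts how many distinct values come out.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; not part of ns4_gear_idgen.
class TimestampIdDemo {
    // Naive generator: the ID is simply the current millisecond.
    static long nextId() {
        return System.currentTimeMillis();
    }

    // Requests `count` IDs back to back and returns how many were distinct.
    static int distinctIds(int count) {
        Set<Long> seen = new HashSet<>();
        for (int i = 0; i < count; i++) {
            seen.add(nextId());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int count = 100_000;
        int distinct = distinctIds(count);
        // Many calls land in the same millisecond, so distinct is far below count.
        System.out.println(count + " IDs requested, " + distinct + " distinct");
    }
}
```

The exact count of distinct values depends on the machine, but it is bounded by the number of milliseconds the loop runs for, so most of the requested IDs collide.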

2.2 UUID scheme

A UUID (Universally Unique Identifier) in its standard form consists of 32 hexadecimal digits, split by hyphens into five groups in an 8-4-4-4-12 pattern, for 36 characters in total. Example: 550e8400-e29b-41d4-a716-446655440000. To date, the industry has five ways of generating UUIDs; for details, see the IETF specification "A Universally Unique IDentifier (UUID) URN Namespace" (RFC 4122).
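The format described above can be checked with the JDK's own java.util.UUID class. This is only an illustrative sketch; the demo class name is hypothetical.

```java
import java.util.Arrays;
import java.util.UUID;

// Hypothetical demo class; it only uses the standard java.util.UUID API.
class UuidFormatDemo {
    // Returns the lengths of the hyphen-separated groups of a UUID string.
    static int[] segmentLengths(String uuid) {
        String[] parts = uuid.split("-");
        int[] lengths = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            lengths[i] = parts[i].length();
        }
        return lengths;
    }

    public static void main(String[] args) {
        UUID random = UUID.randomUUID(); // randomly generated (version 4) UUID
        String text = random.toString();
        System.out.println(text + " length=" + text.length()
                + " version=" + random.version());
        System.out.println(Arrays.toString(segmentLengths(text))); // [8, 4, 4, 4, 12]
    }
}
```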

Advantages

  • Very high performance: locally generated, no network consumption.

Disadvantages

  • Hard to store: a UUID is long (128 bits, i.e. 16 bytes) and is usually stored as a 36-character string, which makes it unsuitable in many scenarios.
  • Information leakage: algorithms that generate UUIDs from the MAC address can expose that address; this weakness was used to trace the author of the Melissa virus.
  • Unordered: UUIDs have no ordering among them.
  • Using a UUID as a primary key causes problems in certain environments. For example, it is a poor fit for a database primary key:

    • A: MySQL officially recommends that the primary key be as short as possible [4]; a 36-character UUID does not meet this recommendation.

    All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index. If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key.

    • B: It is bad for MySQL indexes. When a UUID is the primary key under the InnoDB engine, its lack of order can cause rows to be relocated frequently, which seriously hurts performance.

2.3 Snowflake scheme

Broadly speaking, this family of schemes generates IDs by partitioning a namespace (UUID also belongs to it, but is analyzed separately because it is so common). A 64-bit value is divided into several segments that separately encode the machine, the time and so on. The 64 bits in snowflake are allocated as shown in the following figure (picture from the network):

The 41-bit timestamp can cover (1L<<41)/(1000L*3600*24*365)=69 years, and the 10-bit machine field can identify 1024 machines. If IDs need to be partitioned by IDC (data center), the 10 bits can be split into 5 bits for the IDC and 5 bits for the worker machine, giving 32 IDCs with 32 machines each. The split can be defined to suit your own needs.

The 12-bit auto-increment sequence can represent 2^12 = 4096 IDs per millisecond. In theory, the snowflake scheme can therefore issue about 4.096 million IDs per second on each machine, and the IDs generated by any machine in any IDC within any millisecond are all different.
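The capacity figures for the 41/10/12-bit split can be reproduced with a few lines of arithmetic. The class and method names below are hypothetical and serve only to check the numbers.

```java
// Hypothetical helper that reproduces the capacity arithmetic of the bit layout.
class SnowflakeCapacity {
    static final int TIMESTAMP_BITS = 41;
    static final int MACHINE_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    // Whole years of millisecond timestamps that fit in 41 bits.
    static long yearsOfTimestamps() {
        long millisPerYear = 1000L * 3600 * 24 * 365;
        return (1L << TIMESTAMP_BITS) / millisPerYear;
    }

    // Number of distinct machine IDs in 10 bits.
    static long machines() {
        return 1L << MACHINE_BITS;
    }

    // Number of sequence values available within one millisecond.
    static long idsPerMillisecond() {
        return 1L << SEQUENCE_BITS;
    }

    // Theoretical upper bound of IDs per second on one machine.
    static long idsPerSecondPerMachine() {
        return idsPerMillisecond() * 1000;
    }

    public static void main(String[] args) {
        System.out.println(yearsOfTimestamps());      // 69
        System.out.println(machines());               // 1024
        System.out.println(idsPerMillisecond());      // 4096
        System.out.println(idsPerSecondPerMachine()); // 4096000
    }
}
```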

Advantages

  • The millisecond timestamp occupies the high bits and the auto-increment sequence the low bits, so IDs are trend-increasing overall.
  • It does not depend on third-party systems such as a database and is deployed as a service, which gives higher stability and very high ID generation performance.
  • The bits can be allocated to suit the characteristics of each business, which is very flexible.

Disadvantages

  • Strong dependence on the machine clock: if the clock is set back, duplicate IDs may be issued or the service may become unavailable.
  • It is not flexible enough to isolate IDs between different services.
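To make the bit layout and the clock dependence concrete, here is a minimal sketch of a snowflake-style generator using the 41/10/12-bit split described above. It is an illustration under assumptions, not a reference implementation: the class name, the custom epoch and the choice to throw an exception when the clock moves backwards are all made up for this example.

```java
// Hypothetical snowflake-style generator: 41-bit time, 10-bit machine, 12-bit sequence.
class SnowflakeSketch {
    static final long MACHINE_BITS = 10;
    static final long SEQUENCE_BITS = 12;
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    // Assumed custom epoch (2019-01-01T00:00:00Z); any fixed past instant works.
    static final long EPOCH = 1546300800000L;

    private final long machineId;
    private long lastMillis = -1;
    private long sequence = 0;

    SnowflakeSketch(long machineId) {
        if (machineId < 0 || machineId >= (1L << MACHINE_BITS)) {
            throw new IllegalArgumentException("machineId out of range: " + machineId);
        }
        this.machineId = machineId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMillis) {
            // Clock rollback: refusing to issue IDs avoids duplicates
            // at the cost of availability.
            throw new IllegalStateException(
                    "clock moved backwards by " + (lastMillis - now) + " ms");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 values of this millisecond are used: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastMillis) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return ((now - EPOCH) << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    // Extracts the machine field from an ID produced by nextId().
    static long machineIdOf(long id) {
        return (id >> SEQUENCE_BITS) & ((1L << MACHINE_BITS) - 1);
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(5);
        long id = generator.nextId();
        System.out.println(id + " issued by machine " + machineIdOf(id));
    }
}
```

The rollback branch shows the trade-off named in the list above: a generator either rejects requests until the clock catches up (unavailability) or risks issuing a value it has already issued.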

2.4 Database auto_increment scheme

Take MySQL as an example. Set auto_increment_increment and auto_increment_offset so that the field increments automatically. Each business then reads and writes MySQL with the following SQL to obtain an ID:

begin;
REPLACE INTO Tickets64 (stub) VALUES ('a');
SELECT LAST_INSERT_ID();
commit;

Advantages

  • Very simple: it is implemented with features of the existing database system, costs little, and is maintained professionally by DBAs.
  • IDs increase monotonically, which suits services with special requirements on IDs.

Disadvantages

  • Strong dependence on the DB: when the DB fails, the whole system becomes unavailable, which is fatal.
  • Configuring master-slave replication improves availability as far as possible, but data consistency is hard to guarantee in some situations; an inconsistency during a master-slave switchover may cause the same ID to be issued twice.
  • ID issuing throughput is capped by the read/write performance of a single MySQL instance.

2.5 Redis ID generation

Redis executes all commands on a single thread and provides atomic increment commands such as INCR and INCRBY, which guarantees that the generated IDs are unique and ordered.

To get past the performance limit of a single node, a Redis cluster can be used for higher throughput. Suppose a cluster has 5 Redis nodes: initialize them with the values 1, 2, 3, 4 and 5 respectively and use a step size of 5. The IDs generated by each node are then:

A:1, 6, 11, 16, 21

B:2, 7, 12, 17, 22

C:3, 8, 13, 18, 23

D:4, 9, 14, 19, 24

E:5, 10, 15, 20, 25
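The interleaving above can be simulated without a Redis server. In the sketch below each AtomicLong stands in for the counter key on one node, and addAndGet plays the role of an atomic increment by the step size; the class is hypothetical and does not use any Redis client API. Each counter is seeded one step below its initial value so that the first increment returns 1, 2, 3, 4 or 5, matching the sequences listed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for N Redis nodes that share one step size.
class SteppedCounters {
    private final AtomicLong[] nodes;
    private final int step;

    SteppedCounters(int nodeCount) {
        this.step = nodeCount;
        this.nodes = new AtomicLong[nodeCount];
        for (int i = 0; i < nodeCount; i++) {
            // Seed one step below the node's initial value (i + 1).
            nodes[i] = new AtomicLong(i + 1 - step);
        }
    }

    // Atomic increment by the step size on the chosen node.
    long nextId(int node) {
        return nodes[node].addAndGet(step);
    }

    // Convenience helper: the next `count` IDs from one node.
    List<Long> take(int node, int count) {
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ids.add(nextId(node));
        }
        return ids;
    }

    public static void main(String[] args) {
        SteppedCounters cluster = new SteppedCounters(5);
        System.out.println("A:" + cluster.take(0, 5)); // A:[1, 6, 11, 16, 21]
        System.out.println("E:" + cluster.take(4, 5)); // E:[5, 10, 15, 20, 25]
    }
}
```

Because the step equals the node count, adding a sixth node later would require changing the step on every existing node, which is the expansion difficulty that comes with this scheme.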

Advantages

  • It does not depend on a database, is flexible and convenient, and outperforms a database.
  • Numeric IDs sort naturally, which helps with pagination and with results that need sorting.
  • Using a cluster avoids a single point of failure.

Disadvantages

  • If the system does not already use Redis, a new component must be introduced, which increases system complexity.
  • Coding and configuration take a relatively large amount of work.
  • The step size and initial values must be fixed in advance and are hard to change when scaling out.

2.6 ns4_gear_idgen scheme

First look at the database table design:

Field description:

  • id: database primary key; it has no business meaning.
  • key_name: distinguishes businesses. Each business uses its own key_name, and the IDs of different key_name values are isolated and do not affect each other. If performance later requires scaling out the database, no complex expansion operation is needed; it is enough to split databases and tables by key_name.
  • key_value: the maximum value of the ID segment currently allocated to the key_name.
  • key_length: length of the generated ID.
  • key_cache: the length of the segment allocated each time. Previously every ID required a database write; now, with key_cache set large enough (for example 1000), the database is read and written again only after 1,000 IDs have been consumed. Database access drops from once per ID to once per key_cache IDs, that is, to 1/key_cache of the original frequency.
  • key_prefix: prefix of the generated ID; a custom prefix plus a date part can be configured, for example ${date14} or TEST${date14}.

    • The date part of the ID prefix supports the following date formats:

  • key_suffix: suffix of the generated ID; optional.
  • key_digit: number base of the ID; base 10, base 36 and base 62 are supported.
  • version: version number of the record, used when updating the record.
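The roles of key_value and key_cache can be illustrated with a small in-memory simulation. This is a sketch under assumptions and not the ns4_gear_idgen source: KeyRow is a hypothetical stand-in for one row of the table, and allocateSegment() plays the part of the database update that advances key_value by key_cache.

```java
// Hypothetical simulation of segment allocation driven by key_value and key_cache.
class SegmentAllocatorSketch {
    // Stand-in for one database row (one key_name).
    static class KeyRow {
        long keyValue;      // key_value: max ID of the last allocated segment
        final int keyCache; // key_cache: number of IDs per segment
        int writes = 0;     // counts simulated database writes

        KeyRow(long keyValue, int keyCache) {
            this.keyValue = keyValue;
            this.keyCache = keyCache;
        }

        // Simulates "advance key_value by key_cache and read it back".
        synchronized long allocateSegment() {
            writes++;
            keyValue += keyCache;
            return keyValue;
        }
    }

    private final KeyRow row;
    private long next = 1; // next ID to hand out from the cached segment
    private long max = 0;  // last ID of the cached segment

    SegmentAllocatorSketch(KeyRow row) {
        this.row = row;
    }

    synchronized long nextId() {
        if (next > max) {
            // Cached segment exhausted: fetch a new one from the "database".
            max = row.allocateSegment();
            next = max - row.keyCache + 1;
        }
        return next++;
    }

    public static void main(String[] args) {
        KeyRow row = new KeyRow(0, 1000);
        SegmentAllocatorSketch service = new SegmentAllocatorSketch(row);
        long last = 0;
        for (int i = 0; i < 2500; i++) {
            last = service.nextId();
        }
        // 2500 IDs cost only 3 simulated database writes.
        System.out.println("last=" + last + " writes=" + row.writes);
    }
}
```

A second service instance that shares the same row would receive the next untouched segment, which is how several deployed instances can issue IDs for one key_name without overlapping.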

Advantages

  • Scales out linearly with little effort and supports most business scenarios.
  • ID generation rules are varied and configurable; base 10, base 36 and base 62 are supported.
  • IDs of different services are isolated and do not affect each other.
  • Obtaining an ID does not require frequent database access; the database is touched only when the IDs in the current segment are nearly used up, which reduces database load.
  • The next segment is initialized in advance, before the current one is exhausted, so that callers are not affected by initialization happening after the IDs run out.
  • key_value can be set to any starting value, which makes it easy to migrate a business from its original ID scheme.
  • High disaster tolerance: segments are cached inside the service, so even if the DB goes down the service can keep serving requests for a short time.
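As an illustration of what base 36 and base 62 output means, the sketch below converts a numeric ID into a shorter string and back. The digit alphabet (digits, then upper-case, then lower-case letters) and the class name are assumptions made for this example; the alphabet actually used by ns4_gear_idgen is not stated in this article.

```java
// Hypothetical radix encoder; the alphabet order is an assumption.
class RadixEncoder {
    static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Encodes a non-negative value in the given base (2 to 62).
    static String encode(long value, int radix) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        if (radix < 2 || radix > ALPHABET.length()) {
            throw new IllegalArgumentException("unsupported radix: " + radix);
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder digits = new StringBuilder();
        while (value > 0) {
            digits.append(ALPHABET.charAt((int) (value % radix)));
            value /= radix;
        }
        return digits.reverse().toString();
    }

    // Decodes a string produced by encode() with the same radix.
    static long decode(String text, int radix) {
        long value = 0;
        for (char c : text.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0 || digit >= radix) {
                throw new IllegalArgumentException("invalid digit: " + c);
            }
            value = value * radix + digit;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(encode(66310, 10)); // 66310
        System.out.println(encode(66310, 36)); // 1F5Y
        System.out.println(encode(66310, 62)); // HFW
    }
}
```

A higher base yields a shorter string for the same numeric value, which is useful when the ID length is constrained.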

III. Introduction of Functions

The ID generator is built on the NS4 framework and supports distributed deployment. The length, prefix, suffix and step size of the generated IDs can be configured freely to suit each business.

Its functions fall into the following parts:

  • Get a single ID of type Long, for example 66310.
  • Get a batch of IDs of type String, for example: 1901112321266312, 1901112321266313, 1901112321266314, 1901112321266315

IV. Request Mode

V. SQL scripts

See gear_key.sql in the ns4_gear_idgen source code.

Source of content: Yixin Institute of Technology