PHP and unique ID generation


[Original address: https://blog.ti-node.com/blog …]

Generating unique IDs is not a trivial matter. It is not something you can settle in a sentence, nor is it as simple as calling uniqid().

Why do you want a unique ID?

1. A database auto-increment ID becomes a disaster once the data is split across databases. Suppose the data is split into two databases: each one starts counting from 1, so two users with ID 1 will appear in the system.

2. An auto-increment ID exposes the number of users or other business volume.

3. An auto-increment ID lets an interested party walk through the IDs and obtain any user’s information through the API.

What are the solutions?

1. UUID, short for Universally Unique Identifier. This is a standard proposed by the Open Software Foundation to solve the problem of generating unique identifiers in a distributed environment. A UUID is 128 bits long and is usually written as 32 hexadecimal digits grouped 8-4-4-4-12; in practice the hyphens are often stripped before the value is stored. It has a few problems worth mentioning. First, it is a mixture of letters and digits, which some traditional databases index poorly: the index is bulky and query efficiency suffers. Second, the value itself is large.

2. MongoDB ObjectId, which looks much like a UUID (it is a 12-byte value usually shown as 24 hexadecimal digits), is a built-in data type of MongoDB. If you do not specify _id when inserting data, MongoDB fills _id with an ObjectId by default, and it queries efficiently in document databases such as MongoDB.

3. Self-built solutions. To solve their own business problems, many companies have come up with their own schemes. These schemes generally need to achieve the following:

  • Guarantee global uniqueness.
  • Use a purely numeric type wherever possible, rather than a mixture of digits and letters.
  • Preserve a rough ordering and carry some meaning.
  • Be decodable to some degree, so that information about the ID can be recovered by reversing it.
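
The preference for a numeric type is largely a reaction to the size of a UUID. As a rough illustration only, the following Python sketch uses the standard `uuid` module to show the canonical 8-4-4-4-12 form, the hyphen-free form, and the bit width; the variable names are illustrative and not from the original article.

```python
import uuid

# Generate a random (version 4) UUID.
u = uuid.uuid4()

canonical = str(u)   # 8-4-4-4-12 form, hyphens included
compact = u.hex      # hyphens stripped, hexadecimal digits only

print(canonical, len(canonical))
print(compact, len(compact))

# Width of the raw value in bits, for comparison with a 64-bit integer key.
uuid_bits = len(u.bytes) * 8
print(uuid_bits)

# Lengths of the hyphen-separated groups in the canonical form.
group_lengths = [len(part) for part in canonical.split("-")]
print(group_lengths)
```

Stored as text, the compact form takes 32 characters per key, whereas a 64-bit integer takes 8 bytes, which is the gap the self-built schemes try to close.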

There are several solutions in circulation, including Twitter’s snowflake, Flickr’s database auto-increment scheme and Instagram’s database stored-procedure scheme. This article focuses on the principle behind Twitter’s snowflake.

Snowflake uses 64 bits to represent an ID. Twitter’s engineers divided it into four segments, each with its own meaning: one unused sign bit, followed by the three segments described below.

The leading 41-bit segment (after the unused sign bit) is a timestamp with millisecond precision. It can therefore represent 2^41 ≈ 2.199×10^12 milliseconds, or about 2.199×10^9 seconds, which is roughly 69.7 years. In other words, counting from January 1, 1970, the ID generator can run for about 69.7 years before the timestamp field is exhausted. Choosing a later custom epoch moves that limit out accordingly.
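
The arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes the January 1, 1970 (UTC) epoch used in the text and a 365.25-day year, and the constant names are made up for the example.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_BITS = 41

# Number of distinct millisecond values the field can hold.
capacity_ms = 2 ** TIMESTAMP_BITS
capacity_s = capacity_ms / 1000
capacity_years = capacity_s / (365.25 * 24 * 3600)

print(f"{capacity_ms} ms = {capacity_s:.3e} s = {capacity_years:.1f} years")

# Assumption: the epoch is 1970-01-01 UTC, as in the text.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
exhausted_at = epoch + timedelta(milliseconds=capacity_ms)
print(exhausted_at.isoformat())
```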

The middle 10-bit segment is the machine ID, which allows up to 2^10 = 1024 machines. You can deploy several machines and assign each one its own unique ID; numbering them from 0 to 1023, for example, gives you up to 1024 machines. Put a server such as nginx in front of the cluster as a proxy, and you have a workable ID-distribution cluster.
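
To show where the machine ID sits inside the 64-bit value, here is a minimal Python sketch of packing and unpacking the fields. It assumes the 41/10/12 layout described in this article, including the 12-bit sequence field covered in the next paragraph; `compose` and `decompose` are hypothetical helper names, not part of Twitter's code.

```python
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1         # largest machine ID (10 bits)
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1          # largest sequence value (12 bits)

MACHINE_SHIFT = SEQUENCE_BITS                    # machine ID sits above the sequence
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS   # timestamp sits above both


def compose(timestamp_ms, machine_id, sequence):
    """Pack the three fields into one integer; the top (sign) bit stays 0."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id <= MAX_MACHINE_ID:
        raise ValueError("machine id does not fit in 10 bits")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence does not fit in 12 bits")
    return (timestamp_ms << TIMESTAMP_SHIFT) | (machine_id << MACHINE_SHIFT) | sequence


def decompose(snowflake_id):
    """Reverse compose(): recover (timestamp_ms, machine_id, sequence)."""
    sequence = snowflake_id & MAX_SEQUENCE
    machine_id = (snowflake_id >> MACHINE_SHIFT) & MAX_MACHINE_ID
    timestamp_ms = snowflake_id >> TIMESTAMP_SHIFT
    return timestamp_ms, machine_id, sequence


example = compose(timestamp_ms=1_500_000_000_000, machine_id=7, sequence=42)
print(example)
print(decompose(example))
```

Being able to run `decompose` on any ID is what the "decodable" requirement in the list of goals refers to: the creation time and the issuing machine can be read back from the number itself.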

The final 12-bit segment is an auto-incrementing sequence (think of it as the equivalent of MySQL’s auto-increment ID). It counts the IDs issued within a single millisecond: it starts at 0 when a millisecond begins and runs until that millisecond ends. At most 2^12 = 4096 IDs can therefore be generated on the same machine within the same millisecond. If that number is exceeded, the generator waits for the next millisecond before issuing a new ID.
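
Putting the three segments together, a generator along these lines might look like the following Python sketch. It is an illustration under stated assumptions, not Twitter's implementation: the class name `SnowflakeSketch`, the injectable `clock` parameter, the Unix epoch, and the choice to raise an error when the clock moves backwards are all assumptions made for the example.

```python
import threading
import time

EPOCH_MS = 0                 # assumption: Unix epoch, as in the text
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeSketch:
    """Hypothetical single-process generator following the 41/10/12 layout."""

    def __init__(self, machine_id, clock=None):
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        # clock() returns the current time in whole milliseconds.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock moved backwards: refuse rather than risk duplicates.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence wrapped: this millisecond is full, so spin
                    # until the clock reaches the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                # New millisecond: restart the sequence at 0.
                self._sequence = 0
            self._last_ms = now
            return (((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                    | (self.machine_id << SEQUENCE_BITS)
                    | self._sequence)


gen = SnowflakeSketch(machine_id=1)
ids = [gen.next_id() for _ in range(10_000)]
print(len(set(ids)) == len(ids))   # checks that no ID was issued twice
print(ids == sorted(ids))          # checks that IDs increase on one machine
```

The lock keeps the last-timestamp and sequence updates atomic within one process; uniqueness across machines still depends on every machine being given a different `machine_id`.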

TIPS: Incidentally, PHP’s uniqid() function is risky. The IDs it generates are not as unique as the name suggests, and collisions are more likely than you might expect. See the user comments on the function’s manual page for details.

TIPS: A recommended snowflake-based PHP ID generator: Donkeyid.