Four Serialization Schemes for PHP

  msgpack, php, Serialization

Original address:https://t.ti-node.com/thread/ …

Serialization of data is a very useful function. However, many people, like me, don’t understand what this stuff is for when they first come into contact with this stuff. Anyway, the teacher said that, if they really can’t understand it, they will recite it first.

“What came ah, serialization, and deserialization. . 。” (The picture is from “My Colonel, My Regiment” by the National Army’s 2-way trafficker and 2-way actor fan Long schoolmate).

In fact, serializing data has two functions:

  • Convenient transmission
  • Convenient storage

How to understand convenient storage?For example, we have a PHP object or a PHP array that needs to be stored in a database or even a file, which is obviously impossible. At this time, we must serialize the PHP object or PHP array before executing the storage operation. However, it is understandable that PHP arrays can be stored after serialization. Can this object be stored? Is this operation too coquettish? Young man, this is not coquettish at all. In some cases, the objects are stored directly and can be put into production after simple deserialization, thus avoiding the performance cost caused by new at one time.

How to understand convenient transmission?In fact, serialization is relatively more and more common in transmission. The simplest example is that the code at the front of a code uses ajax to find you to provide TA with an API. At this time, you two have to discuss what data to return, such as json or xml, and even you two have to make a private data format by yourselves. For example, in a typical service architecture, data is transferred between the gateway server and the internal RPC server through msgpack. This is a typical application case of serialization for transmission.

The concept of serialization here may be more extensive and general, including traditional serialize, json, msgpack, protobuf, etc. (If you don’t think serialization is too strict, you can use encode instead; Deserialization is replaced by decode. Anyway, I will use serialization and deserialization to call it all. If you feel really uncomfortable, you can cut me down along the network cable! )。

In fact, from a higher level, data serialization can be divided into two types:

  • Text serialization, such as json, serialize, xml, etc
  • Binary serialization, such as msgpack, protobuf, thrift, etc.

Generally speaking, there are two performance indicators to test the serialization technology, one is the serialization speed and the other is the size of the serialized data. Naturally, the faster the serialization speed, the smaller the serialized data, the better. Currently, binary serialization such as protobuf and msgpack is better than text serialization in terms of speed and data size. But then again, text serialization is more readable, and you can see at a glance what the data content is.

Here are four specific plans brought here today. These four plans are simple and crude and ready to use out of the box. Let’s test our feelings and see which is more suitable for us.

The four friends attending the meeting: PHP’s built-in serialize, PHP’s built-in JSON parser, PHP extension JSOND, and PHP extension msgpack. The first three are of text type and msgpack is of binary type.

As an advanced version of PHP’s built-in JSON parser, JSOND has been rumored to be faster than the built-in JSON parser. As an extension, this product requires additional installation, with the attached address:https://pecl.php.net/get/json ….

Msgpack is a set of binary serialization tools developed by bird brothers and others. slogan is “It’s like JSON.but fast and small.” attached is the address:https://pecl.php.net/get/msgp …

  1. Serialize usage
    Serialize (), serialization method.
    Unserialize (), deserialize method.
  2. Json usage
    Json_encode (), is there nothing to say?
    Json_decode (), is there nothing to say?
  3. Jsond usage
    Jsond_encode (), like json_encode (), is followed by more than one letter D.
    Jsond_decode (), like json_decode (), is followed by more than one letter D.
  4. Msgpack usage
    Msgpack_pack (), serialization method.
    Msgpack_unpack (), deserialization method.

The test code is as follows:

<?php
// 故意搞了一个还算大的php数组,更容易看出差距来
$arr = array(
  array(
    'uid' => 22193123,
    'gender' => 'famale',
    'username' => 'elarity',
    'password' => md5('www123'),
    'relation' => array(
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
      array(
        'uid' => 22193123,
        'gender' => 'famale',
        'username' => 'elarity',
        'password' => md5('www123'),
      ),
    ),
  )
);

// 每种序列化方案都执行100000次
$counter = 100000;

// json序列化方案,执行100000次
echo PHP_EOL.PHP_EOL;
$start = microtime( true );
for( $i = 1; $i <= $counter; $i++ ){
  $json = json_encode( $arr ); 
}
$size = strlen( $json );
$end = microtime( true );
$cost_time = $end - $start;
echo "json_encode : 耗费时间为{$cost_time} , 数据体积为{$size}".PHP_EOL;

// jsond序列化方案,执行100000次
$start = microtime( true );
for( $i = 1; $i <= $counter; $i++ ){
  $jsond = jsond_encode( $arr ); 
}
$size = strlen( $jsond );
$end = microtime( true );
$cost_time = $end - $start;
echo "jsond_encode : 耗费时间为{$cost_time} , 数据体积为{$size}".PHP_EOL;

// serialize序列化方案,执行100000次
$start = microtime( true );
for( $i = 1; $i <= $counter; $i++ ){
  $serialize = serialize( $arr ); 
}
$size = strlen( $serialize );
$end = microtime( true );
$cost_time = $end - $start;
echo "serialize : 耗费时间为{$cost_time} , 数据体积为{$size}".PHP_EOL;

// msgpack序列化方案,执行100000次
$start = microtime( true );
for( $i = 1; $i <= $counter; $i++ ){
  $msgpack = msgpack_pack( $arr );
}
$size = strlen( $msgpack );
$end = microtime( true );
$cost_time = $end - $start;
echo "msgpack耗费时间为 : {$cost_time} , 数据体积为{$size}".PHP_EOL;
echo PHP_EOL.PHP_EOL;

Save the file as test.php and PHP test.php will execute it. The result is shown in the following figure:

To sum up:

  1. Jsond is really faster than json
  2. Json is faster than serialize () when some unruly person opens his mouth.
  3. The serialize () data volume is really large (because the data type description is still retained)
  4. Msgpack is the best? ? ? I don’t know, Ang, how do you feel

Supplement: There is a very good supplement in the commentary, pointing out some relatively one-sided contents here. You can add and watch:

“Can you look at my test:https://blog.yurunsoft.com/a/ …
I have chosen relatively many data types, including arrays with objects and small data with more data. In fact, it is meaningless to say which is better and faster. It is better to choose the right one according to the actual scene. “