Compared with HTTP/1, HTTP/2 greatly improves the performance of web pages; simply upgrading to this protocol eliminates much of the performance optimization work that used to be necessary. Of course, compatibility problems and the need for graceful degradation are among the reasons it is not yet widely used in China.
Although HTTP/2 improves the performance of web pages, it is not perfect. HTTP/3 was introduced to solve some of the problems remaining in HTTP/2.
I. HTTP protocol
HTTP stands for HyperText Transfer Protocol, the most widely used network protocol on the Internet; all WWW documents must comply with this standard. HTTP/1.0 arrived together with computer networks and the browser. HTTP sits at the application layer of the network stack and runs on top of TCP, so both the bottlenecks of the HTTP protocol and the techniques for optimizing it derive from the characteristics of TCP itself, such as the three-way handshake for establishing a connection, the four-way handshake for closing it, and the RTT delay incurred by each connection establishment.
II. Defects in HTTP/1.x
- Connections cannot be reused: each request goes through a TCP three-way handshake and slow start. The impact of the handshake is obvious in high-latency scenarios, while slow start hits large numbers of small-file requests hardest (such requests often finish before the congestion window reaches its maximum).
- In HTTP/1.0, a new connection had to be established for every transmission, increasing latency.
- Although HTTP/1.1 added keep-alive, which allows some connections to be reused, multiple connections must still be established when domain sharding is used, consuming resources and putting performance pressure on the server.
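The cost of not reusing connections can be illustrated with some back-of-the-envelope arithmetic (the RTT value and request count below are made-up numbers for illustration):

```python
# Rough latency model: each new TCP connection costs one extra RTT
# for the three-way handshake before the request/response itself
# (modeled here as one more RTT).
RTT_MS = 50        # assumed round-trip time
REQUESTS = 10      # assumed number of small requests to the same host

# HTTP/1.0 style: a fresh connection per request.
no_reuse = REQUESTS * (RTT_MS + RTT_MS)   # handshake + request/response

# HTTP/1.1 keep-alive: one handshake, then reuse the connection.
keep_alive = RTT_MS + REQUESTS * RTT_MS

print(no_reuse)    # 1000 ms
print(keep_alive)  # 550 ms
```

Even in this crude model, keep-alive nearly halves the total latency; the gap widens as the RTT grows.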
- Head-of-line blocking (HOLB): bandwidth is underutilized and subsequent healthy requests are blocked. HOLB means that a series of packets is held up because the first packet is blocked. When a page needs to request many resources and the maximum number of concurrent requests has been reached, the remaining resources must wait for earlier requests to complete before they can be issued.
- HTTP/1.0: the next request cannot be issued until the previous one returns. Request and response are strictly sequential, so if one request takes a long time to return, all subsequent requests are blocked.
- HTTP/1.1: tries to solve this with pipelining, i.e. the browser can issue multiple requests at once (same domain name, same TCP connection). However, pipelining requires the responses to come back in order: if an earlier request is time-consuming (e.g. processing a large image), later requests must wait their turn even if the server has already processed them. So pipelining only partially solved HOLB.
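The head-of-line blocking that pipelining leaves in place can be sketched with a tiny simulation (processing times are made-up milliseconds; the model assumes the server handles requests concurrently but must deliver responses in request order):

```python
# Simulate HTTP/1.1 pipelining: the server may finish responses out of
# order, but must deliver them in request order, so one slow response
# delays everything queued behind it.
def delivery_times(processing_ms):
    done = 0
    out = []
    for t in processing_ms:
        # A response cannot be delivered before every earlier
        # response on the same connection has gone out.
        done = max(done, t)
        out.append(done)
    return out

# Request 1 is a slow large image; 2 and 3 finished long ago on the
# server, yet they are still delivered at the 300 ms mark.
print(delivery_times([300, 20, 30]))  # [300, 300, 300]
```

With independent connections (or true multiplexing), the second and third responses would have arrived at 20 ms and 30 ms.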
As shown in the figure above, the requests circled in red were suspended for a while because the number of connections to that domain had reached the limit.
- High protocol overhead: HTTP/1.x headers carry a lot of content, which increases the transmission cost, and the headers barely change between requests; on mobile in particular this wastes user traffic.
- Poor security: HTTP/1.x transmits everything in plain text, and the client and server cannot verify each other's identity, so data security cannot be guaranteed.
III. The SPDY Protocol
Because of HTTP/1.x's problems, we introduced sprite sheets, inlined small images, used multiple domain names, and so on to improve performance. These optimizations, however, all worked around the protocol. That changed in 2009, when Google unveiled its own SPDY protocol, aimed chiefly at HTTP/1.1's inefficiency. With SPDY, Google formally revamped the HTTP protocol itself: lower latency, header compression, and more. SPDY's deployment proved the effectiveness of these optimizations and ultimately led to the birth of HTTP/2.
After the SPDY protocol proved feasible in the Chrome browser, it was taken as the basis of HTTP/2, and its main features were inherited by HTTP/2.
IV. Introduction to HTTP/2
In 2015, HTTP/2 was released. HTTP/2 is a replacement for the current HTTP protocol (HTTP/1.x), but not a rewrite: HTTP methods, status codes, and semantics are the same as in HTTP/1.x. HTTP/2 is based on SPDY3 and focuses on performance; one of its biggest goals is to use only a single connection between the user and the website.
HTTP/2 consists of two specifications:
- Hypertext Transfer Protocol version 2 – RFC7540
- HPACK – Header Compression for HTTP/2 – RFC7541
V. New Features of HTTP/2
1. Binary transmission
HTTP/2 transmits data in a binary format instead of the text format of HTTP/1.x, and a binary protocol is more efficient to parse. HTTP/1 request and response messages consist of a start line, headers, and an optional entity body, with the parts separated by text line breaks. HTTP/2 instead divides request and response data into smaller frames, which are binary-encoded.
Next we will introduce several important concepts:
- Stream: a virtual channel within a connection that can carry bidirectional messages; each stream has a unique integer identifier (1, 2, ..., n).
- Message: A logical HTTP message, such as a request, response, etc., consisting of one or more frames.
- Frame: the smallest unit of HTTP/2 communication. Each frame contains a frame header, which at minimum identifies the stream the frame belongs to, and carries a specific type of data, such as HTTP headers or payload.
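As a concrete illustration, the fixed 9-byte HTTP/2 frame header can be parsed like this (the sample frame bytes are made up; the layout follows RFC 7540, section 4.1):

```python
# Parse the fixed 9-byte HTTP/2 frame header (RFC 7540, section 4.1):
# 24-bit payload length, 8-bit type, 8-bit flags, then one reserved
# bit followed by a 31-bit stream identifier.
def parse_frame_header(header: bytes):
    length = int.from_bytes(header[0:3], "big")
    frame_type = header[3]
    flags = header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id

# A HEADERS frame (type 0x1) with the END_HEADERS flag (0x4) on
# stream 1, carrying a 16-byte payload.
raw = (16).to_bytes(3, "big") + bytes([0x1, 0x4]) + (1).to_bytes(4, "big")
print(parse_frame_header(raw))  # (16, 1, 4, 1)
```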
In HTTP/2, all communication under the same domain name is completed over a single connection, which can carry any number of bidirectional data streams. Each data stream is sent as a message, which in turn consists of one or more frames. Multiple frames can be sent out of order and reassembled according to the stream identifier in each frame header.
2. Multiplexing

HTTP/2 introduces multiplexing. Multiplexing neatly solves the browser's limit on the number of requests per domain, and also makes it easier to reach full transmission speed, since every new TCP connection must ramp up its transfer rate slowly.
Through this link you can get an intuitive feel for just how much faster HTTP/2 is than HTTP/1.
In HTTP/2, thanks to binary framing, multi-stream parallelism no longer depends on multiple TCP connections. In HTTP/2:
- All communications under the same domain name are completed on a single connection.
- A single connection can carry any number of two-way data streams.
- Each data stream is sent as a message, which in turn consists of one or more frames. Multiple frames can be sent out of order, because they can be reassembled according to the stream identifier in the frame header.
This feature greatly improves performance:
- The same domain name only needs to occupy one TCP connection, and uses one connection to send multiple requests and responses in parallel, thus eliminating the delay and memory consumption caused by multiple TCP connections.
- Multiple requests are sent in parallel and staggered without affecting each other.
- Multiple responses are sent in parallel and staggered, and the responses do not interfere with each other.
- In HTTP/2, each request can carry a 31-bit priority value: 0 indicates the highest priority, and the larger the value, the lower the priority. With these priorities, client and server can adopt different strategies for different streams and send streams, messages, and frames in the optimal order.
As shown in the above figure, the multiplexing technology can transmit all the requested data through only one TCP connection.
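The regrouping of interleaved frames can be sketched in a few lines (the frame tuples and payloads below are made up for illustration; real frames carry the binary headers described above):

```python
# Frames from different streams can be interleaved on one connection;
# the receiver regroups them by the stream identifier carried in each
# frame header. A frame is modeled here as a (stream_id, chunk) tuple.
def reassemble(frames):
    streams = {}
    for stream_id, chunk in frames:
        streams.setdefault(stream_id, []).append(chunk)
    # Concatenate each stream's chunks back into its full message.
    return {sid: b"".join(chunks) for sid, chunks in streams.items()}

# Two responses (an HTML page on stream 1, a stylesheet on stream 3)
# arrive interleaved over the single connection.
wire = [(1, b"<htm"), (3, b"body{"), (1, b"l>"), (3, b"}")]
print(reassemble(wire))  # {1: b'<html>', 3: b'body{}'}
```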
3. Header compression

In HTTP/1, headers are transmitted as text. When a header carries a cookie, hundreds to thousands of bytes may have to be transmitted repeatedly with every request.
In order to reduce this resource consumption and improve performance, HTTP/2 adopted a compression strategy for these headers:
- HTTP/2 uses a "header table" on both the client and the server to track and store previously sent key-value pairs; the same data is no longer re-sent with every request and response.
- The header table always exists during the connection duration of HTTP/2 and is updated gradually by both the client and the server.
- Each new header key-value pair is either appended to the end of the current table or replaces the previous value in the table.
For example, in the two requests in the figure below, the first request sends all header fields, while the second request only needs to send the fields that differ, reducing redundant data and overhead.
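The idea can be sketched as a toy model (a deliberate simplification for illustration, not the real HPACK wire format, which uses static/dynamic tables and Huffman coding):

```python
# A much-simplified sketch of the HPACK idea: both sides keep a table
# of previously seen header key-value pairs, so a repeated header can
# be sent as a small index instead of the full text.
class HeaderTable:
    def __init__(self):
        self.table = []

    def encode(self, headers):
        out = []
        for pair in headers:
            if pair in self.table:
                # Already known to both sides: send just an index.
                out.append(("index", self.table.index(pair)))
            else:
                # New pair: send it literally and remember it.
                self.table.append(pair)
                out.append(("literal", pair))
        return out

enc = HeaderTable()
first = enc.encode([(":method", "GET"), ("cookie", "abc")])
second = enc.encode([(":method", "GET"), ("cookie", "abc")])
print(first)   # two full literals
print(second)  # two tiny index references
```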
4. Server Push

Server push means that the server can proactively push content the client will need before it is requested; this is also called "cache push".
Imagine the following situation: there are resources the client is certain to request. In that case the server can use push technology to send those necessary resources to the client in advance, shaving off a little latency. Of course, you can also use prefetch if the browser supports it.
For example, the server can actively push JS and CSS files to the client without sending these requests when the client parses HTML.
The server can push proactively, but the client retains the right to choose whether to accept. If a pushed resource is already in the browser cache, the browser can reject it by sending a RST_STREAM frame. Server push also follows the same-origin policy; in other words, the server cannot casually push third-party resources to the client, and the push must be confirmed by both parties.
VI. New Features of HTTP/3
1. Introduction to HTTP/3
Although HTTP/2 has solved many problems of previous versions, it still has a huge problem, which is mainly caused by the underlying TCP protocol.
As mentioned above, HTTP/2 uses multiplexing. Generally speaking, only one TCP connection is required under the same domain name. However, when packet loss occurs in this connection, HTTP/2 will not perform as well as HTTP/1.
In the case of packet loss, the entire TCP connection has to wait for retransmission, blocking all data behind the lost packet. For HTTP/1.1, by contrast, multiple TCP connections can be opened; packet loss affects only one of them, and the remaining connections can still transmit data normally.
Some might consider modifying the TCP protocol, but by now that is an impossible task. TCP has existed for so long that it is baked into all kinds of devices, and the protocol is implemented by operating systems, so updating it is simply not realistic.
For this reason, Google started the UDP-based QUIC protocol and applied it to HTTP/3. HTTP/3 was previously called HTTP-over-QUIC; from this name we can see that HTTP/3's biggest transformation is its use of QUIC.
Although QUIC is based on UDP, it adds many new capabilities on top of it. Next, we will focus on several of QUIC's new features.
2. New Features of QUIC
- 0-RTT

Using a technique similar to TCP Fast Open, QUIC caches the context of the current session; the next time the session resumes, the client only needs to hand the cached context to the server for verification before transmitting data. The 0-RTT connection can be said to be QUIC's biggest performance advantage over HTTP/2. What does a 0-RTT connection mean?
There are two meanings:
1. The transport layer can establish a connection with 0 RTT.
2. The encryption layer can establish an encrypted connection with 0 RTT.
On the left of the figure above is the full-handshake connection process of HTTPS, which requires 3 RTTs; even with session reuse, at least 2 RTTs are required.
And QUIC? Because it is based on UDP and simultaneously implements a 0-RTT secure handshake, in most cases data transmission can begin after 0 RTT. Moreover, on the basis of forward secrecy, QUIC's 0-RTT success rate is much higher than that of TLS session resumption.
- Multiplexing

Although HTTP/2 supports multiplexing, the underlying TCP protocol does not. QUIC implements it natively: each transmitted data stream is individually guaranteed ordered delivery without affecting the other streams. This solves the problem TCP had.
Like HTTP/2, multiple streams can be created on the same QUIC connection to send multiple HTTP requests. But because QUIC is based on UDP, there is no dependency between the streams on a connection. For example, in the figure below Stream 2 loses a UDP packet, which does not affect Stream 3 and Stream 4; there is no TCP-style head-of-line blocking. Although Stream 2's packet needs to be retransmitted, Stream 3's and Stream 4's packets can reach the user without waiting.
In addition, QUIC performs better than TCP on mobile. TCP identifies a connection by IP address and port, which is fragile in ever-changing mobile network environments. QUIC instead identifies a connection by an ID: no matter how your network environment changes, as long as the ID stays the same you can reconnect quickly.
- Encrypted and authenticated message
The TCP header is neither encrypted nor authenticated, so in transit it can easily be tampered with, injected into, or eavesdropped on by intermediate network devices, for example by modifying sequence numbers or sliding windows. Such behavior may stem from performance optimization or from active attacks.
But QUIC’s packet can be said to be armed to the teeth. Except for some messages such as PUBLIC_RESET and CHLO, all message headers are authenticated and message Body is encrypted.
In this way, as long as any modification is made to the QUIC message, the receiving end can discover it in time, effectively reducing the security risk.
As shown in the above figure, the red part is the header of the Stream Frame and has authentication. The green part is the message content, all encrypted.
- Forward error correction mechanism
QUIC has a very distinctive feature called Forward Error Correction (FEC). Besides its own content, each packet includes some data from other packets, so a small amount of packet loss can be repaired directly from the redundant data in other packets without retransmission. Forward error correction sacrifices some of each packet's payload capacity, but it reduces retransmissions caused by packet loss, and retransmission costs much more time (confirming the loss, requesting retransmission, waiting for the new packet, and so on).
Suppose three packets are to be sent. The protocol computes the XOR of the three packets and sends it as a separate check packet, so four packets go out in total. If one of the non-check packets is lost, its content can be computed from the other three. Of course, this technique only works when a single packet is lost; if multiple packets are lost, the error-correction mechanism cannot help and retransmission is the only option.
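The XOR scheme described above can be sketched directly (the packet contents are made-up bytes):

```python
# Sketch of XOR-based forward error correction: send a parity packet
# that is the XOR of the data packets; any single lost data packet
# can be rebuilt from the parity and the surviving packets.
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

packets = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]

# The check packet: XOR of all three data packets.
parity = packets[0]
for p in packets[1:]:
    parity = xor_bytes(parity, p)

# Suppose packets[1] is lost in transit; rebuild it from the
# surviving packets and the parity.
recovered = xor_bytes(xor_bytes(packets[0], packets[2]), parity)
print(recovered == packets[1])  # True
```

With two or more lost packets the XOR no longer pins down the missing data, which is exactly the limitation noted above.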
VII. Summary

- HTTP/1.x has many defects: connections cannot be reused, head-of-line blocking, high protocol overhead, poor security, and so on.
- HTTP/2 greatly improves performance through multiplexing, binary framing, header compression, and other techniques, but problems remain.
- QUIC, the underlying supporting protocol of HTTP/3, is implemented on top of UDP; it takes the essence of TCP and realizes a protocol that is both fast and reliable.