What happens behind the scenes between typing a web address into the browser and seeing the page displayed? What steps does the request go through? This article starts with a general overview of the flow; see the breakdown below for the specific steps.
This article was first published on the author’s GitHub blog. Writing articles takes effort, so your support and attention are appreciated!
Generally speaking, the process is divided into the following steps:
- DNS resolution: resolve the domain name to an IP address
- TCP connection: TCP three-way handshake
- Send the HTTP request
- The server processes the request and returns an HTTP message
- The browser parses and renders the page
- Disconnect: four TCP waves
First, what is a URL
A URL (Uniform Resource Locator), commonly known as a web address, is used to locate resources on the Internet.
A URL such as http://www.w3school.com.cn/ht … follows these syntax rules:
scheme://host.domain:port/path/filename
Each part is explained as follows:
- scheme: defines the type of Internet service. Common schemes include http, https, ftp, and file; http is the most common, and https is http carried over an encrypted connection.
- host: defines the domain host (the default host for http is www)
- domain: defines the Internet domain name, such as w3school.com.cn
- port: defines the port number on the host (the default port number for http is 80)
- path: defines the path on the server (if omitted, the document must be in the root directory of the website)
- filename: defines the name of the document or resource
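As a minimal sketch of how these parts can be pulled out of a web address, the snippet below uses `urlsplit` from Python's standard library. The URL is a hypothetical one made up for illustration, and splitting the host name at the first dot into "host" and "domain" is an assumption that only holds for simple names like this one.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical URL, chosen only to illustrate the syntax above.
url = "http://www.example.com:80/html/index.html?id=1"
parts = urlsplit(url)

scheme = parts.scheme                          # type of Internet service
host, _, domain = parts.hostname.partition(".")  # naive host/domain split
port = parts.port                              # int, or None when omitted
path = posixpath.dirname(parts.path)           # directory on the server
filename = posixpath.basename(parts.path)      # document name
query = parts.query                            # parameters after "?"

print(scheme, host, domain, port, path, filename, query)
```

When the port is left out of the URL, `parts.port` is `None` and the client falls back to the scheme's default (80 for http).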
Second, domain name resolution (DNS)
Once the web address has been entered, the browser must first resolve the domain name, because it cannot locate the server by domain name alone; it needs the IP address. You may have a question here: a computer can be identified either by an IP address or by a host name and domain name, such as www.hackr.jp. Why not just use the IP address from the start and save the trouble of resolution? Let’s first understand what an IP address is.
1. What is an IP address
An IP address (Internet Protocol address) is the unified address format provided by the IP protocol. It assigns a logical address to every network and every host on the Internet, hiding the differences between physical addresses. An IPv4 address is a 32-bit binary number, usually written in dotted decimal; for example, 127.0.0.1 is the address of the local machine (loopback).
A domain name is like a mask worn by an IP address. Its purpose is to make a group of server addresses easier to remember and communicate. Users usually reach other computers by host name or domain name rather than directly by IP address, because names made of letters and digits suit human memory better than a string of bare numbers. Names, however, are relatively hard for computers to work with, since computers are better at handling long series of numbers. DNS came into being to bridge this gap.
2. What is domain name resolution
The DNS protocol provides lookup of an IP address from a domain name, as well as reverse lookup of a domain name from an IP address. DNS is served by network servers; resolving a domain name simply means that a record for it is stored in DNS.
For example, a record maps baidu.com to 18.104.22.168 (the server’s public IP address). The port number, such as 80, is not part of the DNS record; it comes from the URL or the protocol default.
3. How does the browser look up the IP address for the domain name in a URL
- Browser cache: the browser caches DNS records for a certain period of time.
- Operating system cache: if the required DNS record is not in the browser cache, the operating system’s cache is checked.
- Router cache: routers also keep a DNS cache.
- ISP’s DNS server: ISP is short for Internet Service Provider. The ISP runs dedicated DNS servers to handle DNS queries.
- Root server: if the ISP’s DNS server has no record, it resolves the name on the client’s behalf, starting from the root servers (it first asks a root name server, which points it to the .com name server, which in turn points it to the baidu.com name server, and so on).
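The lookup order above can be sketched as a chain of caches that are consulted one after another. This is only an illustration: the dictionaries, the `isp_resolver` function, and the addresses are all hypothetical stand-ins, and no real DNS traffic is involved.

```python
from typing import Optional, Tuple

# Plain dicts stand in for the real caches (hypothetical data).
browser_cache = {}
os_cache = {}
router_cache = {"example.com": "192.0.2.10"}


def isp_resolver(domain: str) -> Optional[str]:
    """Stand-in for the ISP's DNS server, which would walk
    root -> top-level domain -> authoritative name servers."""
    authoritative = {"example.org": "192.0.2.20"}
    return authoritative.get(domain)


def resolve(domain: str) -> Tuple[Optional[str], str]:
    """Return (ip, source) from the first layer that knows the domain."""
    layers = [
        ("browser cache", browser_cache),
        ("operating system cache", os_cache),
        ("router cache", router_cache),
    ]
    for name, cache in layers:
        if domain in cache:
            return cache[domain], name
    ip = isp_resolver(domain)
    if ip is not None:
        browser_cache[domain] = ip  # remember the answer for next time
        return ip, "ISP DNS server"
    return None, "not found"


print(resolve("example.com"))
print(resolve("example.org"))
print(resolve("example.org"))  # second lookup is served by the browser cache
```

Each layer is tried only when the previous one misses, which is why repeated visits to the same site usually skip the network lookup entirely.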
The browser sends the domain name to a DNS server, which looks up the corresponding IP address and returns it to the browser. The browser then addresses the request to that IP address, attaches the request parameters, and sends it to the corresponding server. Next comes the stage of sending the HTTP request to the server, which has three parts: the TCP three-way handshake, the HTTP request and response, and closing the TCP connection.
Third, TCP three-way handshake
Before the client sends any data, it initiates a TCP three-way handshake to synchronize the sequence numbers and acknowledgement numbers of the client and the server and to exchange TCP window size information.
1. The TCP three-way handshake proceeds as follows:
- The client sends a packet with SYN=1 and Seq=X to the server port. (First handshake: initiated by the browser, telling the server “I am about to send a request.”)
- The server sends back a response packet with SYN=1, ACK=X+1, and Seq=Y to acknowledge it. (Second handshake: initiated by the server, telling the browser “I am ready to receive, go ahead and send.”)
- The client sends back another packet with ACK=Y+1 and Seq=Z, which means “handshake ends.” (Third handshake: initiated by the browser, telling the server “I am about to send, get ready to receive.”)
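To make the numbering concrete, here is a small simulation of the three segments. It does not open a real connection; the dictionary layout and the initial sequence numbers are assumptions for illustration. Lowercase `ack` is the acknowledgement number and uppercase `ACK` is the flag. In standard TCP the sequence number of the third segment (written Z above) equals X+1.

```python
def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged, given the client's initial
    sequence number x and the server's initial sequence number y."""
    syn = {"from": "client", "SYN": 1, "seq": x}
    syn_ack = {
        "from": "server",
        "SYN": 1,
        "ACK": 1,
        "seq": y,
        "ack": syn["seq"] + 1,  # acknowledges the client's SYN
    }
    ack = {
        "from": "client",
        "ACK": 1,
        "seq": x + 1,
        "ack": syn_ack["seq"] + 1,  # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=100, y=300):
    print(segment)
```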
2. Why is a three-way handshake needed
According to Xie Xiren’s “Computer Network”, the purpose of the three-way handshake is “to prevent an invalid connection request segment from suddenly reaching the server and causing an error.”
Fourth, send the HTTP request
After the TCP three-way handshake completes, the browser starts sending the HTTP request message.
The request message consists of a request line, request headers, and a request body, as shown in the following figure:
1. The request line contains the request method, URL, and protocol version
- Request methods: the eight listed here are GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS, and TRACE (HTTP/1.1 also defines CONNECT).
- The URL is the request address, of the form <protocol>://<host>:<port>/<path>?<parameters>
- The protocol version is the HTTP version number
POST /chapter17/user.html HTTP/1.1
In the line above, “POST” is the request method, “/chapter17/user.html” is the URL, and “HTTP/1.1” is the protocol and its version. HTTP/1.1 remains a widely used version.
2. The request headers carry additional information about the request as keyword/value pairs, one pair per line. The keyword and value are separated by a colon “:”.
The request headers give the server information about the client’s request, including useful details about the client environment and the request body. For example: Host specifies the host name and is what makes virtual hosting possible; Connection, with the value keep-alive, asks for a persistent connection (the default in HTTP/1.1) so that one connection can carry multiple requests; User-Agent identifies the software sending the request, which servers use for compatibility handling and customization.
3. The request body can carry multiple request parameters. It is separated from the headers by a blank line (carriage return and line feed) and followed by the request data. Not all requests have request data.
In the example, the request body carries three request parameters: name, password, and realName.
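A request message with this shape can be assembled by hand, as in the sketch below. The host name and the parameter values are hypothetical; only the request line and the three parameter names come from the text. The form encoding (`application/x-www-form-urlencoded`) is an assumption about how the body is sent.

```python
from urllib.parse import urlencode


def build_request(method: str, path: str, host: str, params: dict) -> str:
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    body = urlencode(params)
    lines = [
        f"{method} {path} HTTP/1.1",                 # request line
        f"Host: {host}",                             # request headers
        "Connection: keep-alive",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('utf-8'))}",
        "",                                          # blank line ends the headers
        body,                                        # request body
    ]
    return "\r\n".join(lines)


request = build_request(
    "POST",
    "/chapter17/user.html",
    "www.example.com",  # hypothetical host
    {"name": "tom", "password": "1234", "realName": "tomson"},  # hypothetical values
)
print(request)
```

Every line ends with a carriage return and line feed, and the empty line is what tells the server where the headers stop and the body begins.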
Fifth, the server processes the request and returns an HTTP message
1. The server
A server is a high-performance computer in a network environment. It listens for service requests submitted by other computers (clients) on the network and provides the corresponding services, such as web, file download, mail, and video services. Clients, by contrast, are used mainly for browsing the web, watching videos, listening to music, and so on, so the two roles are completely different. Every server has a web server installed, which is an application that handles requests. Common web server products include Apache, Nginx, IIS, and Lighttpd.
2. MVC back-end processing stage
Back-end development now has many frameworks, but most of them are still built on the MVC design pattern.
MVC is a design pattern that divides an application into three core components: model, view, and controller. Each handles its own tasks, separating input, processing, and output.
- View: the operating interface presented to users; the shell of the program.
- Model: mainly responsible for data interaction. Of the three components, the model does the most processing, and one model can provide data for multiple views.
- Controller: selects data from the model layer according to the instructions the user enters through the view layer, and carries out the corresponding operations to produce the final result. The controller plays the manager role: it receives the request from the view, decides which model component to call to process it, and then decides which view to use to display the data the model returns.
These three layers are closely linked yet independent of each other: internal changes in one layer do not affect the others. Each layer exposes an interface for the layer above to call.
What happens at this stage? In short, the request sent by the browser first reaches the controller, which performs logic processing and dispatches the request. The controller then calls the model, which fetches data from stores such as Redis and MySQL. Once the data is obtained, the rendered page is returned to the client as a response message. Finally, the browser presents the page to the user through its rendering engine.
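The division of labour between the three components can be sketched in a few lines. This is a toy illustration, not how any particular framework does it: the class and function names are hypothetical, and an in-memory dictionary stands in for Redis or MySQL.

```python
class UserModel:
    """Model: owns the data (a dict stands in for Redis/MySQL here)."""

    def __init__(self):
        self._users = {1: "Alice", 2: "Bob"}

    def find(self, user_id):
        return self._users.get(user_id)


def user_view(name):
    """View: turns data into the page shown to the user."""
    return f"<h1>Hello, {name}</h1>"


def not_found_view():
    """View used when the model has no matching data."""
    return "<h1>User not found</h1>"


class UserController:
    """Controller: receives the request, calls the model, picks a view."""

    def __init__(self, model):
        self.model = model

    def handle(self, user_id):
        name = self.model.find(user_id)
        if name is None:
            return 404, not_found_view()
        return 200, user_view(name)


controller = UserController(UserModel())
print(controller.handle(1))
print(controller.handle(9))
```

Because the controller only talks to the model through `find` and to the views through plain function calls, the storage or the HTML can change without touching the other layers.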
3. HTTP response message
The response message consists of a response line, response headers, and a response body, as shown in the following figure:
(1) The response line includes the protocol version, status code, and status code description
Status codes fall into the following classes:
1xx: Informational. The request has been received and processing continues.
2xx: Success. The request has been successfully received, understood, and accepted.
3xx: Redirection. Further action is required to complete the request.
4xx: Client error. The request has a syntax error or cannot be fulfilled.
5xx: Server error. The server failed to fulfill a valid request.
(2) The response headers carry additional information about the response message as name/value pairs
(3) The response body is separated from the headers by a blank line (carriage return and line feed) and contains the returned data. Not all response messages have response data
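The three parts of a response can be separated with ordinary string operations, as in this sketch. The sample response text is made up for illustration, and the parser assumes a simple, well-formed message (no chunked encoding, no repeated header names).

```python
def parse_response(raw: str):
    """Split a raw HTTP response into (version, code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")  # blank line separates head and body
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(code), reason, headers, body


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


# Hypothetical response used only as sample input.
raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 12\r\n"
    "\r\n"
    "<p>hello</p>"
)
version, code, reason, headers, body = parse_response(raw)
print(version, code, reason, status_class(code))
print(headers)
print(body)
```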
Sixth, the browser parses and renders the page
Once the browser has received the HTML in the response, rendering begins. The browser’s rendering mechanism is introduced below.
Parsing and rendering the page is divided into the following five steps:
- Building the DOM tree by parsing the HTML
- Building the CSS rule tree by parsing the CSS
- Combining the DOM tree and the CSS rule tree to generate the rendering tree
- Calculating the information (layout) of each node according to the rendering tree
- Drawing the page according to the calculated information
1. Build the DOM tree by parsing the HTML
- The tags in the HTML are parsed into a DOM tree according to their structure. The DOM tree is built by depth-first traversal: all child nodes of the current node are constructed first, and then the next sibling node is constructed.
- If a script tag is encountered while the HTML document is being read and the DOM tree constructed, construction of the DOM tree is suspended until the script has finished executing.
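The depth-first construction can be imitated with the event-driven `HTMLParser` class from Python's standard library. This is a simplified sketch, not how a browser engine is written: the `Node` and `DomBuilder` names are hypothetical, text nodes and attributes are ignored, and the markup is assumed to be well formed with every tag explicitly closed.

```python
from html.parser import HTMLParser


class Node:
    """A bare-bones DOM node: tag name, parent, and children."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class DomBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node  # descend: children are built before siblings

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent  # climb back to the parent


def build_dom(html):
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


html = "<html><body><div><p></p></div><span></span></body></html>"
print("\n".join(outline(build_dom(html))))
```

In the sample, the `p` inside `div` is attached before the sibling `span`, which is the child-first order described above.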
2. Build the CSS rule tree by parsing the CSS
- JavaScript execution is suspended while the CSS rule tree is being built, until the CSS rule tree is ready.
- The browser will not render until the CSS rule tree has been generated.
3. Combine the DOM tree and the CSS rule tree to generate the rendering tree
- The browser will not start building the rendering tree until both the DOM tree and the CSS rule tree are ready.
- Streamlining the CSS speeds up construction of the CSS rule tree and therefore speeds up the page’s response.
4. Calculate the information (layout) of each node according to the rendering tree
- Layout: the position and size of each render object are calculated from the information in the rendering tree.
- Reflow: if, after layout is complete, some part changes in a way that affects the layout, the browser has to go back and lay out and render again.
5. Draw the page according to the calculated information
- In the painting phase, the system traverses the rendering tree and calls each renderer’s “paint” method to display its contents on the screen.
- Repaint: changing the background color or text color of an element does not affect the layout around or inside the element, so it only causes the browser to repaint.
- Reflow: when the size of an element changes, the rendering tree needs to be recalculated and rendered again.
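The difference between repaint and reflow can be summarised as a lookup from the changed style property to the work it triggers. The sketch below is a deliberate simplification: the two property sets are small illustrative samples, not a complete list, and real rendering engines make this decision internally.

```python
# Illustrative samples only; real engines track many more properties.
LAYOUT_PROPERTIES = {"width", "height", "margin", "padding", "display", "font-size"}
PAINT_ONLY_PROPERTIES = {"color", "background-color", "visibility"}


def work_needed(changed_property: str) -> list:
    """Return the rendering phases a style change triggers in this toy model."""
    if changed_property in PAINT_ONLY_PROPERTIES:
        return ["paint"]  # repaint: geometry is unchanged
    # Geometry changed, or the property is unknown: be conservative and reflow.
    return ["layout", "paint"]


print(work_needed("background-color"))
print(work_needed("width"))
```

A reflow always ends in a repaint, which is why changes to size or position cost more than changes to color.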
Seventh, disconnect: four TCP waves
When data transmission is complete, the TCP connection needs to be closed, and the four TCP waves begin.
- The initiator sends a segment with Fin, Ack, and Seq to the passive party, indicating it has no more data to send, and enters the FIN_WAIT_1 state. (First wave: initiated by the browser, telling the server “my request message has been sent, get ready to close.”)
- The passive party sends a segment with Ack and Seq, acknowledging the close request. The initiator then enters the FIN_WAIT_2 state. (Second wave: initiated by the server, telling the browser “I have received your request message and am preparing to close; so should you.”)
- The passive party sends a segment with Fin, Ack, and Seq to the initiator, requesting to close the connection, and enters the LAST_ACK state. (Third wave: initiated by the server, telling the browser “my response message has been sent, get ready to close.”)
- The initiator sends a segment with Ack and Seq to the passive party and enters the TIME_WAIT state. The passive party closes the connection on receiving this segment. If the initiator receives nothing further after waiting for a certain period, it closes normally as well. (Fourth wave: initiated by the browser, telling the server “I have received your response message and am preparing to close; so should you.”)
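The state changes on both sides can be traced with a small simulation. No sockets are involved; the function name and the trace format are hypothetical. The CLOSE_WAIT state of the passive party is not named in the steps above but is the standard TCP state between receiving the first FIN and sending its own.

```python
def four_way_close():
    """Return (event, initiator_state, passive_state) after each step."""
    initiator = passive = "ESTABLISHED"
    trace = []

    initiator = "FIN_WAIT_1"
    trace.append(("1st wave: initiator sends FIN", initiator, passive))

    passive = "CLOSE_WAIT"    # passive party has received the FIN
    initiator = "FIN_WAIT_2"  # initiator has received the ACK
    trace.append(("2nd wave: passive party sends ACK", initiator, passive))

    passive = "LAST_ACK"
    trace.append(("3rd wave: passive party sends FIN", initiator, passive))

    initiator = "TIME_WAIT"
    passive = "CLOSED"        # closes on receiving the final ACK
    trace.append(("4th wave: initiator sends ACK", initiator, passive))

    initiator = "CLOSED"      # closes after the waiting period expires
    trace.append(("waiting period expires", initiator, passive))
    return trace


for event, initiator_state, passive_state in four_way_close():
    print(f"{event:36} initiator={initiator_state:11} passive={passive_state}")
```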
References
- What happened from entering the page address to displaying the page information?
- The front-end classic question: What happened from entering the URL to page loading?
- TCP’s three-way handshake and four waves
- Access to the Web, TCP transmission process (three-way handshake, request, data transmission, four waves)
- Analysis of the HTTP request process sent by the browser
- Xie Xiren, “Computer Network”, 4th Edition
- Graphical HTTP