The failing websites still returned the page's source code. My understanding was that a headless browser does more than just fetch the source code: it also renders the page into an HTML structure before returning it to the caller. So why do some websites succeed while others fail? The failing ones are Baidu Images and Douban Movies.
Some content is loaded dynamically, triggered by human browsing actions such as scrolling the mouse wheel to reveal the next batch of items.
A headless browser doesn't know it needs to perform those actions, so it can't retrieve the fully loaded HTML.
Analyze each page case by case and write a script that simulates the human browsing actions; that solves the problem. Libraries such as ScraperJS expose interfaces for making this kind of call.
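A minimal sketch of that "scroll until nothing new loads" loop. The `page` object and all of its method names (`scrollHeight`, `scrollToBottom`, `waitFor`) are hypothetical stand-ins for whatever driver you actually use (ScraperJS, Puppeteer, etc.), chosen only to show the control flow; a fake page is included so the sketch runs without a real browser:

```javascript
// Keep scrolling until the page's scrollable height stops growing,
// i.e. the site has no more content left to lazy-load.
async function autoScroll(page) {
  let previousHeight = -1;
  let currentHeight = await page.scrollHeight();
  while (currentHeight > previousHeight) {
    previousHeight = currentHeight;
    await page.scrollToBottom(); // simulate the mouse-wheel scroll
    await page.waitFor(500);     // give the site time to load the next batch
    currentHeight = await page.scrollHeight();
  }
  return currentHeight;
}

// A tiny fake page that "loads" one extra screen of content per scroll,
// up to a fixed limit, so the loop above can be exercised standalone.
function fakePage(screens) {
  let height = 1000;
  let loaded = 1;
  return {
    scrollHeight: async () => height,
    scrollToBottom: async () => {
      if (loaded < screens) { loaded += 1; height += 1000; }
    },
    waitFor: async () => {}, // no-op in the fake
  };
}

autoScroll(fakePage(3)).then((h) => console.log(h)); // final height: 3000
```

With a real driver, the fake would be replaced by actual scroll commands (e.g. injecting `window.scrollBy` into the page) plus a wait for the network to go idle, but the loop shape stays the same.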