This post is inspired by Nasko’s post Chromium Internals - Lifetime of a navigation and offers deeper insight into the navigation process.
It is not easy to say precisely what a navigation is, since navigation is a core function of a browser. To get a better understanding, let’s start from the HTML5 specification.
I was surprised to learn that two versions of HTML5 existed side by side for a long time, until May 2019. Briefly speaking, W3C expected the specification to be versioned periodically, like HTML 4.01, while WHATWG wanted a much more flexible one called the “Living Standard”. Currently, W3C and WHATWG work together on the Living Standard, which is the version I refer to in this post.
Many people think of the HTML5 specification as a standard for HTML syntax and features. That is true, but things become much more complex than expected when it comes to details. For example, one possible value for the
sandbox attribute of an iframe is
allow-top-navigation, which makes it necessary to define navigation and many other concepts, like
Window. It even contains many implementation details like
event loop and
task queue. In short, the current HTML specification is more like a cookbook for building the whole web platform than a simple language specification. It improves compatibility between browsers, so users get a similar experience across different browsers.
The key concept related to our topic is
browsing context, which is called a
Frame in Chromium. Usually, a tab or an iframe has a dedicated browsing context, which has a corresponding window and document. Roughly speaking, a navigation, or navigating a browsing context, means replacing its current document with another one. Although it sounds very simple, there are in fact many corner cases to handle, and even the specification is not perfect.
Chromium generally follows the navigation defined in the specification but classifies navigations in its own, more practical way. Based on the destination document, navigations are divided into same-document navigations and cross-document navigations. Most of the time, Chromium deals with cross-document navigations, except in the scenarios below:
- The URL navigates Chromium to a fragment in the current page, like this.
- The navigation is caused by the History API.
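The two scenarios above can be sketched as a small classifier. This is only an illustration under simplifying assumptions, not Chromium's real logic: the names NavigationKind, StripFragment and Classify are hypothetical, URLs are compared as plain strings, and special cases such as reloads are ignored.

```cpp
#include <cassert>
#include <string>

enum class NavigationKind { kSameDocument, kCrossDocument };

// Returns the URL without its "#fragment" part, if it has one.
std::string StripFragment(const std::string& url) {
  const std::string::size_type pos = url.find('#');
  return pos == std::string::npos ? url : url.substr(0, pos);
}

// Hypothetical classifier: a navigation is treated as same-document when it
// comes from the History API, or when the target only adds or changes a
// fragment on the current URL. Everything else is cross-document.
NavigationKind Classify(const std::string& current_url,
                        const std::string& target_url,
                        bool caused_by_history_api) {
  if (caused_by_history_api) return NavigationKind::kSameDocument;
  const bool has_fragment = target_url.find('#') != std::string::npos;
  if (has_fragment &&
      StripFragment(current_url) == StripFragment(target_url)) {
    return NavigationKind::kSameDocument;
  }
  return NavigationKind::kCrossDocument;
}

// Usage example with made-up URLs.
void ClassifyExamples() {
  assert(Classify("https://example.com/a", "https://example.com/a#top",
                  false) == NavigationKind::kSameDocument);
  assert(Classify("https://example.com/a", "https://example.com/b",
                  false) == NavigationKind::kCrossDocument);
}
```

The real decision in Chromium depends on more state than two URL strings, so treat this as a mental model only.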
On the other hand, based on the initiator, navigations are also divided into browser-initiated and renderer-initiated navigations. Note that “browser” and “renderer” here refer to the browser process and the renderer process. The browser process is the process users interact with, while the renderer process is typically responsible for parsing documents, rendering pages and executing scripts. A detailed explanation is documented here.
A browser-initiated navigation is triggered from the browser process, for example by entering a URL in the address bar and hitting Enter. A renderer-initiated navigation is the opposite, for example assigning a new location to
Next, we will inspect a navigation to
https://example.com initiated from the omnibox to show the lifetime of a navigation.
The omnibox logic is located in
components/omnibox/browser. After reading some headers, it is not difficult to find that our target function is
OmniboxEditModel::AcceptInput. It retrieves the user input and finds a
match, like matching “https://google.com” from “https://goog”, or “https://google.com/search?q=Lazymio” from “Lazymio”. After that, it asks the omnibox controller to open that URL.
Note that user input is handled on the UI thread, so these functions are executed synchronously. Then, the omnibox controller asks the browser to open the URL.
The browser then starts to load the contents.
Then the navigation controller gets involved: it determines the navigation type, handles a bunch of special URLs, finds the target frame that needs navigating and, most importantly, creates a corresponding
In fact, our navigation starts from here.
NavigationRequest tracks the different stages of the navigation and contains the core navigation logic. The node that needs navigating holds this
NavigationRequest until it commits, so the request is bound to the node shortly after being created.
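To make the idea of a request that tracks stages more concrete, here is a minimal sketch of a forward-only state machine. The class name NavigationRequestSketch and the stage names are hypothetical and much coarser than the states Chromium actually uses; the only point is that a request moves forward through its stages and never goes back.

```cpp
#include <cassert>

// Hypothetical, simplified stages of a navigation.
enum class Stage {
  kNotStarted,
  kWaitingForBeforeUnload,
  kRequestStarted,
  kResponseReceived,
  kReadyToCommit,
  kCommitted,
};

class NavigationRequestSketch {
 public:
  Stage stage() const { return stage_; }

  // Moves to a later stage. Skipping stages is allowed (for example, a
  // navigation that does not need beforeunload), going backwards or
  // staying in place is rejected.
  bool AdvanceTo(Stage next) {
    if (next <= stage_) return false;
    stage_ = next;
    return true;
  }

 private:
  Stage stage_ = Stage::kNotStarted;
};

// Usage example.
void StageExamples() {
  NavigationRequestSketch request;
  assert(request.AdvanceTo(Stage::kRequestStarted));
  assert(!request.AdvanceTo(Stage::kWaitingForBeforeUnload));
  assert(request.stage() == Stage::kRequestStarted);
}
```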
After the navigation starts, the very first thing to do is to prepare to unload the current document, which usually fires the
beforeunload event. For example, when leaving a page with unsaved drafts, the website may prompt you with
Changes you made may not be saved. Note that for same-document navigation, the
beforeunload event won’t be fired and the browser will call
beforeunload, an IPC message is created and sent to the corresponding renderer. After the renderer finishes everything, it tells the browser to proceed with the navigation via another IPC message.
Then the browser calls
NavigationRequest::BeginNavigation, but before a network request is built and sent, several navigation throttles decide whether the navigation should proceed, be deferred or be cancelled.
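The throttle step can be pictured as a chain of checks in which the first check that does not say "proceed" decides the outcome. The sketch below assumes exactly that and nothing more: ThrottleAction, ThrottleCheck, RunThrottles and the two example throttles are hypothetical names, and resuming a deferred navigation is left out.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

enum class ThrottleAction { kProceed, kDefer, kCancel };

// A throttle is modelled as a function from the target URL to an action.
using ThrottleCheck = std::function<ThrottleAction(const std::string&)>;

// Runs the throttles in order and returns the first non-proceed action,
// or kProceed if every throttle lets the navigation through.
ThrottleAction RunThrottles(const std::vector<ThrottleCheck>& throttles,
                            const std::string& url) {
  for (const ThrottleCheck& check : throttles) {
    const ThrottleAction action = check(url);
    if (action != ThrottleAction::kProceed) return action;
  }
  return ThrottleAction::kProceed;
}

// Hypothetical throttle: cancels plain-HTTP navigations.
ThrottleAction BlockInsecure(const std::string& url) {
  return url.rfind("http://", 0) == 0 ? ThrottleAction::kCancel
                                      : ThrottleAction::kProceed;
}

// Hypothetical throttle: defers navigations to a made-up "/slow" path.
ThrottleAction DeferSlowPath(const std::string& url) {
  return url.find("/slow") != std::string::npos ? ThrottleAction::kDefer
                                                : ThrottleAction::kProceed;
}

// Usage example.
void ThrottleExamples() {
  std::vector<ThrottleCheck> throttles;
  throttles.push_back(BlockInsecure);
  throttles.push_back(DeferSlowPath);
  assert(RunThrottles(throttles, "https://example.com/") ==
         ThrottleAction::kProceed);
  assert(RunThrottles(throttles, "https://example.com/slow") ==
         ThrottleAction::kDefer);
}
```

Because the chain stops at the first non-proceed answer, the order in which throttles are registered matters in this model.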
If the navigation passes all the throttles, it’s time to bind a renderer to the
NavigationRequest and build a network request.
After the URL loader
loader_ is created, the network request is sent and processed asynchronously. Note that the delegate pattern is applied in the implementation of
NavigationURLLoader and the third-to-last argument,
NavigationRequest itself, is the delegate of the new
loader_. Therefore, after the
loader_ finishes loading the document, it informs its delegate, the
NavigationRequest by calling
This is the end of the
BeginNavigation phase. It seems to do a lot of heavy work, but almost every operation is asynchronous because it runs on the UI thread.
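The delegate relationship between the request and its loader can be sketched in a few lines. This is not Chromium's code: LoaderDelegate, FakeUrlLoader and RequestSketch are hypothetical stand-ins, and the asynchronous network response is simulated by an explicit call.

```cpp
#include <cassert>

// Interface the loader uses to report back to whoever owns it.
class LoaderDelegate {
 public:
  virtual ~LoaderDelegate() = default;
  virtual void OnResponseStarted(int status_code) = 0;
};

// Stand-in for a URL loader. It only remembers its delegate and notifies
// it when a (simulated) response arrives.
class FakeUrlLoader {
 public:
  explicit FakeUrlLoader(LoaderDelegate* delegate) : delegate_(delegate) {}

  void Start() { in_flight_ = true; }

  void SimulateResponse(int status_code) {
    if (!in_flight_) return;  // Nothing was requested, so nothing to report.
    in_flight_ = false;
    delegate_->OnResponseStarted(status_code);
  }

 private:
  LoaderDelegate* delegate_;
  bool in_flight_ = false;
};

// Stand-in for the request: it passes itself to the loader as the delegate.
class RequestSketch : public LoaderDelegate {
 public:
  RequestSketch() : loader_(this) {}

  void BeginNavigation() { loader_.Start(); }  // Returns immediately.
  void OnResponseStarted(int status_code) override {
    status_code_ = status_code;
  }

  FakeUrlLoader& loader() { return loader_; }
  int status_code() const { return status_code_; }

 private:
  FakeUrlLoader loader_;
  int status_code_ = 0;  // 0 means no response has been received yet.
};

// Usage example.
void DelegateExamples() {
  RequestSketch request;
  request.BeginNavigation();
  assert(request.status_code() == 0);
  request.loader().SimulateResponse(200);
  assert(request.status_code() == 200);
}
```

The request never polls the loader. It returns from BeginNavigation right away and is called back later, which is what keeps the UI thread free.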
When the response arrives (for simplicity, assume that there is no redirection),
NavigationRequest::OnResponseStarted is called and the next thing is to commit this navigation. Similarly, before committing, several navigation throttles decide whether the navigation may continue.
Finally, after tons of security checks, in
RenderFrameHostImpl::CommitNavigation is called to commit the navigation (with yet more checks). The browser then tells the corresponding renderer to commit the navigation by sending an IPC message.
Now it’s time to focus on the renderer. After receiving the request to commit the navigation, the renderer first creates a body loader.
Then it creates a document loader and commits it.
Before the document loader really starts, it tells the browser process that it has finished committing the navigation.
Finally, the document loader asks the body loader to load the body.
After that, the renderer starts to parse the HTML and build the DOM tree asynchronously. Let’s go back to the browser process.
The first message to arrive says that the renderer has committed the navigation, which invokes
DidCommitNavigation in various objects. The most important thing here is to update the origin of the navigated frame. In other words, if a navigation is cancelled before
DidCommitNavigation, the frame at least keeps its previous, still correct origin, which ensures that the same-origin policy won’t be compromised. Once
DidCommitNavigation finishes, the navigation is done. From the user’s point of view, almost nothing has changed at this point, but the browser has already committed the navigation and the next step is to load the document. After the renderer informs the frame that loading has finished, the navigation (in a broad sense) finally ends.
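The point about the origin being updated only at commit time can be illustrated with a tiny model. FrameSketch and its members are hypothetical names, and origins are plain strings here. The sketch only shows that a cancelled navigation leaves the committed origin untouched.

```cpp
#include <cassert>
#include <string>
#include <utility>

class FrameSketch {
 public:
  explicit FrameSketch(std::string origin)
      : committed_origin_(std::move(origin)) {}

  // Starting a navigation only records where the frame wants to go.
  void StartNavigation(const std::string& target_origin) {
    pending_origin_ = target_origin;
    has_pending_ = true;
  }

  // Cancelling drops the pending navigation, the committed origin stays.
  void CancelNavigation() { has_pending_ = false; }

  // Only a commit replaces the origin. Returns false if nothing is pending.
  bool DidCommitNavigation() {
    if (!has_pending_) return false;
    committed_origin_ = pending_origin_;
    has_pending_ = false;
    return true;
  }

  const std::string& origin() const { return committed_origin_; }

 private:
  std::string committed_origin_;
  std::string pending_origin_;
  bool has_pending_ = false;
};

// Usage example with made-up origins.
void CommitExamples() {
  FrameSketch frame("https://a.example");
  frame.StartNavigation("https://b.example");
  frame.CancelNavigation();
  assert(frame.origin() == "https://a.example");
}
```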
After some reworks like PlzNavigate, the navigation design in Chromium is quite clear and decent. Below is a screenshot from Chrome University 2018: Life of a Navigation, a presentation given by Nasko, which illustrates how navigation works at a very high level.