This article was originally published on Search Engine Journal in December 2019.
https://www.searchenginejournal.com/rendering-seo-introduction/330399/
Let’s start this out with a bang.
Googlebot isn’t what you think.
SEO professionals refer to Googlebot with a strange form of reverence reserved in prior generations for all-knowing deities and unseen powers.
It’s dramatic, gives flair to a story, but oversimplifies the true identity of Googlebot.
Googlebot is simply a user-agent. It is the identifier of a request – a fancy version of caller ID.
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
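Because Googlebot is only a label on the request, spotting it in a server log comes down to a string match. Here is a minimal sketch in Python. The helper name `claims_to_be_googlebot` and the sample visitor string are made up for illustration; the token it looks for is the `Googlebot/` product token visible in the strings above. Any client can send this header, so a match tells you what the request claims to be, not who actually sent it.

```python
def claims_to_be_googlebot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the Googlebot token.

    Hypothetical helper for illustration. The header is self-reported,
    so this identifies a claim, not a verified Google request.
    """
    return "googlebot/" in user_agent.lower()


desktop = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Made-up example of an ordinary browser user-agent.
visitor = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36"
)

print(claims_to_be_googlebot(desktop))  # True
print(claims_to_be_googlebot(visitor))  # False
```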
Once the request receives a response, Googlebot’s job is over and it’s off to request the next URI. The collected response runs through a series of services and processes before it appears in the SERPs.
The scrappy user-agent gets all the glory, but we need to talk about the heavy-hitter, the hidden construct that builds your site for Google to experience it as a human would: rendering.
What Is Rendering?
Rendering is the process where Googlebot retrieves your pages, runs your code, and assesses your content to understand the layout or structure of your site.
All the information Google collects during the rendering process is then used to rank the quality and value of your site content against other sites and what people are searching for with Google Search.
Every Webpage Has Two States – Rendering Occurs Between Them
Every webpage has two states:
- The initial HTML which becomes the Crawled DOM
- The rendered HTML, which becomes the Rendered DOM
A website can be very different between the two states.
The initial HTML occurs first. It is the response from the server. In it is HTML and links to resources like JavaScript, CSS, and images that are needed to build the page.
Google builds the first version of the DOM from the initial HTML. To see the initial HTML for yourself, view the page source.
Rendered HTML is more widely known as the DOM, an abbreviation of Document Object Model. Every webpage has a DOM. It represents the initial HTML plus any changes made by the JavaScript that the HTML called on.
Google builds the Rendered DOM from the rendered HTML. This version overrides the Crawled DOM. To view the DOM, open your browser’s developer tools and click the Elements tab.
If you’re looking to easily spot the difference between the two, tools like the Chrome extension View Rendered Source will highlight lines that change from one state to the other.
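If you would rather compare the two states yourself, a line-by-line diff does the same job once you have both versions saved as text. The sketch below uses Python's standard `difflib`. The two HTML strings and the helper name `changed_lines` are invented for illustration; in practice you would paste in the page source and a copy of the rendered HTML.

```python
import difflib

# Hypothetical markup: what the server sent vs. what exists after JavaScript ran.
initial_html = """<html>
<body>
<h1>Book Shop</h1>
<div id="products"></div>
</body>
</html>"""

rendered_html = """<html>
<body>
<h1>Book Shop</h1>
<div id="products">
<a href="/books/dune">Dune</a>
</div>
</body>
</html>"""


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the lines added (+) or removed (-) between two states."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    # Drop the '---' and '+++' file headers and the '@@' hunk markers.
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in changed_lines(initial_html, rendered_html):
    print(line)
```

Every line prefixed with `+` is content that only exists if the JavaScript ran successfully.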
Knowing the Difference Between HTML & DOM Is the Key to Troubleshooting JS SEO
When content changes between initial HTML and DOM, it’s client-side JavaScript changing the page. (JavaScript can be executed elsewhere, but we’ll address that later.)
These changes indicate that JavaScript is being executed in the user’s browser. When JavaScript is executed in the user’s browser, we call it Client-Side Rendering (CSR).
This puts your webpage at risk. If something goes wrong during execution, those JavaScript changes might never happen. JavaScript is a complex process and the most expensive resource on your site.
Sounds like a dev problem, right?
It isn’t.
SEO professionals have substantial skin in the game.
Google Cannot Index What It Cannot Render
In order to rank, we have to be indexed. In order to be indexed, we have to be rendered.
If content can’t be rendered, then it doesn’t contribute to how Google understands or elevates your site.
Let’s look at a site with happy, healthy JavaScript.
Everything seems in order. This appears to be an authoritative ecommerce site that knows a lot about its subject matter.
Now let’s take away the content generated by JavaScript. You can do this on any site by blocking JavaScript in Site Settings.
Oof.
All of the products that highlighted the site’s authority are gone.
It’s the difference between saying you know books and showing the audience that your site really does.
Imagine if that was content you’d worked hard to optimize and wanted to rank for.
This is one of the better case scenarios.
Even without JavaScript, we still have a basic idea of what intent this page is trying to fulfill. We know the brand name and can still find useful links to other pages on the site.
Let’s play a game.
Open a page and view the page source. Can you tell what the site is about?
If you can’t identify what a page is about and what type of search intent it matches based on the initial HTML, neither can a search engine.
The page will have to go through the rendering process to be understood.
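The game above can be roughed out in code: strip the scripts and styles from the initial HTML and see what text is left. This is a sketch using Python's standard `html.parser`. The class name, the helper `visible_text`, and both sample pages are hypothetical, and the result is only a crude proxy for what a search engine can read before rendering.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text outside script, style, noscript, and template elements."""

    SKIPPED_TAGS = {"script", "style", "noscript", "template"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the text found in the HTML, ignoring skipped elements."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical client-side app shell: everything arrives later via JavaScript.
app_shell = (
    '<html><head><title>Shop</title><script src="/app.js"></script></head>'
    '<body><div id="root"></div></body></html>'
)
# Hypothetical server-rendered page: the content is already in the response.
server_rendered = (
    "<html><head><title>Shop</title></head>"
    "<body><h1>Science Fiction Books</h1><p>Dune by Frank Herbert</p></body></html>"
)

print(visible_text(app_shell))        # Shop
print(visible_text(server_rendered))  # Shop Science Fiction Books Dune by Frank Herbert
```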
How Google Renders (A Rough Sketch)
Rendering isn’t the Hellmouth or some Lovecraftian void. It’s a Super Mario Bros. level.
As difficult as it can be, there are clear steps and checkpoints.
Here’s the process with helpful hands-on steps so you can follow along!
- A URL is pulled from the crawl queue
- Follow along: Pick a page, any page.
- Googlebot requests the URL and downloads the initial HTML
- Follow along: View page source
- The Initial HTML is passed to the processing stage (First wave of processing by Google’s indexing service)
- Follow along: Can you tell what this page is about?
- The processing stage extracts links from the initial HTML
- Follow along: Open the network tab in Chrome developer tools and look at the total number of requests. Each of these requests counts toward your crawl budget.
- These links go back on the crawl queue
- Follow along: Open up each resource. One by one. No cheating. Ask yourself how that resource contributes value. About 20 resources in, you’ll become annoyed. 50 resources in, begin wondering how these things contribute in any meaningful way. 80 resources in, begin to understand the unnecessary nonsense the site is shipping. Protip: Keep a new tab open for each and slowly watch your sanity slip away.
- Once resources are crawled, the page queues for rendering
- Follow along: Blink. You may have forgotten how to.
- When resources become available, the request moves from the render queue to the renderer
- Follow along: Try to find where your original tab went.
- The rendering service assembles the page using the crawled links
- Follow along: View the DOM by opening Developer Tools and viewing the resources. Alternatively, if you’ve lost your original tab: Use Google Search Console’s URL Inspection tool to render the page. The tool executes both crawling and rendering simultaneously.
- Renderer passes the rendered HTML back to processing
- Follow along: View the rendered HTML available in GSC.
- Second wave of processing for Google’s index
- Follow along: Can you tell what the page is about? Is the content so rich and valuable it justifies all the tabs you had to open to get here?
- Extracts links from the rendered HTML to put them into the crawl queue
- Follow along: Look for links available in the rendered HTML that weren’t available in the server response.
- Go to the next URL in your list and repeat the process.
Fantastic job! Only 130 trillion pages more and you’ll be a proper bot!
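For readers who think in code, the loop above can be modeled as two queues. What follows is a toy simulation, not Google's implementation: the `SITE` data, the function name `crawl`, and the rule that rendering only happens when the crawl queue is empty are all simplifying assumptions, and resource fetching (scripts, CSS, images) is left out entirely.

```python
from collections import deque

# Hypothetical site: links present in the server response vs. links that
# only exist after JavaScript runs.
SITE = {
    "/": {"initial_links": ["/about"], "rendered_links": ["/products"]},
    "/about": {"initial_links": [], "rendered_links": []},
    "/products": {"initial_links": [], "rendered_links": ["/products/dune"]},
    "/products/dune": {"initial_links": [], "rendered_links": []},
}


def crawl(start_url: str) -> list[tuple[str, str]]:
    """Return (stage, url) events in the order this toy pipeline produces them."""
    crawl_queue = deque([start_url])
    render_queue = deque()
    seen = {start_url}
    events = []

    def enqueue(links):
        # Newly discovered links go back on the crawl queue once.
        for link in links:
            if link not in seen:
                seen.add(link)
                crawl_queue.append(link)

    while crawl_queue or render_queue:
        if crawl_queue:
            # First wave: fetch initial HTML, extract its links, queue for rendering.
            url = crawl_queue.popleft()
            events.append(("crawled", url))
            enqueue(SITE[url]["initial_links"])
            render_queue.append(url)
        else:
            # Second wave: render, then extract links that only exist after JavaScript.
            url = render_queue.popleft()
            events.append(("rendered", url))
            enqueue(SITE[url]["rendered_links"])
    return events


for stage, url in crawl("/"):
    print(stage, url)
```

In this toy run, `/products/dune` is the last URL discovered, because it only appears after two separate renders. Links that live in the initial HTML never have to wait for the render queue.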
How to Make Rendering More Effective & Less Painful
Now that you’ve experienced the rendering process hands-on (I’m sorry, and you’re welcome), let’s talk about how to make the experience less painful.
1. Be Aware of How You Deliver Content
The more client-side resources you use, the more places there are for things to go wrong.
Imagine you really are Googlebot.
Did any of those resources give an error when requested?
Any content that resource created is lost in the couch cushion of the internet now.
Hope it wasn’t important.
2. Skip the Rendering Queue & Deliver Critical Content in the Server Response
JavaScript has to execute somewhere. For the most part, it’s either on your server or in the user’s browser.
When we execute JavaScript server-side, we’re able to ship the result (the rendered content) to the user in the initial HTML.
Many JavaScript frameworks like Angular and React have these functionalities natively available.
Getting your content rendered server-side involves working with your developers and learning about your code-base.
It’s important to understand that you don’t have to be 100% client-side or 100% server-side. Instead, focus on shipping what matters when it matters.
Critical here means the reason the user came to the page. You’ll need to define it for your site by page template.
Elements like supplemental content, site footer, and offscreen items can wait.
JSON-LD in your initial HTML is a great way to pass Googlebot a cheat sheet, but make sure you have the content the user cares about there as well.
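As a rough illustration of shipping both in the server response, the sketch below builds a page whose initial HTML already contains the visible content and a JSON-LD block. It is written in plain Python rather than any particular framework. The product data and the function name `render_product_page` are hypothetical; the `Product` and `Offer` types are from schema.org.

```python
import json
from html import escape

# Hypothetical product data; on a real site this would come from your catalog.
product = {"name": "Dune", "author": "Frank Herbert", "price": "9.99", "currency": "USD"}


def render_product_page(item: dict) -> str:
    """Build initial HTML containing both visible content and JSON-LD."""
    structured_data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": item["name"],
        "offers": {
            "@type": "Offer",
            "price": item["price"],
            "priceCurrency": item["currency"],
        },
    }
    # Escape "</" so the JSON cannot close the script element early.
    json_ld = json.dumps(structured_data).replace("</", "<\\/")
    return (
        "<!doctype html>\n"
        "<html><head>\n"
        f"<title>{escape(item['name'])}</title>\n"
        f'<script type="application/ld+json">{json_ld}</script>\n'
        "</head><body>\n"
        f"<h1>{escape(item['name'])}</h1>\n"
        f"<p>By {escape(item['author'])}</p>\n"
        "</body></html>"
    )


print(render_product_page(product))
```

The cheat sheet and the content the user cares about arrive together, so neither depends on the render queue.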
3. Ship Only the Scripts You Need
In 2019, the dominant costs of scripts are now download and CPU execution time.
Every script called has to be downloaded, parsed, compiled, and executed, regardless of whether it contributes to the content of the page.
Google Chrome has built-in functionality to help you see how much of your code is used.
How to Spot Wasteful Scripts
- Open Developer Tools.
- Click the 3 dots in the upper right corner.
- Select More tools, then Coverage.
- Reload the page.
As a goal, a healthy, effective page should be less than 1MB.
Chances are that portly, poorly performing landing page could shed some scripts. If you find excessive scripts, work with your dev team to code split.
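If you export the Coverage results, a few lines of Python can rank files by how much of them went unused. This sketch assumes the export is a JSON array of objects with `url`, `text`, and `ranges` keys, where each range has a `start` and an `end` marking code that ran. Treat that shape as an assumption and check it against your own export. The function name and sample data are hypothetical.

```python
def summarize_coverage(entries: list[dict]) -> list[tuple[str, int, int]]:
    """Return (url, used_chars, total_chars) per file, most unused first."""
    rows = []
    for entry in entries:
        total = len(entry["text"])
        # Each range marks a span of the file that actually executed.
        used = sum(r["end"] - r["start"] for r in entry["ranges"])
        rows.append((entry["url"], used, total))
    return sorted(rows, key=lambda row: row[2] - row[1], reverse=True)


# Hypothetical two-file sample in the assumed export shape.
sample = [
    {
        "url": "https://example.com/widget.js",
        "text": "y" * 400,
        "ranges": [{"start": 0, "end": 400}],
    },
    {
        "url": "https://example.com/app.js",
        "text": "x" * 1000,
        "ranges": [{"start": 0, "end": 250}],
    },
]

for url, used, total in summarize_coverage(sample):
    print(f"{url}: {total - used} of {total} characters unused")
```

With a real export you would load the file with `json.load` and pass the result in. The files at the top of the output are the first candidates to discuss with your dev team.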
4. Prioritize the Human Experience over Shiny Features
Your inbox is likely full of offers to try out new AI-powered tools with proprietary metrics that rank your site visibility in unicorns.
If you already measure performance using 15 other tracking pixels, maybe a new narwhal cube score isn’t necessary.
Third-party scripts can negatively impact performance, rendering, security, and user privacy.
Think of loading a third-party script as giving someone your house key.
5. Lazy Images & Scripts Without Blocking Rendering
A picture is worth a thousand words, right?
Here’s the thing. 1,000 words is about 2 KB.
According to HTTPArchive, images are the most requested asset type and average about 900 KB per page.
Lazy loading is natively supported as of Chrome 76. Simply add the attribute loading="lazy" to your image tags to deliver maximum value with the smallest dev ticket possible.
Similarly, you can also load scripts asynchronously by adding a simple attribute: <script src="myscript.js" async></script>
Alternatively, you can defer the script, essentially telling it to wait until the HTML has been parsed: <script src="myscript.js" defer></script>
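To find candidates for these attributes, you can scan a page's HTML for external scripts that have neither `async` nor `defer` and for images without `loading="lazy"`. The sketch below uses Python's standard `html.parser`. The class name and sample markup are hypothetical, and the check is deliberately naive: the output is a list to review with your developers, not a list of errors.

```python
from html.parser import HTMLParser


class BlockingResourceAudit(HTMLParser):
    """Flag external scripts without async/defer and images without loading="lazy"."""

    def __init__(self) -> None:
        super().__init__()
        self.blocking_scripts: list[str] = []
        self.eager_images: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # boolean attributes such as async map to None
        if tag == "script" and "src" in attributes:
            if "async" not in attributes and "defer" not in attributes:
                self.blocking_scripts.append(attributes["src"])
        elif tag == "img" and attributes.get("loading") != "lazy":
            self.eager_images.append(attributes.get("src", ""))


# Hypothetical page fragment.
page = """
<script src="/analytics.js"></script>
<script src="/app.js" defer></script>
<script src="/chat.js" async></script>
<img src="/hero.jpg">
<img src="/footer-badge.png" loading="lazy">
"""

audit = BlockingResourceAudit()
audit.feed(page)
print(audit.blocking_scripts)  # ['/analytics.js']
print(audit.eager_images)      # ['/hero.jpg']
```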
6. Keep Script Bundles Small
If your script is larger than 50–100 kB, split it up into separate smaller bundles.
Multiple smaller bundles are more effective than a single large script package.
If your site uses HTTP/2 multiplexing, multiple requests and responses can be in flight at the same time.
7. Cache, Cache, Cache
Remember the follow-along exercise above? Imagine having to go back for a reusable JS resource each time. That extra step can be easily avoided by caching resources as long as possible.
If you break out your JS into smaller bundles dedicated to a specific function, they’ll be easier to cache for long periods of time.
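One common way to make long cache lifetimes safe, offered here as an assumption rather than something this article prescribes, is to put a hash of the file's content in its name. A changed bundle gets a new URL, so the old one can be cached for as long as you like. The helper name below is hypothetical, and the header value is one typical choice for such files.

```python
import hashlib


def hashed_filename(name: str, content: bytes) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, dot, extension = name.rpartition(".")
    return f"{stem}.{digest}.{extension}" if dot else f"{name}.{digest}"


# Long-lived caching is safe here because a changed file gets a new URL.
LONG_CACHE_HEADER = "Cache-Control: public, max-age=31536000, immutable"

version_one = hashed_filename("app.js", b"console.log('v1');")
version_two = hashed_filename("app.js", b"console.log('v2');")

print(version_one)
print(version_two)
print(LONG_CACHE_HEADER)
```

Smaller, single-purpose bundles change less often, so their hashed names and cached copies stay valid longer.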
Read up on Google’s Web Fundamentals then sit down with your engineers to get insight into how and what you currently cache.
8. Performance & Rendering Are Directly Related
Google uses Chromium to render for a number of reasons. One of which is that it can capture critical timings – everything from Time to First Byte (TTFB) to Time to Interactive (TTI).
The data it captures in loading your page helps to inform everything from how mobile-friendly your design is to how fast it loads. Both of these are factors for ranking.
The more efficient and performant your resources are, the more effectively the page can be rendered.
Lighthouse is a free testing tool that can help you identify performance bottlenecks.
If you’re looking to dive deeper into Lighthouse’s performance metrics, this guide breaks down the metric and its components.
9. Remember That No Piece of Technology Is Inherently Good or Bad
JavaScript is a tool and has an effective application creating rich interactive and personalized experiences. A hammer is also a tool.
Hammers are great for hanging pictures and working with nails, but that doesn’t make a hammer ideal for at-home pedicures.
Know the difference between your nails. Don’t blame the tool.
What’s the Best Way to Render? It Depends…
It depends on the technologies you use. It depends on what your business goals are.
“It depends” isn’t a write-off answer for a question Google doesn’t want to answer.
Technology is nuanced. Rendering is one of many processes that happen in the spaces in between.
The behaviors we don’t see can have a larger impact on your site than all the on-page optimization you can keyword stuff.
This is your call to action. Be fiercely curious. Ask questions.
Be in uncomfortable meetings with teams you don’t understand. Ask dumb questions.
Look like a fool in front of subject matter experts. It shows you’re willing to learn.