
How I Got a Web-Components-Based Dynamic JavaScript Page to the Top of Google Search

SEO

Introduction


Everybody wants their website to appear in the first position on Google Search, and SEO (Search Engine Optimization) is a big topic. I recently helped my client's website get its database records to the top of the search rankings (at least for several generic Chinese keywords). The three example questions below are all listed at the top:
screenshot

Website background: My client's website, popa.qa, is a Traditional Chinese Q&A site that lets members ask and answer questions. All questions and answers are stored in a database server.

Step 1: Create The Project

This blog illustrates the problem and the steps to fix it, with the project source code. Below is a description of the basic project:
  • NodeJS backend (server.js)
    • An API (/get-database-records) that simulates getting database records
  • Web-components frontend (index.html)
    • An example component, IndexPage, that uses LitElement to render the database records
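
As a rough illustration of the backend part, the simulated API could look like the sketch below. The record fields (id, question, answer) and the handler name are made up for this example, and the real server.js in the project source may differ. The handler only calls res.setHeader() and res.end(), so it should fit both an Express route and Node's built-in http server.

```javascript
// Hypothetical in-memory records standing in for the real database.
const RECORDS = [
  { id: 1, question: 'Sample question 1', answer: 'Sample answer 1' },
  { id: 2, question: 'Sample question 2', answer: 'Sample answer 2' },
];

// Request handler that returns the simulated records as JSON.
function getDatabaseRecords(req, res) {
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify(RECORDS));
}

// In server.js, assuming an Express app object:
// app.get('/get-database-records', getDatabaseRecords);
```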

To start the server, type: npm start

Then I checked the page speed using Lighthouse, which is built into the Chrome browser:
screenshot

A performance score of 84 is not bad. I believe this is because the web components use shadow DOM, which keeps rendering fast. However, the problem is empty content. To illustrate this, right-click the webpage and select "View page source". OMG, there is nothing in the HTML content:
screenshot

This problem affects all client-side rendering frameworks, including React, Angular, and even jQuery, which is why this blog is titled "Dynamic JavaScript page". It hurts SEO significantly, because a crawler that reads only the raw HTML sees an empty page. To solve this problem, I chose the Headless Chrome solution.
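
The empty-content problem can also be checked in code rather than by eye. Here is a minimal sketch, assuming a rough regex-based approach is good enough for a quick check (it is not a real HTML parser); the function name visibleText is only for illustration. It drops script and style blocks, removes the remaining tags, and returns whatever text is left.

```javascript
// Returns the text a reader would find in the raw HTML source,
// ignoring scripts, styles and markup. Regex-based, so only a rough check.
function visibleText(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, ' ')
    .replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}
```

For a client-side rendered page the result is an empty string, while a pre-rendered page returns the question and answer text.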

Step 2: Using Headless Chrome to generate a static page

Luckily, there is Puppeteer, a Headless Chrome library for Node applications. It's easy to install; type: npm install --save puppeteer

Add puppeteer to server.js (at the top of the file):
...
const puppeteer = require('puppeteer');
...

To make use of puppeteer, I created an API to trigger it:
... above the app.listen() in server.js
// Call this API to generate the static page.
// Assumes `const fs = require('fs');` is also at the top of server.js.
app.get('/gen-static-pages', async (req, res) => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // networkidle2: wait until there are no more than 2 network connections for at least 500 ms
  await page.goto(`${req.protocol}://${req.get('host')}/index.html`, {waitUntil: 'networkidle2'});
  const html = await page.content(); // Headless Chrome returns the HTML contents
  fs.writeFileSync('public/index-static.html', html); // Save under public/ so it is served as /index-static.html
  await browser.close();

  res.end('Static files generated');
});
...
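
The same steps can be pulled out of the route into a small helper, so that the browser is closed even when the page fails to load. This is a minimal sketch and not part of the original project: the name prerender and the launch parameter are assumptions made for this example. Passing the launcher in as a function keeps the helper free of a hard dependency, so it can be tried with a stand-in object.

```javascript
// Renders one URL in a headless browser and returns the resulting HTML.
// `launch` is any async function that resolves to a Puppeteer-like browser,
// for example: () => puppeteer.launch()
async function prerender(launch, url) {
  const browser = await launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });
    return await page.content();
  } finally {
    // Runs on success and on failure, so no browser process is left behind.
    await browser.close();
  }
}
```

Inside the route it would be called as: const html = await prerender(() => puppeteer.launch(), url);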

Start the server again: npm start

Open the browser and call the API: http://localhost:3000/gen-static-pages. The static file index-static.html should then be generated, and the success message is shown:
screenshot

Open the public/index-static.html file and look at the HTML inside:
screenshot
Oh, what!? Why is it still empty!? After checking the LitElement documentation, I figured out that a trick is needed.

Step 3: Disabling shadow DOM under certain conditions

The empty content problem is caused by the shadow DOM: page.content() serializes the regular DOM, and whatever is rendered inside a shadow root is not included. But I don't want to disable it all the time, because shadow DOM benefits component-based programming. So I added a condition, a URL query parameter (crawler=1), that tells the frontend to disable it. Add the function below to public/index.js:
// constructor()

createRenderRoot() {
  const urlParams = new URLSearchParams(window.location.search);
  const isCrawler = urlParams.get('crawler') === '1';
  if (isCrawler) {
    // Render into the element itself (light DOM) so the content ends up in the page HTML
    return this;
  } else {
    return super.createRenderRoot(); // Same as: return this.attachShadow({mo$
  }
}
// connectedCallback()
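
The query-string check can also live in a small pure function, which makes it easy to try outside the browser. A minimal sketch; the function name isCrawlerMode is made up for this example and is not part of LitElement.

```javascript
// Returns true only when the query string carries crawler=1.
// URLSearchParams ignores a leading "?", so window.location.search can be passed as-is.
function isCrawlerMode(search) {
  return new URLSearchParams(search).get('crawler') === '1';
}

// Inside the component it would be used like this:
// createRenderRoot() {
//   return isCrawlerMode(window.location.search) ? this : super.createRenderRoot();
// }
```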

Then add this parameter to the URL when calling puppeteer:
app.get('/gen-static-pages', async (req, res) => {
  // ...
  await page.goto(`${req.protocol}://${req.get('host')}/index.html?crawler=1`, {waitUntil: 'networkidle2'});
  // ...
});


Then re-run the API to generate the static page and check the content:
  • Restart the server: npm start
  • To trigger the API, browse: http://localhost:3000/gen-static-pages
  • Browse the resulting static page: http://localhost:3000/index-static.html
Then use "View page source" to see the HTML:
screenshot

It works! Now the problem of the empty contents is fixed.

Further Optimization

It's not over yet. The steps above alone are not enough to reach the top search ranking. For my client popa.qa, I implemented additional optimizations on the static file generation:
  1. Changed all external stylesheets to embedded stylesheets.
  2. Removed all web font downloads (the Google crawler does not care about font styles).
  3. Removed all Google Analytics and Facebook Pixel libraries (because the Google crawler is just a bot).
  4. Removed all YouTube iframe tags.
  5. Refactored the backend server app from NodeJS code to Cloudflare Workers for the highest performance and lowest network latency.
  6. Set up a separate cloud server to run Puppeteer periodically via a cron job. The job pre-renders all dynamic pages to static pages and uploads them all to the Cloudflare KV store.
  7. Inside the Cloudflare Workers server app, user-agent detection distinguishes whether the visitor is a human or a bot. When it is a human, the request goes through the conventional logic. When it is a bot, the app just returns the static page from the KV store.
  8. Implemented Structured Data Markup for QAPage.
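
Items 2 to 4 can be approximated by post-processing the HTML that Puppeteer returns before it is saved. The sketch below is an illustration only, not the code used for popa.qa. It assumes web fonts come from fonts.googleapis.com, that the analytics libraries are loaded through external script tags with double-quoted src attributes, and that regex matching is acceptable for machine-generated HTML. Inline snippets that call those libraries are not removed.

```javascript
// Removes tags that a crawler does not need from pre-rendered HTML.
function stripForCrawler(html) {
  return html
    // Item 2: web font stylesheets (assumed to be hosted on Google Fonts).
    .replace(/<link\b[^>]*fonts\.googleapis\.com[^>]*>/gi, '')
    // Item 3: external Google Analytics and Facebook Pixel scripts.
    .replace(
      /<script\b[^>]*src="[^"]*(googletagmanager\.com|google-analytics\.com|connect\.facebook\.net)[^"]*"[^>]*>\s*<\/script>/gi,
      ''
    )
    // Item 4: embedded YouTube players.
    .replace(/<iframe\b[^>]*youtube(-nocookie)?\.com[^>]*>[\s\S]*?<\/iframe>/gi, '');
}
```

It would be applied to the string returned by page.content() before the file is written.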
It took a lot of effort to make this happen, but the return is excellent performance and a top search ranking:
lighthouse-mark
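
The user-agent check from item 7 can be as small as one regular expression. This is a minimal sketch, assuming a hand-picked list of crawler names; the function name and the list are illustrative and are not the code running on popa.qa.

```javascript
// Substrings that commonly appear in search crawler user-agent strings.
// The list is an assumption for this sketch; extend it as needed.
const BOT_PATTERN = /googlebot|bingbot|duckduckbot|baiduspider|yandexbot/i;

// Returns true when the user-agent string looks like a search crawler.
function isSearchBot(userAgent) {
  return typeof userAgent === 'string' && BOT_PATTERN.test(userAgent);
}

// In a Cloudflare Worker the value would come from the request headers:
// const bot = isSearchBot(request.headers.get('User-Agent'));
```

A bot then gets the static page from the KV store, and everyone else goes through the normal logic.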

Don't believe it? Please try it yourself. Go to Google PageSpeed Insights and enter one of the Popa page URLs to see the result:


Please leave a comment if you have any questions, or drop a request for the additional source code for the Cloudflare Workers.

Enjoy! 😃
