Exporting a web page as PDF or PNG from Node.js

pdf or png from nodejs.png

This blog post was first published in thedreaming.org blog.


I’m working on a reporting engine right now, and one of the requirements I have is to be able to export reports as PDF files, and send PDF reports via email. We also have a management page where you can see all the dashboards, and we’d like to render a PNG thumbnail of the dashboard to show here. Our back-end is all Node.js based, and you’re probably aware that Node is not known for it’s graphics prowess. All of our reports are currently being generated by a React app, written with Gatsby.js, so what we’d like to do is load up a report in a headless browser of some sort, and then either “print” that to PDF or take a PNG image. We’re going to look at doing exactly this using Playwright and Browserless.

The basic idea here is we want an API endpoint we can call into as an authenticated user, and get it to render a PDF of a web page for us. To do this we’re going to use two tools. The first is playwright, which is a library from Microsoft for controlling a headless Chrome or Firefox browser. The second is browserless, a microservice which hosts one or more headless Chrome browsers for us.

If you have a very simple use case, you might actually be able to do this entirely with browserless with their APIs. There’s a /pdf REST API which would do pretty much what we want here. We can pass in cookies and custom headers to authenticate our user to the reporting backend, we can set the width of the viewport to something reasonable (since of course our analytics page is responsive, and we want the desktop version in our PDF reports, not the mobile version).

But, playwright offers us two nice features that make it desirable here. First it has a much more expressive API. It’s easy, for example, to load a page, wait for a certain selector to be present (or not be present) to signal that the page is loaded, and then generate a PDF or screenshot. Second, when you npm install playwright, it will download a copy of Chrome and Firefox automatically, so when we’re running this locally in development mode we don’t even need the browserless container running.

Setting up and configuring playwright

In our backend analytics project, we’re going to start out by installing playwright:

$ npm install playwright

You’ll notice the first time you do this, it downloads a few hundred megs of browser binaries and stores them (on my mac) in ~/Library/Caches/ms-playwright. Later on, when we deploy our reporting microservice in a docker container, this is something we do not want to happen, because we don’t need a few hundred megs of browser binariers - that’s what browserless is for. So, in our backend project’s Dockerfile, we’ll modify our npm ci command:

RUN PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 npm ci && \
    rm -rf ~/.npm ~/.cache

If you’re not using docker, then however you run npm install in your backend, you want to add this PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD environment variable to prevent all the browser binaries from being installed.

We do want playwright to use these browser binaries when it runs locally, but in production we want to use browserless. To control this, we’re going to set up an environment variable that controls playwright’s configuration:

// Set BROWSERLESS_URL environment variable to connect to browserless.
// e.g. BROWSERLESS_URL=wss://chrome.browserless.io?token=YOUR-API-TOKEN
const BROWSERLESS_URL = process.env.BROWSERLESS_URL;

let browser: pw.Browser | undefined;

export async function getBrowser() {
  if (!browser) {
    if (BROWSERLESS_URL) {
      // Connect to the browserless server.
      browser = await pw.chromium.connect({
        wsEndpoint: BROWSERLESS_URL,
      });
    } else {
      // In dev mode, just run a headless copy of Chrome locally/
      browser = await pw.chromium.launch({
        headless: false,
      });
    }

    // If the browser disconnects, create a new browser next time.
    browser.on("disconnected", () => {
      browser = undefined;
    });
  }

  return browser;
}

This creates a shared “browser” instance which we’ll use across multiple clients. Each client will create a “context”, which is sort of like an incognito window for that client. Contexts are cheap and lots can exist in parallel, whereas browser instances are expensive, so this lets us use a single instance of Chrome to serve many users.

We’re also going to write a little helper function here to make it easier to make sure we always release a browser context when we’re done with it:

async function withBrowserContext<T = void>(
  fn: (browserContext: pw.BrowserContext) => Promise<T>
) {
  const browser = await getBrowser();
  const browserContext = await browser.newContext();

  try {
    return await fn(browserContext);
  } finally {
    // Close the borwser context when we're done with
    // it. Otherwise browserify will keep it open forever.
    await browserContext.close();
  }
}

The idea here is you call this function, and pass in a function fn that accepts a browser context as a parameter. This lets us write code like:

await withBrowserContext((context) => {
  // Do stuff here!
});

And then we don’t have to worry about cleaning up the context if we throw an expection or otherwise exit our funciton in an “unexpected” way.

Generating the PDF or PNG

Now we get to the meaty part. We want an express endpoint which actually generates the PDF or the PNG. Here’s the code, with lots of comments. Obivously this is written with my “reporting” example in mind, so you’ll need to make a few small changes if you want to give this a try; notably you’ll probably want to change the express endpoint, and change the dashboardUrl to point to whatever page you want to make a PDF or PNG of.

// We set up an enviornment variable called REPORTING_BASE_URL
// which will point to our front end on localhost when running
// this locally, or to the real frontend in production.
const REPORTING_BASE_URL =
  process.env.REPORTING_BASE_URL || "http://localhost:8000";

// Helper function for getting numbers out of query parameters.
function toNumber(val: string | undefined, def: number) {
  const result = val ? parseInt(val, 10) : def;
  return isNaN(result) ? def : result;
}

// Our express endpoint
app.post("/api/reporting/export/:dashboardId", (req, res, next) => {
  Promise.resolve()
    .then(async () => {
      const type = -req.query.type || "pdf";
      const width = toNumber(req.query.width, 1200);
      const height = toNumber(req.query.height, 800);

      // Figure out what account and dashboard we want to show in the PDF.
      // Probably want to do some validation on these in a real system, although
      // since we're passing them as query parameters to another URL, there's
      // not much chance of an injection attack here...
      const qs = new URLSearchParams({ dashboardId: req.params.dashboardId });

      // If you're not building a reporting engine, change this
      // URL to whatever you want to make into a PDF/PNG. :)
      const dashboardUrl = `${REPORTING_BASE_URL}/dashboards/dashboard?${qs.toString()}`;

      await withBrowserContext(async (browserContext) => {
        // Set up authentication in the new browser context.
        await setupAuth(context.req, browserContext);

        // Create a new tab in Chrome.
        const page = await browserContext.newPage();
        await page.setViewportSize({ width, height });

        // Load our dashboard page.
        await page.goto(dashboardUrl);

        // Since this is a client-side app, we need to
        // wait for our dashboard to actually be visible.
        await page.waitForSelector("css=.dashboard-page");

        // And then wait until all the loading spinners are done spinning.
        await page.waitForFunction(
          '!document.querySelector(".loading-spinner")'
        );

        if (type === "pdf") {
          // Generate the PDF
          const pdfBuffer = await page.pdf();

          // Now that we have the PDF, we could email it to a user, or do whatever
          // we want with it. Here we'll just return it back to the client,
          // so the client can save it.
          res.setHeader(
            "content-disposition",
            'attachment; filename="report.pdf"'
          );
          res.setHeader("content-type", "application/pdf");
          res.send(pdfBuffer);
        } else {
          // Generate a PNG
          const image = await page.screenshot();
          res.setHeader("content-type", "image/png");
          res.send(image);
        }
      });
    })
    .catch((err) => next(err));
});

As the comments say, this creates a new “browser context” and uses the context to visit our page and take a snapshot.

Authentication

There’s one bit here I kind of glossed over:

// Set up authentication in the new browser context.
await setupAuth(context.req, browserContext);

Remember that browserless is really just a headless Chrome browser. The problem here is we need to authenticate our user twice; the user needs to authenticate to this backend express endpoint, but then this is going to launch a Chrome browser and navigate to a webpage, so we need to authenticate the user in that new browser session, too (since obviously not just any user off the Internet is allowed to look at our super secret reporting dashboards). We could visit a login page and log the user in, but it’s a lot faster to just set the headers and/or cookies we need to access the page.

What exactly this setupAuth() function does is going to depend on your use case. Maybe you’ll call browserContext.setExtraHTTPHeaders() to set a JWT token. In my case, my client is authenticating with a session cookie. Unfortunately we can’t use setExtraHTTPHeaders() to just set the session cookie, because Playwright generates it’s own cookie header and we can’t override it. So instead, I’ll parse the cookie that was sent to us and then forward it on to the browser context: via browserContext.addCookies().

import * as cookieParse from "cookie-parse";
import { URL } from "url";

// Playwright doesn't let us just set the "cookie"
// header directly. We need to actually create a
// cookie in the browser. In order to make sure
// the browser sends this cookie, we need to set
// the domain of the cookie to be the same as the
// domain of the page we are loading.
const REPORTING_DOMAIN = new URL(REPORTING_BASE_URL).hostname;

async function setupAuth(
  req: HttpIncomingMessage,
  browserContext: pw.BrowserContext
) {
  // Pass through the session cookie from the client.
  const cookie = req.headers.cookie;
  const cookies = cookie ? cookieParse.parse(cookie) : undefined;

  if (cookies && cookies["session"]) {
    console.log(`Adding session cookie: ${cookies["session"]}`);
    await browserContext.addCookies([
      {
        name: "session",
        value: cookies["session"],
        httpOnly: true,
        domain: REPORTING_DOMAIN,
        path: "/",
      },
    ]);
  }
}

We actually don’t have to worry about the case where the session cookie isn’t set - the client will just get a screenshot of whatever they would have gotten if they weren’t logged in!

If we log in to our UI, and then visit http://localhost:8000/api/reporting/export/1234, we should now see a screenshot of our beautiful reporting dashboard.

Cleaning up site headers

If you actually give this a try, you might notice that your PDF has all your menus and navigation in it. Fortunately, this is easy to fix with a little CSS. Playwright/browserless will render the PDF with the CSS media set to “print”, exactly as if you were trying to print this page from a browser. So all we need to do is set up some CSS to hide our site’s navigation for that media type:

@media print {
  .menu,
  .sidebar {
    display: none;
  }
}

An alternative here might be to pass in an extra query parameter like bare=true, although the CSS version is nice because then if someone tries to actually print our page from their browser, it’ll work.

Setting up Browserless

So far we’ve been letting Playwright launch a Chrome instance for us. This works fine when we’re debugging locally, but in production, we’re going to want to run that Chrome instance as a microservice in another container, and for that we’re going to use browserless. How you deploy browserless in your production system is outside the scope of this article, just because there’s so many different ways you could do it, but let’s use docker-compose to run a local copy of browserless, and get our example above to connect to it.

First, our docker-compose.yml file:

version: "3.6"
services:
  browserless:
    image: browserless/chrome:latest
    ports:
      - "3100:3000"
    environment:
      - MAX_CONCURRENT_SESSIONS=10
      - TOKEN=2cbc5771-38f2-4dcf-8774-50ad51a971b8

Now we can run docker-compose up -d to run a copy of browserless, and it will be available on port 3100, and set up a token so not just anyone can access it.

Now here I’m going to run into a little problem. The UI I want to take PDFs and screenshots from is a Gatsby app running on localhost:8000 on my Mac. The problem is, this port is not going to be visible from inside my docker VM. There’s a variety of ways around this problem, but here we’ll use ngrok to set up a publically accessible tunnel to localhost, which will be visible inside the docker container:

$ ngrok http 8000
Forwarding                    http://c3b3d3d435f1.ngrok.io -> localhost:8000
Forwarding                    https://c3b3d3d435f1.ngrok.io -> localhost:8000

Now we can run our reporting engine with:

$ BROWSERLESS_URL=ws://localhost:3100?token=2cbc5771-38f2-4dcf-8774-50ad51a971b8 \
    REPORTING_BASE_URL=https://c3b3d3d435f1.ngrok.io
    node ./dist/bin/server.js

And now if we try to hit up our endpoint, we should once again see a screenshot of our app!

Hopefully this was helpful! If you try this out, or have a similar (or better!) solution, please let us know!

jason.jpg
Jason Walton

Software designer from Ottawa, Ontario