Your latest Node.js content, news and updates in one place.

11 Simple npm Tricks That Will Knock Your Wombat Socks Off

This blog post was first published in March 2017. Find out more here

Using npm effectively can be difficult. There are a ton of features built-in, and it can be a daunting task to try to approach learning them.

Personally, even learning and using just one of these tricks (npm prune, which is #4) saved me from getting rid of unused modules manually by deleting node_modules and re-installing everything with npm install. As you can probably imagine, that was insanely stressful.

We've compiled this list of 11 simple-to-use npm tricks that will allow you to speed up development using npm, no matter what project you're working on.

1. Open a package’s homepage

Run: npm home $package

Running the home command will open the homepage of the package you're running it against. Running against the lodash package will bring you to the Lodash website. This command can run without needing to have the package installed globally on your machine or within the current project.

2. Open package’s GitHub repo

Run: npm repo $package

Similar to home, the repo command will open the GitHub repository of the package you're running it against. Running against the express package will bring you to the official Express repo. Also like home, you don’t need to have the package installed.

3. Check a package for outdated dependencies

Run: npm outdated

You can run the outdated command within a project, and it will check the npm registry to see if any of your packages are outdated. It will print out a list in your command line of the current version, the wanted version, and the latest version.

Running npm outdated on a Node project

4. Check for packages not declared in package.json

Run: npm prune

When you run prune, the npm CLI will run through your package.json and compare it to your project’s /node_modules directory. It will print a list of modules that aren’t in your package.json.

The npm prune command then strips out those packages, removing any that you haven't manually added to package.json or that were installed without the --save flag.

Running npm prune on a Node project

Update: Thanks to @EvanHahn for noticing a personal config setting that made npm prune provide a slightly different result than the default npm would provide!

5. Lock down your dependencies versions

Run: npm shrinkwrap

Using shrinkwrap in your project generates an npm-shrinkwrap.json file. This allows you to pin the dependencies of your project to the specific versions you’re currently using within your node_modules directory. When you run npm install and an npm-shrinkwrap.json file is present, it will override the listed dependencies and any semver ranges in package.json.

If you need verified consistency across package.json, npm-shrinkwrap.json and node_modules for your project, you should consider using npm-shrinkwrap.

Running npm shrinkwrap on a Node project

6. Use npm v3 with Node.js v4 LTS

Run: npm install -g npm@3

Installing npm@3 globally with npm will update your npm v2 to npm v3, even on the Node.js v4 LTS release line (“Argon”), which ships with the npm v2 LTS release. This will install the latest stable release of npm v3 within your v4 LTS runtime.

7. Allow npm install -g without needing sudo

Run: npm config set prefix $dir

After running the command, where $dir is the directory you want npm to install your global modules to, you won’t need to use sudo to install modules globally anymore. The directory you use in the command becomes your global bin directory.

The only caveat: you will need to make sure you adjust your user permissions for that directory with chown -R $USER $dir and you add $dir/bin to your PATH.
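Putting the three steps together, a minimal sketch assuming ~/.npm-global as the chosen directory:

mkdir -p ~/.npm-global
npm config set prefix ~/.npm-global
chown -R $USER ~/.npm-global
export PATH=~/.npm-global/bin:$PATH   # also add this line to your shell profile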

8. Change the default save prefix for all your projects

Run: npm config set save-prefix="~"

The tilde (~) is more conservative than what npm defaults to, the caret (^), when installing a new package with the --save or --save-dev flags. The tilde pins the dependency to the minor version, allowing patch releases to be installed with npm update. The caret pins the dependency to the major version, allowing minor releases to be installed with npm update.
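For example (the lodash version shown here is hypothetical):

npm config set save-prefix="~"
npm install --save lodash
# package.json now records "lodash": "~4.17.21" instead of "^4.17.21"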

9. Strip your project's devDependencies for a production environment

When your project is ready for production, make sure you install your packages with the added --production flag. The --production flag installs your dependencies, ignoring your devDependencies. This ensures that your development tooling and packages won’t go into the production environment.

Additionally, you can set your NODE_ENV environment variable to production to ensure that your project’s devDependencies are never installed.
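Both approaches look like this:

npm install --production
# npm also skips devDependencies when NODE_ENV is set to production
NODE_ENV=production npm install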

10. Be careful when using .npmignore

If you haven't been using .npmignore, npm defaults to .gitignore, with a few additional sane defaults.

What many don't realize is that once you add a .npmignore file to your project, the .gitignore rules are (ironically) ignored. The result is that you will need to keep the two ignore files in sync to prevent leaking sensitive files when publishing.

11. Automate npm init with defaults

When you run npm init in a new project, you’re able to go through and set up your package.json’s details. If you want to set defaults that npm init will always use, you can use the config set command, with some extra arguments:

npm config set init.author.name $name
npm config set init.author.email $email

If, instead, you want to completely customize your init script, you can point to a self-made default init script by running

npm config set init-module ~/.npm-init.js

Here’s a sample script that prompts for private settings and creates a GitHub repo if you want. Note that the prompt helper and variables like basename and package are provided in scope by npm when it runs your init module. Make sure you change the default GitHub username (YOUR_GITHUB_USERNAME) used as the fallback for the GitHub username environment variable.

var cp = require('child_process');
var priv;

var USER = process.env.GITHUB_USERNAME || 'YOUR_GITHUB_USERNAME';

module.exports = {

  name: prompt('name', basename || package.name),

  version: '0.0.1',

  private: prompt('private', 'true', function(val){
    return priv = (typeof val === 'boolean') ? val : !!val.match('true')
  }),

  create: prompt('create github repo', 'yes', function(val){
    val = val.indexOf('y') !== -1 ? true : false;

    if(val){
      console.log('enter github password:');
      cp.execSync("curl -u '"+USER+"' https://api.github.com/user/repos -d " +
        "'{\"name\": \""+basename+"\", \"private\": "+ ((priv) ? 'true' : 'false')  +"}' ");
      cp.execSync('git remote add origin '+ 'https://github.com/'+USER+'/' + basename + '.git');
    }

    return undefined;
  }),

  main: prompt('entry point', 'index.js'),

  repository: {
    type: 'git',
    url: 'git://github.com/'+USER+'/' + basename + '.git' },

  bugs: { url: 'https://github.com/'+USER+'/' + basename + '/issues' },

  homepage: "https://github.com/"+USER+"/" + basename,

  keywords: prompt(function (s) { return s.split(/\s+/) }),

  license: 'MIT',

  cleanup: function(cb){

    cb(null, undefined)
  }

}

One last thing...

If you want to learn more about npm, Node.js, JavaScript, Docker, Kubernetes, Electron, and tons more, you should follow @NodeSource on Twitter. We're always around and would love to hear from you!

Containerizing Node.js Applications with Docker

Application containers have emerged as a powerful tool in modern software development. Lighter and more resource efficient than traditional virtual machines, containers offer IT organizations new opportunities in version control, deployment, scaling, and security.

This post will address what exactly containers are, why they are proving to be so advantageous, how people are using them, and best practices for containerizing your Node.js applications with Docker.

What’s a Container?

Put simply, containers are running instances of container images. Images are layered alternatives to virtual machine disks that allow applications to be abstracted from the environment in which they are actually being run. Container images are executable, isolated software with access to the host's resources, network, and filesystem. These images are created with their own system tools, libraries, code, runtime, and associated dependencies hardcoded. This allows for containers to be spun up irrespective of the surrounding environment. This everything-it-needs approach helps silo application concerns, providing improved systems security and a tighter scope for debugging.

Unlike traditional virtual machines, container images give each of their instances shared access to the host operating system through a container runtime. This shared access to the host OS resources enables performance and resource efficiencies not found in other virtualization methods.

Imagine a container image that requires 500 MB. In a containerized environment, this 500 MB can be shared between hundreds of containers, assuming they are all running the same base image. VMs, on the other hand, would each need their own 500 MB. This makes containers much more suitable for horizontal scaling and resource-restricted environments.

Why Application Containers?

The lightweight and reproducible nature of containers has made them an increasingly favored option for organizations looking to develop software applications that are scalable, highly available, and version controlled.

Containers offer several key advantages to developers:

  • Lightweight and Resource Efficient. Compared to VMs, which generate copies of their host operating system for each application or process, containers have significantly less of an impact on memory, CPU usage, and disk space.
  • Immutable. Containers are generated from a single source of truth, an image. If changes are committed to an image, a new image is made. This makes container image changes easy to track, and deployment rollbacks intuitive. The reproducibility and stability of containers helps development teams avoid configuration drift, making things like version testing and mirroring development and production environments much simpler.
  • Portable. The isolated and self-reliant nature of containers makes them a great fit for applications that need to operate across a host of services, platforms, and environments. They can run on Linux, Windows, and macOS, in the cloud, on premises, or wherever your infrastructure dictates.
  • Scalable and Highly Available. Containers are easily reproducible and can be made to dynamically respond to traffic demands, with orchestration services such as Azure Container Instances, Google Cloud Engine, and Amazon ECS making it simpler than ever to generate or remove containers from your infrastructure.

Application Container Use Cases

Not all applications and organizations are going to have the same infrastructure requirements. The aforementioned benefits of containers make them particularly adept at addressing the following needs:

DevOps Organizations

For teams working to practice ‘infrastructure as code’ and seeking to embrace the DevOps paradigm, containers offer unparalleled opportunities. Their portability, resistance to configuration drift, and quick boot time make containers an excellent tool for quickly and reproducibly testing different code environments, regardless of machine or location.

Microservice and Distributed Architectures

A common phrase in microservice development is “do one thing and do it well,” and this aligns tightly with application containers. Containers offer a great way to wrap microservices and isolate them from the wider application environment. This is very useful when wanting to update specific (micro-)services of an application suite without updating the whole application.

A/B testing

Containers make it easy to roll out multiple versions of the same application. When coupled with incremental rollouts, containers can keep your application in a dynamic, responsive state to testing. Want to test a new performance feature? Spin up a new container, add some updates, route 1% of traffic to it, and collect user and performance feedback. As the changes stabilize and your team decides to apply it to the application at large, containers can make this transition smooth and efficient.

Containers and Node.js

Because of application containers' suitability for focused application environments, Node.js is arguably the best runtime for containerization.

  • Explicit Dependencies. Containerized Node.js applications can lock down dependency trees, and maintain stable package.json, package-lock.json, or npm-shrinkwrap.json files.
  • Fast Boot and Restart. Containers are lightweight and boot quickly, making them a strategic pair for Node.js applications. One of the most lauded features of Node.js is its impressive startup time. This robust boot performance gets terminated processes restarted quickly and applications stabilized; containerization provides a scalable solution to maintaining this performance.
  • Scaling at the Process Level. Similar to the Node.js best practice of spinning up more processes instead of more threads, a containerized environment will scale up the number of processes by increasing the number of containers. This horizontal scaling creates redundancy and helps keep applications highly available, without the significant resource cost of a new VM per process.

Dockerizing Your Node.js Application

Docker Overview

Docker is a layered filesystem for shipping images, and allows organizations to abstract their applications away from their infrastructure.

With Docker, images are generated via a Dockerfile. This file provides configurations and commands for programmatically generating images.

Each Docker command in a Dockerfile adds a ‘layer’. The more layers, the larger the resulting container.

Here is a simple Dockerfile example:

1    FROM node:8
2
3    WORKDIR /home/nodejs/app
4
5    COPY . .
6    RUN npm install --production
7
8    CMD ["node", "index.js"]

The FROM command designates the base image that will be used; in this case, it is the image for the Node.js 8 LTS release line.

The WORKDIR command on line 3 sets /home/nodejs/app as the working directory for every command that follows, creating the directory if it does not already exist. Line 5 copies everything in the current directory into the working directory of the image, which is the /home/nodejs/app directory set by WORKDIR on line 3. On line 6, the RUN command, which takes shell commands as its arguments, performs the production install.

Finally, on line 8, we pass Docker a command and argument to run the Node.js app inside the container.

The above example provides a basic, but ultimately problematic, Dockerfile.

In the next section we will look at some Dockerfile best practices for running Node.js in production.

Dockerfile Best Practices

Don’t Run the Application as root

Make sure the application running inside the Docker container is not being run as root.

1   FROM node:8
2
3   RUN groupadd -r nodejs && useradd -m -r -g nodejs -s /bin/bash nodejs
4
5   USER nodejs
6
7   ...

In the above example, a few lines have been added to the original Dockerfile example to add and set a new user, nodejs, on top of the Node.js 8 LTS image. This way, in the event that a vulnerability in the application is exploited and someone manages to get into the container at the system level, they are at worst the nodejs user, which does not have root permissions and does not exist on the host.

Cache node_modules

Docker builds each line of a Dockerfile individually. This forms the 'layers' of the Docker image. As an image is built, Docker caches each layer.

7   ...
8   WORKDIR /home/nodejs/app
9
10  COPY package.json .
11
12  RUN npm install --production
13  COPY . .
14
15  CMD ["node", "index.js"]
16  ...

On line 10 of the above Dockerfile, the package.json file is being copied to the working directory established on line 8. After the npm install on line 12, line 13 copies the entire current directory into the working directory (the image).

If no changes are made to your package.json, Docker won’t rebuild the npm install image layer, which can dramatically improve build times.
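One related caveat: if you have a local node_modules directory, COPY . . on line 13 will copy it into the image on top of the freshly installed one. A minimal .dockerignore sketch avoids this (and also keeps the build context smaller, which speeds up docker build):

node_modules
npm-debug.log
.git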

Set Up Your Environment

It’s important to explicitly set any environment variables that your Node.js application will expect to remain constant throughout the container lifecycle.

12  ...
13  COPY . .
14
15  ENV NODE_ENV production
16
17  CMD ["node", "index.js"]
18

Once your image is ready to ship, you will want somewhere to store it. With aims of comprehensive image and container services, DockerHub “provides a centralized resource for container image discovery, distribution and change management, user and team collaboration, and workflow automation throughout the development pipeline.”

To link the Docker CLI to your DockerHub account, use docker login:

docker login [OPTIONS] [SERVER]

Private GitHub Accounts and npm Modules

Docker runs its builds inside of a sandbox, and this sandbox environment doesn’t have access to information like ssh keys or npm credentials. To bypass this constraint, there are a couple recommended options available to developers:

  • Store keys and credentials on the CI/CD system. The security concerns of having sensitive credentials inside of the Docker build can be avoided entirely by never putting them in there in the first place. Instead, store them on and retrieve them from your infrastructure’s CI/CD system, and manually copy private dependencies into the image.
  • Use an internal npm server. Using a tool like Verdaccio, setup an npm proxy that keeps the flow of internal modules and credentials private.

Be Explicit with Tags

Tags help differentiate between different versions of images. Tags can be used to identify builds, teams that are working on the image, and literally any other designation that is useful to an organization for managing development of and around images. If no tag is explicitly added, Docker will assign a default tag of latest after running docker build. As a tag, latest is okay in development, but can be very problematic in staging and production environments.

To avoid the problems around latest, be explicit with your build tags. Here is an example script assigning tags with environment variables for the build’s git sha, branch name, and build number, all three of which can be very useful in versioning, debugging, and deployment management:

#!/bin/sh
docker tag helloworld:latest yourorg/helloworld:$SHA1
docker tag helloworld:latest yourorg/helloworld:$BRANCH_NAME
docker tag helloworld:latest yourorg/build_$BUILD_NUM

Read more on tagging here.

Containers and Process Management

Containers are designed to be lightweight and map well at the process level, which helps keep process management simple: if the process exits, the container exits. However, this 1:1 mapping is an idealization that is not always maintained in practice.

As Docker containers do not come with a process manager included, add a tool for simple process management.

dumb-init from Yelp is a simple, lightweight process supervisor and init system designed to run as PID 1 inside container environments. PID 1 is normally assigned to the first process started in a running Linux container, and PID 1 carries kernel-signaling idiosyncrasies that complicate process management. dumb-init provides a level of abstraction, acting as a signal proxy and ensuring expected process behavior.
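As a sketch, dumb-init could be added to the earlier Dockerfile along these lines (the pinned release version is an assumption; check the dumb-init releases page for the current one):

ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.2/dumb-init_1.2.2_amd64 /usr/local/bin/dumb-init
RUN chmod +x /usr/local/bin/dumb-init
ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]
CMD ["node", "index.js"]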

What to Include in Your Application Containers

A principal advantage of containers is that they provide only what is needed. Keep this in mind when adding layers to your images.

Here is a checklist for what to include when building container images:

  • Your application code and its dependencies.
  • Necessary environment variables.
  • A simple signal proxy for process management, like dumb-init.

That’s it.

Conclusion

Containers are a modern virtualization solution best-suited for infrastructures that call for efficient resource sharing, fast startup times, and rapid scaling.

Application containers are being used by DevOps organizations working to implement “infrastructure as code,” teams developing microservices and relying on distributed architectures, and QA groups leveraging strategies like A/B testing and incremental rollouts in production.

Just as the recommended approach for single-threaded Node.js is 1 process: 1 application, best practice for application containers is 1 process: 1 container. This mirrored relationship arguably makes Node.js the most suitable runtime for container development.

Docker is an open platform for developing, shipping, and running containerized applications. Docker enables you to separate your applications from your infrastructure so you can deliver software quickly. When using Docker with Node.js, keep in mind:

  • Don’t run the application as root
  • Cache node_modules
  • Use your CI/CD system or an internal server to keep sensitive credentials out of the container image
  • Be explicit with build tags
  • Keep containers light!

One Last Thing

If you’re interested in deploying Node.js applications within Docker containers, you may be interested in N|Solid. We work to make sure Docker is a first-class citizen for enterprise users of Node.js who need insight and assurance for their Node.js deployments.

Get unparalleled visibility into application performance and system health. Create your free NodeSource account

Deploying N|Solid with Docker is as simple as changing your FROM statement! If you’d like to tune into the world of Node.js, Docker, Kubernetes, and large-scale Node.js deployments, be sure to follow us at @NodeSource on Twitter.

Using Node.js to Create Powerful, Beautiful, User-Friendly CLIs

Not every Node.js application is meant to live in the web; Node.js is a popular runtime that allows you to write multiple types of applications running on a variety of platforms—from the cloud to many IoT devices. Naturally, Node.js can also run in your local shell, where powerful tools can perform magic, executing useful tasks that enhance your developer capabilities.

A Command Line Interface (CLI) can perform anything from a simple operation—like printing ASCII art in the terminal with yosay—to entirely generating the code for a project based on your choices, using multiple templates, like Yeoman's yo. These programs can be installed globally from npm, or executed directly using npx if they are simple enough.

Let's explore the basics of building a simple CLI using Node.js. In this example, we’re creating a simple command which receives a name as an argument and displays an emoji and a greeting.

The first thing you should do, as with every application, is create a folder for it and execute:

$ npm init

The previous command will ask for some information, like the package name, version, license, and others, creating the package.json at the end, which looks like this:

{
  "name": "hello-emoji",
  "version": "1.0.0",
  "description": "A hello world CLI with a nice emoji",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "edsadr",
  "license": "MIT"
}

As we want our CLI to be available as a command, we need to configure our package.json accordingly, adding a bin section like this:

"bin": {
  "hello-emoji": "./index.js"
}

In this case, hello-emoji is the command we are registering to execute our program, and ./index.js is the file to be executed when the command is invoked.

To display emojis, let's add a package:

$ npm install node-emoji -S

Now, let's create the file to be executed, index.js:

#!/usr/bin/env node
'use strict'

const emojis = require('node-emoji')

if (!process.argv[2]) {
  console.error(`${emojis.get('no_entry')} Please add your name to say hello`)
  process.exit(1)
}

console.log(`${emojis.random().emoji}  Hello ${process.argv[2]}!`)

Note that we add #!/usr/bin/env node at the top. This tells the system what interpreter to pass the file to for execution; in our case, the interpreter is Node.js. After that, the code is fairly straightforward: it requires the node-emoji module and validates process.argv[2], which is the first argument provided by the user. By default, process.argv[0] is the path of the Node.js binary, and process.argv[1] is the path of the script being executed.

After adding this code, our command is ready to be executed; you can get a 'Hello world!' in your console by running:

$ node index.js world

If you want to run it using the command specified in the bin section of our package.json, you’ll need to install the package globally from npm. But for development purposes, to run it locally, we can use:

$ npm link

After executing this command, you can try to execute:

$ hello-emoji world
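If everything is wired up correctly, you should see something like 🎉  Hello world! (the emoji will vary, since it is chosen at random).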

Arguments parsing

After examining the code we just wrote, you’ll likely realize that the main issue when writing this kind of application is controlling the user's input by parsing the arguments included in the command execution. Fortunately, the npm ecosystem offers plenty of choices to solve this problem.

Several modules can help you parse user-entered arguments, such as minimist, which is used below; some even provide guidelines for structuring your CLI's code. These packages allow you to create a CLI supporting multiple operations and parameters; you could efficiently structure something for our CLI to do things like:

$ hello-emoji --name=world --json

Printing a JSON object with our greeting

$ hello-emoji --name=world --emoji=coffee

Instead of a random emoji, this one prints the coffee emoji

Here is an example implementing minimist to do the parsing, supporting commands like the ones above:

#!/usr/bin/env node

'use strict'

const emojis = require('node-emoji')
const minimist = require('minimist')
const opts = minimist(process.argv.slice(2))

const emoji = opts.emoji ? emojis.get(opts.emoji) : emojis.random().emoji

if (!opts.name) {
  console.error(`${emojis.get('no_entry')} Please add your name to say hello using the '--name=' parameter`)
  process.exit(1)
}

if (opts.emoji && !emojis.hasEmoji(opts.emoji)) {
  console.error(`${opts.emoji} is not a valid emoji, please check https://www.webfx.com/tools/emoji-cheat-sheet/`)
  process.exit(1)
}

const greeting = `${emoji}  Hello ${opts.name}!`

if (opts.json) {
  console.log(JSON.stringify({greeting}))
} else {
  console.log(greeting)
}

Going interactive

So far, we have been working with information coming from the command execution. However, there is also another way to make your CLI more interactive: requesting information at execution time. Modules like inquirer, used below, can help create a better experience for the user, letting you ask for the desired input in many different styles. The example below uses inquirer to ask the user for the name if it was not included as an argument. It also validates the emoji and requests a new one if the input is not valid.

#!/usr/bin/env node

'use strict'

const emojis = require('node-emoji')
const inquirer = require('inquirer')
const minimist = require('minimist')
const opts = minimist(process.argv.slice(2))

let emoji = opts.emoji ? emojis.get(opts.emoji) : emojis.random().emoji

async function main () {
  if (!opts.name) {
    const askName = await inquirer.prompt([{
      type: 'input',
      name: 'name',
      message: `Please tell us your name: `,
      default: () => 'world',
      validate: (answer) => answer.length >= 2
    }])

    opts.name = askName.name
  }

  if (opts.emoji && !emojis.hasEmoji(opts.emoji)) {
    console.error(`${opts.emoji} is not a valid emoji, please check https://www.webfx.com/tools/emoji-cheat-sheet/`)
    const askEmoji = await inquirer.prompt([{
      type: 'input',
      name: 'emoji',
      message: `Please input a valid emoji: `,
      default: () => 'earth_americas',
      validate: (emoji) => emojis.hasEmoji(emoji)
    }])

    emoji = emojis.get(askEmoji.emoji)
  }

  const greeting = `${emoji}  Hello ${opts.name}!`

  if (opts.json) {
    console.log(JSON.stringify({ greeting }))
  } else {
    console.log(greeting)
  }
}

main()

Adding some Eye Candy

Even if the interface for this kind of application is reduced to what you can have in a shell, it does not mean that the UI should look bad. There are plenty of tools that can help make your apps look good; here are some different libraries that will add a nice touch to the look of your CLI output:

  • Chalk or Colors will allow you to set the color of your text.
  • To include images translated to ASCII art, try asciify-image or ascii-art.
  • If you have to output a lot of information, a well-organized way to display it is in tables; try Table or cli-table.
  • If your CLI requires processes that take some time, like consuming external APIs, querying databases, or even writing files, you can add a cute spinner with Ora or cli-spinner.

Conclusion

Creating user-friendly, useful and beautiful CLIs is part science and part art. After exploring the basics of creating a CLI tool, you can go and explore a universe of possibilities with the packages available through the npm registry. Hopefully, you’ll soon be creating functional and user-friendly tooling that’s missing from your current inventory thanks to the power of Node.js.

Diagnostics in Node.js Part 3/3

If you haven’t checked out the first two parts of our ‘Diagnostics in Node.js’ series, you can find part one here and part two here.

This is a 3-part blog series on Node.js. It is based on Colin Ihrig's talk at JSConf Colombia. The topics are separated by the age of diagnostic techniques - from the oldest to the newest:

  • Part One: Debug Environment Variables, Warnings, Deprecations, Identifying Synchronous I/O and Unhandled Promise Rejections.
  • Part Two: Tick Processor Profiling, The V8 Inspector, CPU Profiling, Heap Snapshots, Asynchronous Stack Traces.
  • Part Three: Tracing, TLS Connection Tracing, Code Coverage, Postmortem Debugging, Diagnostics Reports.

Let’s begin! 🚀

Tracing

Tracing has been around in Node.js since version 6, but it has gotten more attention over the last few years. The Trace Event mechanism centralizes tracing information generated by V8, Node.js core, and userspace code.

By default the node, node.async_hooks, and v8 categories are enabled.

node --trace-event-categories v8,node,node.async_hooks server.js

Running node with --trace-events-enabled captures the output of several events that happen inside of Node.js, including file system access, performance data, and async hooks. You can configure which events you want to see using the --trace-event-categories flag, and you can also create custom trace events and use them, for example, to see how long an operation takes.
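Since Node.js 10 there is also a built-in trace_events module, so the same categories can be enabled and disabled at runtime instead of via flags; a minimal sketch:

const trace_events = require('trace_events');
const tracing = trace_events.createTracing({ categories: ['node.perf', 'v8'] });

tracing.enable();   // start capturing trace data for these categories
// ... run the code you want to trace ...
tracing.disable();  // stop capturing; output is written to node_trace.1.log by default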

In Chrome you can open chrome://tracing/, click the record button, and load the trace file to visualize traces like this:

If you look at the bottom of the screen you can see fs.sync.read. This is the read operation of the file system. There are 546 bytesRead. It is also possible to see when the tracing started, how long it took, and the CPU Duration, which is all very useful for seeing what’s going on with your code.

TLS Connection Tracing

It is possible to use TLS connection tracing in more recent versions of Node.js. You may have experienced the following: you try to connect to a server via https, it doesn’t work, you get redirected to the OpenSSL command line tool, and it gets complicated. Now you can use the --trace-tls flag from the CLI to trace all TLS connections, and you will get a significant amount of debugging information printed to the console every time you try to establish a TLS connection. The flag works for all the connections in your application, and you can also enable tracing per server or per socket instance.

Code Coverage

Code coverage is a measurement of how many lines/blocks/arcs of your code are executed while the automated tests are running. In other words, it measures how well your test set covers your source code: to what extent the source code is covered by the set of test cases.

Code coverage is collected by using a specialized tool to instrument the binaries to add tracing calls and run a full set of automated tests against the instrumented product. A good tool will give you not only the percentage of the code that is executed, but will also allow you to drill into the data and see exactly which lines of code were executed during a particular test.

Coverage used to be measured by instrumenting the source code, which had many problems, including the overhead of instrumenting every line of code and support for new language features lagging behind. Now V8 supports code coverage natively, and Node.js can take advantage of this using the NODE_V8_COVERAGE environment variable. This variable takes a string as its value, which will be the name of a newly formed directory where you want to write your coverage information.

Using coverage built directly into the V8 engine could address many of the shortcomings facing the previous transpilation-based approach to code coverage. The benefits are:

  • Rather than instrumenting the source code with counters, V8 adds counters to the bytecode generated from the source code. This makes it much less likely for the counters to alter your program’s behavior.
  • Counters introduced in the bytecode don’t impact performance as negatively as injecting counters into every line of the source (it’s possible to notice a 20% slowdown in Node.js’ suite vs 300%).
  • As soon as new language features are added to V8, they are immediately available for coverage.

The coverage information generated by V8 is in a JSON format that is hard to understand if you look at it yourself. However, there are tools like c8 that can help you with this. The following is an example of using c8 with npx.

if (process.argv[2] === 'foo')
   console.log('got the foo arg');
else
   console.log('did not get the foo arg');
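Assuming the snippet is saved as example.js (a hypothetical file name), coverage can be collected either with the raw environment variable or through c8:

NODE_V8_COVERAGE=coverage node example.js   # writes raw V8 coverage JSON into ./coverage
npx c8 node example.js                      # prints a human-readable coverage report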

In this example, process.argv was called with no other arguments on the command line, so the output is ‘did not get the foo arg’. c8 will print out a list of all the files and highlight coverage percentages for all statements, branches, functions, lines, and uncovered lines. There are ways to get a more detailed view: for instance, you can open a file and investigate its coverage line by line.

Postmortem Debugging

The shortcomings of traditional debugging tools have led to the rise of a separate class of debugging, referred to as postmortem debugging. This typically consists of capturing a core dump of a process when it crashes, restarting the process, and analyzing the core dump offline. This allows the process to be debugged while keeping the production system running.

Postmortem debugging is another way to get valuable information out of Node.js. The problem with postmortem debugging is that it has a very high barrier to entry, as it is necessary to set up your system to collect core files.

Core files are an exact snapshot of an application when it crashes. They are turned off by default in most operating systems because they can get quite large. As such, you have to enable them and then run Node.js with the flag --abort-on-uncaught-exception.

Once you get a core file, you can analyze it with llnode, which gives you deep insight into stack frames across the JavaScript and C++ boundaries. This allows you to inspect JavaScript objects to obtain more information about the crash. It is worth noting that most tools don’t give that type of visibility.
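A minimal sketch of that workflow on Linux (server.js is a hypothetical entry point, and your shell must allow core dumps):

ulimit -c unlimited                            # allow core files of any size
node --abort-on-uncaught-exception server.js   # a crash now produces a core file
llnode node -c ./core                          # open the core file with llnode
(llnode) v8 bt                                 # backtrace across the JavaScript/C++ boundary
(llnode) v8 findjsobjects                      # summarize JavaScript objects on the heap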

Another problem with this approach is that tools like llnode depend heavily on the internals of V8, so they tend to break every time Node.js upgrades its version of V8. This problem led to another recent addition to Node.js: Diagnostic Reports.

To see examples and more information about this tool, read this blog post.

Production Diagnostics

Another way to access diagnostics is NodeSource’s Enterprise Node.js Runtime called NSolid. It solves the challenge of generating diagnostic assets such as CPU Profiles and Heap Snapshots in production, without requiring external packages or instrumentation of your code.

You can simply run your existing Node.js apps on our Node.js Enterprise runtime and NSolid magically exposes performance, diagnostics and security capabilities under the hood with low enough overhead that it can all be done in production.

Watch the demo video: https://vimeo.com/417916871/0f2767ff9c

Find out more here

Diagnostics Reports

It’s possible to think of Diagnostic Reports as lightweight postmortem debugging. We don’t get the same level of detail we can access in a core file, but they have a much lower barrier to entry and are more configurable.

The report does not pinpoint the exact problem or specific fixes, but its content-rich diagnostic data offers vital hints about the issue and accelerates the diagnostic process.

You can generate Diagnostic Reports on a signal, a crash, or an uncaught exception. There is also a programmatic API inside of Node.js: process.report.getReport generates a JSON object containing data about the system, the Node.js process, libuv, the C++ stack, and more.
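A minimal sketch of the programmatic API (available without flags in Node.js 14+; older versions need --experimental-report):

// Build the report in memory as a JSON-compatible object
const report = process.report.getReport();
console.log(report.header.nodejsVersion);

// Or write a report file to disk; the filename argument is optional
process.report.writeReport('my-report.json');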

Diagnostic Reports do this using an approach called First Failure Data Capture (FFDC). It is designed to instantly collect information about what led to a failure, so that users don’t need to re-create the failure.

The Diagnostic Report is generated in a format that is both human- and machine-readable. This means you can read it in its original state if you’re moderately skilled at diagnostics reporting, or it can be loaded into a JS program or passed to a monitoring agent. The resulting file contains information about the state of the application and the hosting platform, covering all vital data elements.

This document can improve the overall troubleshooting experience because it:

  • Answers many routine questions, which can reduce the number of iterations needed to understand the cause of the failure.
  • Offers a comprehensive view of the state of the application and virtual machine at the time of failure. This information can drastically improve decision making for the next set of data collection, if required.

Ideally, the FFDC enables someone to resolve the issue without any additional information!

Diagnostic Reports are still experimental, but because generating them does not really impact your running application, it is recommended to use them.

The following command line argument runs Diagnostic Reports:

$ node --experimental-report --report-uncaught-exception w.js

Writing Node.js report to file: report.20190309.102401.47640.001.json
Node.js report completed

The data it captures can be correlated with anomalies like fatal errors that terminate the program, application exceptions, or any other common failure scenarios. The data the tools actually captures are JavaScript heap statistics, native and application call stack, process’ CPU consumption, and more.

There are a handful of flags that you can use to configure it:

  • --experimental-report => because it is still experimental, this flag enables Diagnostic Reports.
  • --report-on-fatalerror => if you are interested in collecting information when Node.js crashes in the C++ layer.
  • --report-uncaught-exception => if you are interested in JavaScript uncaught exceptions.
  • --report-on-signal => if you want to send a specific signal to your process and have it generate this report.
  • --report-signal=signal => lets you define which signal you want to use; by default it is SIGUSR2.
  • --report-directory=directory => lets you specify where you want to write these reports.
  • --report-filename=filename => lets you specify the file name of these reports (by default the name is composed of the date, PID, and a sequence number).

This is what the report looks like: a big JSON object that contains the event, trigger, timestamps, processId, and the commandLine flags you used.

References:

Easily identify problems in Node.js applications with Diagnostic Report

Rethinking JavaScript Test Coverage

Node.js v14.2.0 Documentation

What is code coverage and how do YOU measure it?

Diagnostics in Node.js Part 2/3

If you haven’t checked out the first part of Diagnostics in Node.js, click here.

This is a 3-part blog series on Node.js. It is based on Colin Ihrig's talk at JSConf Colombia. The topics are separated by the age of diagnostic techniques - from the oldest to the newest:

  • Part One: Debug Environment Variables, Warnings, Deprecations, Identifying Synchronous I/O and Unhandled Promise Rejections.
  • Part Two: Tick Processor Profiling, The V8 Inspector, CPU Profiling, Heap Snapshots, Asynchronous Stack Traces.
  • Part Three: Tracing, TLS Connection Tracing, Code Coverage, Postmortem Debugging, Diagnostics Reports.

Let’s begin! 🚀

Tick Processor Profiling

When dealing with web applications, we want to provide the best possible performance to our users. Using a profiler can help you identify bottlenecks in your application. This can reduce the amount of time spent in a request, such as accessing a database or waiting for an API call to respond.

One of those profilers is V8’s built-in sample-based profiler. Profiling is turned off by default, but can be enabled via the --prof command-line option, which dumps V8-profiler-output into a file. The sampler records stacks of both JavaScript and C/C++ code.

This is a 2-step process. First, you profile your code as it is running; this dumps a file that is not meant to be consumed by humans, called isolate-0x[numbers]-v8.log. The second step takes that output and formats it in a human-readable way, using the flag --prof-process.

The isolate-0x[numbers]-v8.log file looks like this:

Then you can process it into a human-readable file:

node --prof-process isolate-0xnnnnnnnnnnnn-v8.log > processed.txt

The processed file will look like this:

There are a lot of things going on here, but what this is basically showing is where you are spending time in shared libraries, JavaScript and C++ code.

The first line says that the application used 761 ticks to execute. A tick is like a clock cycle used by a node process, so in theory the application took 761 clock cycles to execute. You can also find a summary section breaking down JavaScript vs. C++ code.

It should be noted that in the [JavaScript] section you can see entries like LazyCompile and ‘*realpathSync’. The asterisk means that V8 was able to optimize your code; if you don’t see the asterisk, there is a chance that your code is not optimized and is taking more time to execute than you realize.

The V8 Inspector

A few years ago, the V8 Inspector was integrated directly into Node.js, expanding Chrome DevTools' capabilities to include Node.js applications. With this integration it became possible to access step-debuggers without having to install the node-inspector module.

There are a few ways to get started. One is using the --inspect flag, which will start the inspector; you can pass a host and a port that you want to listen on, as in --inspect[=[host:]port]. If no parameters are passed, it will listen on 127.0.0.1:9229 by default.

Another way is more useful when doing local development: the --inspect-brk flag. This flag has the same host and port options as the --inspect flag, but it also puts a breakpoint before the user code starts, so you can do any type of setup you prefer without having to chase breakpoints in your code at runtime.

In the example file, there is this line of code: Promise.reject(new Error('A very cool error here 👾'));

Now, calling the file with the --inspect-brk flag:

We can see the message printed in the console: Debugger listening on ws: followed by a WebSocket URL. WebSockets make it possible to open a two-way interactive communication session between the user's browser and a server. We can also see a message that directs users to the Node.js documentation so we understand what is happening there.

Then, if we go to the URL chrome://inspect or, even better, about:inspect, we will see something like this:

Once you click on the dedicated DevTools for Node.js link, you can see a popup window for debugging your Node.js session.

One cool feature is that when you kill and restart node, the window will automatically reconnect to it. 🔁

DevTools is now connected to Node.js, providing you with access to all the Chrome DevTools features you’re used to. This allows you to:

  • Edit pages on-the-fly and diagnose problems quickly, which ultimately helps you build better websites, faster.
  • Complete breakpoint debugging, stepping with blackboxing
  • Access sourcemaps for transpiled code
  • LiveEdit: JavaScript hot-swap evaluation with V8
  • Console evaluation with ES6 feature/object support and custom object formatting
  • Sampling JavaScript profiler with flamegraph
  • Heap snapshot inspection, heap allocation timeline, allocation profiling
  • Asynchronous stacks for native promises

However, the V8 Inspector should never be used in production because DevTools actions halt the event loop. This is acceptable in development, but unsuitable for production environments. If you are interested in production diagnostics, NodeSource's Node.js for Enterprise (NSolid) is the only way to access native performance and security metrics and diagnostics that don’t incur latency in production.

The V8 Inspector is super useful in development, and NSolid in production environments, so you should give them a try! 😉

CPU Profiling - in Dev and Prod

CPU Profiling - in Dev only

CPU Profiling allows you to understand where opportunities exist to improve the speed and load capacity of your Node processes.

One common problem inside DevTools is getting your server set up and running and then trying to start a CPU profiling session: if you kill your server while you are applying load, the profile may not be written properly.

To solve that issue, the --cpu-prof flag was built directly into Node.js. This flag will start the CPU profiler and when the Node.js process exits it will write a CPU profile output to a file.

You can also use the --cpu-prof-dir flag to specify a directory where the file will be saved and --cpu-prof-name to change the name of the file. If you don’t specify those attributes, the file will be saved in your present working directory, and the name will be a combination of the date, PID, TID, and sequence number, ending with the .cpuprofile extension.

CPU.${yyyymmdd}.${hhmmss}.${pid}.${tid}.${seq}.cpuprofile

You can also set the --cpu-prof-interval flag which is how often the sample profiler is going to sample your application. By default this is set to one millisecond. You can also use the DevTools UI to collect profiles by hand.

In other words, the --cpu-prof flag will start the V8 CPU profiler on start up, and write the CPU profile to disk before exit. If --cpu-prof-dir is not specified, the profile will be written to the current working directory with a generated file name.
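A typical run looks like this (server.js is a hypothetical entry point):

node --cpu-prof --cpu-prof-dir=./profiles server.js
# exercise the process, then let it exit; a CPU.*.cpuprofile file appears in
# ./profiles and can be loaded into the DevTools JavaScript Profiler panel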

This is what a CPU profile looks like:

The top section shows a high-level view of the CPU activity over time. You can select an interval inside it, and that will show a more detailed breakdown of the activity.

CPU Profiling measures the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization.

CPU Profiling - in Prod only

In a production environment, we recommended using NSolid. It has some benefits over Chrome Dev Tools, including:

  • It’s possible to use it in development and production.
  • There is no computational overhead which means that results are consistent without incurring an observer effect that can skew results.
  • It is a drop-in replacement for the Node.js runtime, requiring zero code instrumentation.
  • It doesn't stop the event-loop, and was specifically designed to be useful in production environments.
  • It can be configured to automatically trigger CPU profiles if a process exceeds a certain performance threshold.

For analyzing profiles using the NSolid Console, first you launch the console and select the process that is of interest.

On the process details page, click the New CPU Profile button, then you select your profile window (5 to 60 seconds) and desired visualization style and run profile.

You can choose between three different visualizations: Sunburst Chart, Flame Graph, and Tree Map. The next image is an example of a Sunburst Chart:

To find out more about cpu profiling in NSolid visit the docs here

Heap Snapshots - in Dev and Prod

Heap Snapshots - in Dev only

A heap snapshot is a static snapshot of memory-usage-details at a moment in time, and it provides a glimpse into the heap usage of V8, the JavaScript runtime that powers Node.js. By looking at these snapshots, you can begin to understand where and how memory is being used. Heap snapshots are very useful for finding and fixing memory and performance issues in Node.js applications, especially memory leaks.

A few years ago, developers had to use the heapdump module to obtain heap snapshots. Today, Node.js has a built-in heap snapshot signal flag, --heapsnapshot-signal; you can send the chosen signal as many times as you want, and Node.js will dump a heap snapshot each time.
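For example, using SIGUSR2 (server.js is a hypothetical entry point):

node --heapsnapshot-signal=SIGUSR2 server.js &
kill -USR2 <pid-of-the-node-process>   # each signal writes a Heap.*.heapsnapshot file to the cwd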

Chrome DevTools allows you to compare snapshots, and you can identify objects in memory that will help you narrow down where a memory leak might be occurring.

This is what a heap snapshot looks like in Chrome DevTools, at a very high level. The column on the far left lists the objects on the JavaScript heap.

On the far right, you can see: the Objects count column, which represents how many objects are in memory; the Shallow size column, which is the amount of memory allocated to store the object itself, not taking into account the referenced objects; and the Retained size column, which is its shallow size plus the shallow sizes of the objects that are accessible, directly or indirectly, only from this object. In other words, the retained size represents the amount of memory that will be freed by the garbage collector when this object is collected.

In this example, we can see that the selected object is retaining a significant amount of memory; an object like this should be reviewed.

Heap Snapshots - in Prod only

The best solution for getting heap snapshots in production is the NSolid console. The benefits of NSolid over Chrome DevTools include the ability to use it in both development and production, as well as live instrumentation of your production system's health and stability, with no changes to your application code and no computational overhead.

To use NSolid’s Heap Snapshots, first launch the console and locate the Processes list on the right, choose the process ID of interest, and click New Heap Snapshot from the Process Detail view, as shown in the image below.

Now that you can see the results of the heap snapshot, navigate through the various objects indexed during the snapshot process.

You can also configure the NSolid Console to automatically take Heap Snapshots when any process exceeds a certain performance threshold (i.e. Memory > 700MB).

Once taken, snapshots can easily be compared, as shown in the image below. This is especially useful for comparing an application’s heap snapshot taken when a performance problem occurred against one taken while it was still running smoothly.

Cells on the left-hand snapshot will be colored, reflecting the percentage difference within the row. The redder the cell, the greater the value has increased over the corresponding value in the other snapshot. Greener cells indicate the reverse. This will help you find memory leaks or performance issues more easily, which can help you to identify the underlying problem faster.

You can find more information here.

Asynchronous Stack Traces

Async Stack Traces make debugging async functions easier. These are rich stack traces that not only include the current synchronous part of the stack, but also the asynchronous part.

Normally, when you execute asynchronous operations, the stack trace is not completed because it doesn’t show the asynchronous part. This makes debugging way more difficult, because you can see that there is an error but you don’t know where it originated from.

There is a very popular module called longjohn that is used for this. However, this module comes with a lot of performance overhead, so it is not recommended to use it in production.

Because of that the V8 team added Async Stack Traces as a way to work with async/await code with very low cost. This will show you where the asynchronous operations happen.

As an example, here we have a function called foo, which is executing an asynchronous operation calling the function bar.

Normally, you will only be able to see bar in the stack trace, but with Async Stack Traces you can now see foo in the DevTools as well.
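A minimal sketch of the idea (hypothetical foo and bar, mirroring the description above):

async function bar() {
  throw new Error('oops');
}

async function foo() {
  await bar();
}

// With async stack traces (enabled by default since Node.js 12), err.stack
// shows both frames: "at bar ..." followed by "at async foo ..."
foo().catch((err) => console.error(err.stack));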

References:

Profiling Node.js Code ( Part 1: Basics )

Debugging Guide - Node.js Docs

The WebSocket API (WebSockets)

Debugging Node.js with Chrome DevTools

Debian-Node

Diagnostics in Node.js Part 1/3

Diagnostics is a practice concerned with determining a particular problem using a combination of data and information.

The same concept can be applied to Node.js.

When there is a bug, diagnostics utilities can help developers identify the root cause of any Node.js application anomaly whether it occurs in development or production.

There are many types of issues a Node.js application can run into. This includes: crashing, slow performance, memory leaks, high CPU usage, unexpected errors, incorrect output, and more. Identifying their root cause is the first step towards fixing them.

While diagnostics in Node.js doesn’t point to the exact problem or specific fixes, it contains very valuable data that hints about the issue and accelerates the diagnostic process.

This is a 3-part blog series on Node.js. It is based on Colin Ihrig's talk at JSConf Colombia. The topics are separated by the age of diagnostic techniques, from the oldest to the newest:

  • Part One: Debug Environment Variables, Warnings, Deprecations, Identifying Synchronous I/O and Unhandled Promise Rejections.
  • Part Two: Tick Processor Profiling, The V8 Inspector, CPU Profiling, Heap Snapshots, Asynchronous Stack Traces.
  • Part Three: Tracing, TLS Connection Tracing, Code Coverage, Postmortem Debugging, Diagnostics Reports.

Let’s begin!

A Little bit of History:

In the early years of Node.js it used to be very hard to get diagnostic-information. Node.js was built with a “small core” philosophy, meaning that the core of the project was aimed to remain as small as possible.

It was very important that the Node.js core worked properly, and non-essential things like diagnostics were pushed out into the npm ecosystem (since Node.js can still work just fine without diagnostics). This left us with npm modules such as node-inspector, node-heapdump, longjohn and others. This dynamic slowed the process of incorporating diagnostic tooling into Node.js itself.

As Node.js matured and as more and more enterprises continued to adopt Node.js, the maintainers realized that diagnostic capabilities were a necessity. These needed to be built into the project, so in the last few years a lot of work has been done to make this a reality. Instead of having to npm install and then edit your source code, now you can just have your regular Node.js, pass a few flags and it will work! ✨

Debug Environment Variables

One of the oldest diagnostic mechanisms built into Node.js is debug environment variables. There are two environment variables you can use to print out useful information from Node.js, either in the JavaScript layer or in the C++ layer. Those variables are:

  • NODE_DEBUG for JavaScript logging
  • NODE_DEBUG_NATIVE for C++ logging

All you have to do as you start your Node.js process is pass a comma-separated list of all the subsystems you would like extra diagnostic information from.

Let's take NODE_DEBUG as an example: imagine you have a deeply nested filesystem call and you have forgotten to use a callback. The following example will throw an exception:

const fs = require('fs');

function deeplyNested() {
  fs.readFile('/');
}

deeplyNested();

The stack trace shows only a limited amount of detail about the exception, and it doesn’t include full information on the call site where the exception originated:

fs.js:60
    throw err;  // Forgot a callback but don't know where? Use NODE_DEBUG=fs
      ^

Error: EISDIR: illegal operation on a directory, read
    at Error (native)

Without this helpful comment, many programmers see a trace like this and blame Node.js for the unhelpful error message. But, as the comment points out, NODE_DEBUG=fs can be used to get more information on the fs module. Run this script instead:

NODE_DEBUG=fs node node-debug-example.js

Now you’ll see a more detailed trace that helps debug the issue:

fs.js:53
    throw backtrace;
        ^

Error: EISDIR: illegal operation on a directory, read
    at rethrow (fs.js:48:21)
    at maybeCallback (fs.js:66:42)
    at Object.fs.readFile (fs.js:227:18)
    at deeplyNested (node-debug-example.js:4:6)
    at Object.<anonymous> (node-debug-example.js:7:1)
    at Module._compile (module.js:435:26)
    at Object.Module._extensions..js (module.js:442:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:311:12)
    at Function.Module.runMain (module.js:467:10)

Now with this information, it becomes easier to find the root cause of the problem. The problem was in our code, inside a function on line 4 that was originally called from line 7. This makes debugging any code that uses core modules much easier, and that includes both the filesystem and network libraries such as Node's HTTP client and server modules.

Using environment variables is a good way of debugging without having to modify your code at all.

Handling Warnings

A few years ago, the concept of warnings was introduced into Node.js. A warning is simply a message or notice that implies something could go wrong (e.g. a memory leak or unused variables) or something might not work in the future (e.g. a deprecation). Node.js logs warnings about potentially risky behaviors.

It is possible to turn warnings off using the --no-warnings flag, but this practice is not recommended. Instead, you can redirect all warning messages into a file with the --redirect-warnings=fileName flag. This is especially useful if you have a lot of warnings and don't want to see them all in your console.

You can also use the flag --trace-warnings, which will give you the stack trace of where the warning is coming from whenever you encounter a warning.

The following is an example using buffers. The warning shows something that might not work in the future: a deprecation warning. It issues a recommendation to use another constructor method, along with the stack trace of where the warning originated.
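The original post showed this as a screenshot. A hedged reproduction (exact wording, line numbers and the process id vary by Node.js version) looks roughly like this:

$ node --trace-warnings -e "new Buffer(10)"
(node:12345) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security
and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or
Buffer.from() methods instead.
    at showFlaggedDeprecation (buffer.js:...)
    at new Buffer (buffer.js:...)
    at [eval]:1:1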

Handling Deprecations

Similar to warnings, there is a special class of warnings called deprecations. These point out deprecated features that should not be used in production because they will no longer be supported, which can cause problems.

There is also a flag you can use to turn deprecation warnings off: --no-deprecation. This disables all deprecation warnings, but using this flag is not recommended.

The --trace-deprecation flag works similarly to --trace-warnings, printing a stack trace when deprecated features are used. The --throw-deprecation flag throws an exception if and when deprecated features are used, so instead of issuing a warning it will throw an error. Its use is recommended in development rather than in production.

Using the same Buffer() example with --throw-deprecation, the warning becomes an error:
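Again, the original showed a screenshot; a hedged reproduction (output abbreviated) is that the process now exits with an uncaught exception instead of printing a warning:

$ node --throw-deprecation -e "new Buffer(10)"
[eval]:1
new Buffer(10)
^

DeprecationWarning: Buffer() is deprecated due to security and usability issues.
Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods
instead.
    at ...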

The --throw-deprecation flag shows you where the code is coming from. One cool thing about this is that the stack frames are shown in different colors. In Node.js v12, the line with the error is in white while the rest of the stack trace is in gray, pointing to the exact part of your code that should be changed.

Identifying Synchronous I/O

One common way to introduce performance problems into your code is by using synchronous I/O. If you are working on a server-side application, it is acceptable to do synchronous work during the initialization period, when the server has started up but isn't yet serving traffic. Once you start serving requests, it is very important not to block the event loop, because that can make the application unresponsive.

To avoid this, you can use the --trace-sync-io flag, which prints warnings with stack traces showing where synchronous I/O is being used, so you can fix it.

The following provides an example. A file called example.js contains this line of code:

setImmediate(() => require('fs').readFileSync(__filename));

When running the file using the --trace-sync-io flag, we see a warning pointing at the readFileSync call.
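A hedged reproduction of the screenshot from the original post (stack frames and line numbers vary by Node.js version); the WARNING line and the frame pointing at example.js are the parts that matter:

$ node --trace-sync-io example.js
(node:12345) WARNING: Detected use of sync API
    at Object.readFileSync (fs.js:...)
    at Immediate.setImmediate (.../example.js:1:36)
    ...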

The example uses readFileSync to read the file.

If setImmediate were not wrapping it, there wouldn't be a warning, because the file would be read in the first event loop tick, where synchronous I/O is acceptable. But since setImmediate is being used, the file read is deferred until the next tick, and that's where the synchronous I/O happens. readFileSync doesn't just read the file: it opens the file, does a stat call, reads it, and then closes it, all synchronously. As such, synchronous I/O operations should be avoided once the event loop is running.

Unhandled Promise Rejections

You have probably seen a message like this when working with promises: UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originates either from throwing inside an async function without a catch block, or from rejecting a promise that was not handled with .catch().

A promise is a stateful representation of an asynchronous operation, and it can be in one of 3 states:

  • "pending"
  • "fulfilled"
  • or "rejected"

A rejected promise represents an asynchronous operation that failed for some reason; it is completed by calling .reject(), or by an exception thrown in asynchronously executed code that no .catch() handled.

A rejected promise is like an exception that bubbles up towards the application entry point and causes the root error handler to produce that output.

Configurable handling of unhandled promise rejections is a newer feature that arrived in Node.js 12. Not handling promise rejections is accepted practice in browsers, but in servers it can be problematic because it can cause memory leaks.

To avoid this, you can now use the --unhandled-rejections flag, which has 3 modes of operation:

  1. strict mode causes an uncaught exception
  2. warn mode causes a warning
  3. none mode swallows unhandled rejections

In this example, Promise.reject is called with a new error object. With the flag --unhandled-rejections=strict on the command line, this throws an uncaught exception. A good reason to use strict mode is that you can integrate your promises with your existing uncaught-exception workflow, if you have one.

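A hedged reconstruction of what the screenshot showed, assuming a file named example.js:

// example.js: reject a promise without ever attaching a .catch() handler
Promise.reject(new Error('example'));

Run it with strict mode:

$ node --unhandled-rejections=strict example.js

The rejection is raised as an uncaught exception: the process prints the Error's stack trace and exits with a non-zero exit code instead of just logging a warning.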

So in conclusion: we learned a little bit about the history of diagnostics in Node.js, why they are important, and we analyzed five handy methods of using diagnostics in Node.js. This included useful flags, such as:

  • NODE_DEBUG and NODE_DEBUG_NATIVE
  • --redirect-warnings and --trace-warnings
  • --no-deprecation, --trace-deprecation and --throw-deprecation
  • --trace-sync-io
  • --unhandled-rejections

Stay tuned for part 2!

References

Testing and Debugging Node Applications

Node.js Docs

Unhandled Promise Rejections in Node.js

Debugging tools and practices in node.js

Understanding Worker Threads in Node.js

To understand Workers, first, it’s necessary to understand how Node.js is structured.

When a Node.js process is launched, it runs:

  • One process
  • One thread
  • One event loop
  • One JS Engine Instance
  • One Node.js Instance

One process: the process is a global object that can be accessed from anywhere and has information about what's being executed at a given time.

One thread: being single-threaded means that only one set of instructions is executed at a time in a given process.

One event loop: this is one of the most important aspects to understand about Node. It's what allows Node to be asynchronous and have non-blocking I/O, despite the fact that JavaScript is single-threaded, by offloading operations to the system kernel whenever possible through callbacks, promises and async/await.

One JS Engine Instance: this is a computer program that executes JavaScript code.

One Node.js Instance: the computer program that executes Node.js code.

In other words, Node runs on a single thread, and there is just one process running code through the event loop at a time. One piece of code, one execution (code is not executed in parallel). This is very useful because it simplifies how you use JavaScript without worrying about concurrency issues.

The reason it was built with that approach is that JavaScript was initially created for client-side interactions (like web page interactions, or form validation) -- nothing that required the complexity of multithreading.

But, as with all things, there is a downside: if you have CPU-intensive code, like complex calculations over a large in-memory dataset, it can block other work from being executed. Similarly, if you make a request to a server running CPU-intensive code, that code can block the event loop and prevent other requests from being handled.

A function is considered “blocking” if the main event loop must wait for it to finish before executing the next command. A “non-blocking” function allows the main event loop to continue as soon as it begins, and it typically alerts the main loop once it has finished by calling a “callback”.

The golden rule: don’t block the event loop. Try to keep it running, and avoid anything that could block the thread, like synchronous network calls or infinite loops.

It’s important to differentiate between CPU operations and I/O (input/output) operations. As mentioned earlier, Node.js code is NOT executed in parallel. Only I/O operations run in parallel, because they are executed asynchronously.

So Worker Threads will not help much with I/O-intensive work, because asynchronous I/O operations are more efficient than Workers can be. The main goal of Workers is to improve performance on CPU-intensive operations, not I/O operations.

Some solutions

There are already solutions for CPU-intensive operations: multiple processes (like the cluster API) that make sure the CPU is used optimally.

This approach is advantageous because processes are isolated, so if something goes wrong in one process, it doesn’t affect the others. Processes are also stable and expose identical APIs. However, this means sacrificing shared memory, and data must be communicated by serializing it, for example as JSON.
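As a minimal sketch of that multi-process approach using the built-in cluster module (the port and the one-worker-per-core count are illustrative choices):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker process per CPU core; each worker is fully isolated.
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
} else {
  // Every worker runs its own server; incoming connections are
  // distributed across the workers by the master process.
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(3000);
}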

JavaScript and Node.js will never have threads in the traditional sense. This is why:

So, people might think that adding a new module in Node.js core will allow us to create and sync threads, thus solving the problem of CPU-intensive operations.

Well, no, not really. If threads are added, the nature of the language itself will change. It’s not possible to add threads as a new set of available classes or functions. In languages that support multithreading (like Java), keywords such as “synchronized” help multiple threads synchronize.

Also, some numeric types are not atomic, meaning that if you don’t synchronize access to them, you could end up with two threads writing to the same variable at once, so that after both threads have accessed it, the variable holds a few bytes written by one thread and a few bytes written by the other, which is not a valid value. For example, the simple operation 0.1 + 0.2 produces a result with 17 decimal places in JavaScript (the maximum number of decimal digits):

var x = 0.1 + 0.2; // x will be 0.30000000000000004

But floating point arithmetic is not always 100% accurate, and if values are not synchronized, a decimal place could be corrupted by a concurrent write from a Worker, resulting in non-identical numbers.

The best solution:

The best solution for CPU performance is Worker Threads. Browsers have had the concept of Workers for a long time.

Instead of having:

  • One process
  • One thread
  • One event loop
  • One JS Engine Instance
  • One Node.js Instance

Worker threads have:

  • One process
  • Multiple threads
  • One event loop per thread
  • One JS Engine Instance per thread
  • One Node.js Instance per thread


The worker_threads module enables the use of threads that execute JavaScript in parallel. To access it:

const worker = require('worker_threads');

Worker Threads have been available since Node.js 10, but are still in the experimental phase.


What is ideal is to have multiple Node.js instances inside the same process. With Worker threads, a thread can end at some point without that necessarily being the end of the parent process. It’s not a good practice for resources that were allocated by a Worker to hang around when the Worker is gone: that’s a memory leak, and we don’t want that. We want to embed Node.js into itself, give Node.js the ability to create a new thread and then create a new Node.js instance inside that thread, essentially running independent threads inside the same process.

What makes Worker Threads special:

  • ArrayBuffers to transfer memory from one thread to another
  • SharedArrayBuffer, accessible from either thread, which lets you share memory between threads (limited to binary data; see the sketch after this list)
  • Atomics, which let you do some operations concurrently and more efficiently, and allow you to implement condition variables in JavaScript
  • MessagePort, used for communicating between different threads. It can be used to transfer structured data, memory regions and other MessagePorts between different Workers.
  • MessageChannel, which represents an asynchronous, two-way communications channel used for communicating between different threads.
  • workerData, used to pass startup data: an arbitrary JavaScript value containing a clone of the data passed to this thread’s Worker constructor. The data is cloned as if using postMessage().
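A minimal sketch of sharing memory between threads with SharedArrayBuffer and Atomics (this file spawns itself as a Worker; on Node.js 10 it needs the same experimental flag as the other examples):

const { Worker, isMainThread, workerData } = require('worker_threads');

if (isMainThread) {
  const shared = new SharedArrayBuffer(4); // room for one 32-bit integer
  const counter = new Int32Array(shared);
  // Passing the SharedArrayBuffer shares the memory rather than copying it.
  const worker = new Worker(__filename, { workerData: shared });
  worker.on('exit', () => {
    // The worker's increment is visible here: both threads share this memory.
    console.log(Atomics.load(counter, 0)); // prints 1
  });
} else {
  const counter = new Int32Array(workerData);
  Atomics.add(counter, 0, 1); // atomic increment, safe across threads
}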

API

  • const { Worker, parentPort } = require('worker_threads') => The Worker class represents an independent JavaScript execution thread, and parentPort is an instance of MessagePort
  • new Worker(filename) or new Worker(code, { eval: true }) => the two main ways of starting a worker (passing the filename or the code that you want to execute). It’s advisable to use the filename in production.
  • worker.on('message'), worker.postMessage(data) => for listening to messages and sending them between the different threads.
  • parentPort.on('message'), parentPort.postMessage(data) => Messages sent using parentPort.postMessage() will be available in the parent thread using worker.on('message'), and messages sent from the parent thread using worker.postMessage() will be available in this thread using parentPort.on('message').

EXAMPLE:

const { Worker } = require('worker_threads');

const worker = new Worker(`
const { parentPort } = require('worker_threads');
parentPort.once('message',
    message => parentPort.postMessage({ pong: message }));
`, { eval: true });
worker.on('message', message => console.log(message));
worker.postMessage('ping');

Running it prints the reply from the worker:

$ node --experimental-worker test.js
{ pong: 'ping' }

Example by Anna Henningsen

What this essentially does is create a new thread using a new Worker. The code inside the Worker listens for a message on parentPort, and once it receives the message, it posts the message back to the main thread.

You have to use the --experimental-worker flag because Workers are still experimental.

Another example:

const {
  Worker, isMainThread, parentPort, workerData
} = require('worker_threads');

if (isMainThread) {
  module.exports = function parseJSAsync(script) {
    return new Promise((resolve, reject) => {
      // Re-run this same file inside a Worker thread (__filename),
      // passing the script to parse as workerData.
      const worker = new Worker(__filename, {
        workerData: script
      });
      worker.on('message', resolve);
      worker.on('error', reject);
      worker.on('exit', (code) => {
        if (code !== 0)
          reject(new Error(`Worker stopped with exit code ${code}`));
      });
    });
  };
} else {
  const { parse } = require('some-js-parsing-library');
  const script = workerData;
  parentPort.postMessage(parse(script));
}

It requires:

  • Worker: the class that represents an independent JavaScript execution thread.
  • isMainThread: a boolean that is true if the code is not running inside of a Worker thread.
  • parentPort: the MessagePort allowing communication with the parent thread, if this thread was spawned as a Worker.
  • workerData: An arbitrary JavaScript value that contains a clone of the data passed to this thread’s Worker constructor.

In actual practice for these kinds of tasks, use a pool of Workers instead. Otherwise, the overhead of creating Workers would likely exceed their benefit.

What is expected for Workers (hopefully):

  • Passing native handles around (e.g. sockets, HTTP requests)
  • Deadlock detection. Deadlock is a situation where a set of processes is blocked because each process is holding a resource while waiting for another resource acquired by some other process. Deadlock detection will be useful for Worker threads in this case.
  • More isolation, so if one process is affected, it won’t affect others.

What NOT to expect for Workers:

  • Don’t think Workers make everything magically faster; in some cases it is better to use a Worker pool
  • Don’t use Workers for parallelizing I/O operations.
  • Don’t think spawning Workers is cheap

Final notes:

The contributors to Workers in Node.js are looking for feedback. If you have used Workers before and want to contribute, you can leave your feedback here

Workers have Chrome DevTools support for inspecting Workers in Node.js.

And worker_threads is a promising experimental module if you need to do CPU-intensive tasks in your Node.js application. Keep in mind that it’s still experimental, so it is advisable to wait before using it in production. For now, you can use Worker pools instead.

References:

Special thanks to Anna Henningsen and her amazing talk, Node.js: The Road to Workers

Node.js API

Node.js multithreading: What are Worker Threads and why do they matter? - by Alberto Gimeno

Introduction to Javascript Processes - by Nico Valencia

The Node.js Event Loop
