Breaking down Node.js and WebSockets to understand their use in real-time applications

In December 2019, I sadly dislocated my shoulder.

During recovery, my doctor recommended some exercises as part of the therapy, but they were extremely boring. So I decided to make the recovery fun and interactive.

With that in mind, I took those exercises and mixed them with hardware and software to create a rehabilitation game. To make therapy fun, the application needed to transfer data in real time so that the user could interact comfortably with the game. Many references recommend Node.js and WebSocket for that, which raised many questions.

Why are these tools useful? What characteristics do they have? In which scenarios are they recommended? How do they help us build real-time applications?

This article takes each tool, identifies its main characteristics, and explains how each concept works. The conclusion then shows how these pieces work together and highlights important considerations for their use.

Node.js

Node.js is a JavaScript runtime environment designed to build scalable network applications. We often hear that Node is non-blocking, single-threaded, asynchronous, concurrent... But what does that really mean in this context? Let’s find out.

What is single-thread?

Node.js is single-threaded. One thread means one call stack, one call stack means one thing at a time.

Hmm, that's suspicious. If we want things in real-time, doing only one thing at a time does not sound like a good approach, right? Let's find out!

What is the call stack?

The call stack is a data structure that follows the LIFO (“Last In, First Out”) approach: it pushes and pops the function calls made by the code as it runs. It is a very important piece of Node.js because it tracks the execution order of the program. It also deserves special attention: there is only one, so if it is busy, our whole application is busy.
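As a small illustration of that LIFO behavior, the sketch below records when each function is pushed onto and popped off the call stack. The function names and the trace array are hypothetical and exist only for this example.

```javascript
// Hypothetical trace that records when each call enters and leaves the stack.
const trace = [];

function third() {
  trace.push("push third");
  trace.push("pop third");
}

function second() {
  trace.push("push second");
  third(); // third sits on top of second until it returns
  trace.push("pop second");
}

function first() {
  trace.push("push first");
  second(); // second sits on top of first until it returns
  trace.push("pop first");
}

first();
console.log(trace.join("\n"));
```

The last function pushed (third) is the first one popped, and first, which was pushed before the others, is the last to leave.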

What is non-blocking?

Non-blocking is one of those concepts that are easier to understand once we understand their opposite. Blocking refers to instructions that prevent any other instruction from running until they finish. Non-blocking instructions, in contrast, can run without holding up any other instruction.

Considering that our goal is to build applications that transfer data in real time, the fact that JavaScript is a blocking language may come as a surprise. Fortunately, we can “make it” non-blocking by introducing the concepts of asynchronous code and the event loop.
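To see what blocking looks like in practice, here is a minimal sketch in which a busy loop keeps the call stack occupied. The blockFor helper and the 200 ms figure are made up for illustration; the point is only that the timer callback, although scheduled for 0 ms, cannot run until the blocking work finishes.

```javascript
// Hypothetical helper: keeps the single thread busy for roughly `ms` milliseconds.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait: nothing else can run on the thread meanwhile
  }
}

const start = Date.now();
let timerDelay = null;

// Scheduled for "as soon as possible", but it has to wait for the call stack.
setTimeout(() => {
  timerDelay = Date.now() - start;
  console.log(`timer callback ran after ~${timerDelay} ms`);
}, 0);

blockFor(200); // blocking instruction
console.log("blocking work finished");
```

The "blocking work finished" line is always printed first, and the timer reports a delay of at least 200 ms.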

What is asynchronous?

In JavaScript, asynchronous is the relation between now and later. But later does not mean right after now. It means at some point in the future, and we do not necessarily know when.

Recalling what single-threaded means, if we cannot know when an asynchronous task will be done, keeping it on the call stack while it waits would slow down the whole application. For that reason, asynchronous tasks are handled outside the call stack, and their results are delivered through “callbacks” or “promises”.

When an asynchronous instruction is resolved, its callback needs to return to the call stack. To get there, it passes through two other pieces first: the callback queue, and then the event loop.

What is the callback queue?

The callback queue is a “first in, first out” (FIFO) data structure that temporarily stores the callbacks of resolved asynchronous code. Because of that, it is a crucial support for the event loop.

What is the event loop?

The event loop is a simple but important part of how Node.js works. It is a continuously running loop whose main job is to move a callback from the callback queue to the call stack, but only when the call stack is empty.
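The route from asynchronous task to callback queue to call stack can be observed with a short sketch. It uses setTimeout with a 0 ms delay as a stand-in for any asynchronous task, and the order array is a hypothetical helper that records what ran when.

```javascript
// Hypothetical log of the order in which things actually execute.
const order = [];

order.push("sync 1");

// Both callbacks are queued in this order (FIFO) once their timers expire.
setTimeout(() => order.push("callback A"), 0);
setTimeout(() => order.push("callback B"), 0);

order.push("sync 2");

// Queued last, so it runs after A and B and can print the full picture.
setTimeout(() => console.log(order.join(" -> ")), 0);
// prints: sync 1 -> sync 2 -> callback A -> callback B
```

The callbacks only run after all the synchronous code has finished and the call stack is empty, and they leave the queue in the same order they entered it.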

What is concurrency?

Concurrency is when two or more processes make progress together but not at the same instant: the execution “jumps” between them. Even though Node.js is single-threaded, a single server can support thousands of concurrent connections.

This happens because Node.js offloads I/O operations to the system kernel whenever possible, and most modern operating system kernels are multi-threaded. This way, Node.js handles one JavaScript operation at a time but many I/O operations at the same time.
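A rough way to picture this is to start several waits at once and measure the total time. In the sketch below, fakeIo is a hypothetical function that uses a timer as a stand-in for a real I/O operation (a network request, a file read); the 100 ms duration is arbitrary.

```javascript
// Hypothetical stand-in for an I/O operation that takes `ms` milliseconds.
function fakeIo(label, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

async function main() {
  const start = Date.now();

  // All three operations are started before any of them finishes.
  const results = await Promise.all([
    fakeIo("a", 100),
    fakeIo("b", 100),
    fakeIo("c", 100),
  ]);

  const elapsed = Date.now() - start;
  console.log(results, `finished in ~${elapsed} ms`);
  return { results, elapsed };
}

const done = main();
```

Because the waits overlap, the total is close to 100 ms rather than 300 ms, even though the JavaScript itself still runs on a single thread.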

Putting pieces together

Our applications can have tons of instructions. Some of them are “fast” to execute, such as assigning a value to a variable, while others can be “slow”, e.g. network requests.

Given that Node.js is single-threaded and JavaScript is a blocking language, we could assume that if our application runs a lot of “slow” instructions, the call stack will stay busy and the performance of the app will suffer. This is where the relevance of Node.js lies: it takes a different approach to that scenario, based on its other features.

Thanks to the concepts of non-blocking and asynchronous code, the “slow” instructions can be written with promises or callbacks. That takes them off the call stack and delegates their execution to “someone” with the capacity to deal with them.

Once a “slow” instruction is resolved, its result returns to the call stack through the callback queue and the event loop. Thus, the “slow” tasks get done without saturating the call stack and without affecting the performance of the app.

Having described the principal parts of Node.js and how those work together, let’s do the same now with the WebSocket protocol.

WebSocket

WebSocket makes it easier to build real-time applications because it is a protocol designed to establish persistent, bidirectional communication between a client and a server over a single TCP connection.

It allows both parties to hold a persistent “conversation”: they first establish a connection with an initial “handshake” and then exchange messages in both directions.

Let’s detail other pieces related to this protocol such as its API, lifecycle, scalability, and more.

What is a protocol?

A protocol is a set of syntax, rules, and semantics that allows two or more entities or machines to transmit information. Worldwide communication would not be possible without such standards. Some examples are TCP, MQTT, HTTPS, and of course WebSocket.

What is real-time data transfer?

Real-time data transfer basically consists of delivering data immediately after it is collected, without delays in transmission. It is widely used nowadays in applications such as chats, navigation, gaming, tracking, and so on. Thanks to WebSocket’s persistent and bidirectional connection, messages can be exchanged in very few steps, which facilitates real-time data transfer between applications.

WebSockets’ connection lifecycle

The lifecycle of a WebSocket connection can be divided into three phases. The first phase is requesting the establishment of a connection. Then, once the connection is established, the protocol can transfer messages in both directions. Finally, when there is no more data to exchange, the connection needs to be closed.

Establish the connection

Before any data can be exchanged, the connection has to be established through what is known as the “initial handshake”. The client sends a regular HTTP request with an “Upgrade” header to the server, which signals the request to change the protocol in use. The server receives it and, if it supports the WebSocket protocol, agrees to the upgrade and communicates this through its response headers.

After that, the handshake is done and the initial HTTP connection is replaced by a WebSocket connection that uses the same initial TCP/IP connection.
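To make the handshake more concrete, here is a minimal sketch of the server side of the upgrade, based on the WebSocket specification (RFC 6455) rather than on anything specific to this article. The client sends a Sec-WebSocket-Key header, and the server proves it understood the request by returning a Sec-WebSocket-Accept value derived from it. The helper names are hypothetical; in a real project a WebSocket library would normally do this for you.

```javascript
const crypto = require("node:crypto");

// Fixed GUID defined by the WebSocket specification (RFC 6455).
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Hypothetical helper: derives the Sec-WebSocket-Accept value that the server
// returns for the Sec-WebSocket-Key sent by the client.
function computeAcceptValue(clientKey) {
  return crypto
    .createHash("sha1")
    .update(clientKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Hypothetical helper: builds the response headers that accept the upgrade.
function buildUpgradeResponse(clientKey) {
  const lines = [
    "HTTP/1.1 101 Switching Protocols",
    "Upgrade: websocket",
    "Connection: Upgrade",
    `Sec-WebSocket-Accept: ${computeAcceptValue(clientKey)}`,
  ];
  return lines.join("\r\n") + "\r\n\r\n";
}

// Sample key taken from the example in RFC 6455.
const sampleKey = "dGhlIHNhbXBsZSBub25jZQ==";
console.log(buildUpgradeResponse(sampleKey));
```

For the sample key, the accept value is s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, which matches the example given in the specification.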

Use of the connection

At this point, both sides can send and receive data over the WebSocket protocol. Messages can be exchanged in either direction at any time, and it is in this phase that most of the events and methods of the WebSocket API come into play.

Close connection

To use resources wisely, it is important to close the connection once it is no longer needed. Both sides have an equal right to close it: all either one has to do is call the “close” method, which takes two optional parameters that indicate, respectively, the closing code and the reason for closing.
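A minimal sketch of that call follows. The closeGracefully helper and its default reason are hypothetical; the close(code, reason) method, the readyState attribute, and the 1000 "normal closure" code belong to the standard WebSocket API.

```javascript
const OPEN = 1; // same value as WebSocket.OPEN

// Hypothetical helper: closes an open socket with a code and a reason.
function closeGracefully(socket, reason = "work finished") {
  if (socket.readyState !== OPEN) {
    return false; // connecting, already closing, or closed: nothing to do
  }
  socket.close(1000, reason); // 1000 means "normal closure"
  return true;
}
```

In a browser this could be called as closeGracefully(socket, "session finished"), where socket is an existing WebSocket instance.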

WebSocket API

This protocol offers a simple but useful interface to interact with it. There are a total of two methods and four events.

  • Send (method): send data
  • Close (method): close the connection
  • Open (event): connection established. Available via the onopen property.
  • Message (event): data received. Available via the onmessage property.
  • Error (event): WebSocket error. Available via the onerror property.
  • Close (event): connection closed. Available via the onclose property.

The API also offers attributes such as binaryType, readyState, and bufferedAmount that allow us to implement custom logic, for example a rate limit to prevent DoS attacks on the server. There are also many WebSocket libraries that provide this and other high-level mechanisms.
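The sketch below shows one way these pieces might fit together: it wires the four events and uses readyState and bufferedAmount to skip sending while too much data is still waiting to go out. The helper names and the 64 KB threshold are assumptions made for this example, and this simple throttle is not a complete rate limiter.

```javascript
// Hypothetical threshold: skip sending while more than this is still queued.
const MAX_BUFFERED_BYTES = 64 * 1024;

// Hypothetical helper: attaches a handler to each of the four events.
function wireSocket(socket, log = console.log) {
  socket.onopen = () => log("open");
  socket.onmessage = (event) => log(`message: ${event.data}`);
  socket.onerror = () => log("error");
  socket.onclose = (event) => log(`close: ${event.code}`);
}

// Hypothetical helper: sends only when the socket is open and not congested.
function sendIfNotCongested(socket, payload) {
  if (socket.readyState !== 1) return false; // 1 === WebSocket.OPEN
  if (socket.bufferedAmount > MAX_BUFFERED_BYTES) return false;
  socket.send(payload);
  return true;
}
```

Both helpers accept any object that follows the WebSocket interface, so they can be used with the browser's WebSocket or with a library that exposes the same attributes.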

Encrypted WebSocket

Using the ‘ws’ URL scheme, it is possible to establish a WebSocket connection.

let socket = new WebSocket("ws://nodesource.com");

But it is also possible to use the ‘wss’ URL scheme.

let socket = new WebSocket("wss://nodesource.com");

The visible difference is one extra ‘s’ in the second URL scheme, but that change has more implications than the addition of a letter. That “s” stands for “secure”, which means that the connection is encrypted by running WebSocket over SSL/TLS.

As an analogy, WSS is to WS what HTTPS is to HTTP, since HTTPS is the encrypted version of HTTP. For security reasons, it is highly recommended to use the encrypted variant of both protocols.
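Following that analogy, a small helper can pick the WebSocket scheme that matches the scheme of the page. This is only a sketch: toWebSocketUrl is a hypothetical function, and it assumes the WebSocket server lives on the same host and path as the page.

```javascript
// Hypothetical helper: maps an http(s) URL to the matching ws(s) URL.
function toWebSocketUrl(pageUrl) {
  const url = new URL(pageUrl);
  const scheme = url.protocol === "https:" ? "wss:" : "ws:";
  return `${scheme}//${url.host}${url.pathname}${url.search}`;
}

console.log(toWebSocketUrl("https://nodesource.com/")); // wss://nodesource.com/
console.log(toWebSocketUrl("http://nodesource.com/")); // ws://nodesource.com/
```

Keeping the two schemes aligned matters in practice because browsers typically refuse an unencrypted ws connection from a page that was loaded over HTTPS.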

Scalability

Scalability is a crucial consideration in the design process of an application because, otherwise, we can run into trouble when the moment of growth comes.

There are two approaches to increasing the capacity and functionality of the app when it needs it: vertical scalability and horizontal scalability.

Vertical scalability consists of adding more resources to the server, for example more RAM or a better CPU. It is a fast way to grow, but it has limits because we can’t do it indefinitely.

Horizontal scalability is about adding more instances of the server. It requires more configuration, but it can be repeated as many times as needed. However, the main characteristic of WebSocket, the persistent bidirectional connection, creates a complication when we try to scale horizontally.

This happens because an established socket connection is bound to a specific client instance and a specific server instance. If we increase the number of instances in the backend, the client may reach a server instance that knows nothing about that established connection.

There are ways to avoid this situation, for example a load balancer configured with sticky sessions, which lets the client keep reaching the correct server instance.
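The idea behind a sticky session can be pictured with a toy routing function that always maps the same client to the same instance. Everything in this sketch is hypothetical (the instance names, the hash, the client identifier); real load balancers usually achieve stickiness with cookies or by hashing the client's IP address, and they also handle instances being added or removed.

```javascript
// Hypothetical list of backend instances behind the load balancer.
const instances = ["server-a", "server-b", "server-c"];

// Simple deterministic string hash, kept within 32 unsigned bits.
function hashString(text) {
  let hash = 0;
  for (const ch of text) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash;
}

// The same client id always lands on the same instance.
function pickInstance(clientId) {
  return instances[hashString(clientId) % instances.length];
}

console.log(pickInstance("client-42") === pickInstance("client-42")); // true
```

Because the mapping is deterministic, a client that reconnects or sends a follow-up request is routed to the instance that already knows about it.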

Conclusions

Node.js

Node.js is a relevant tool for developing real-time applications because, among other things, it can execute each instruction according to its characteristics and needs. It can either execute tasks quickly on the call stack using synchronous code or, through asynchronous code, delegate slow operations such as I/O to someone else. This combination makes smart use of resources and keeps the performance of the application in good shape.

Some characteristics can be confusing because, at first, they don't seem to belong to a tool well suited to real-time applications, such as the fact that JavaScript is a blocking language or that Node.js is single-threaded. But when we see the whole picture, we find other characteristics (for example, asynchronous code and the event loop) that let us understand how Node.js works and how it makes the most of its strengths and weaknesses.

It's also important to understand that, because of all the characteristics mentioned above, Node.js is an excellent option not only for real-time applications but for applications that need to handle many I/O requests in general. By the same token, it is not the best approach when intensive CPU computing is required.

WebSocket

WebSocket is a protocol that establishes persistent bidirectional communication between parties. It gives browser-based applications that need two-way communication with a server a mechanism that does not rely on opening multiple HTTP connections, which makes it a very useful tool in the development of real-time applications.

Its API simplifies interaction with the protocol, but it does not include some common high-level mechanisms such as reconnection or authentication. However, it is possible to implement them manually or through a library. The protocol also offers an encrypted option for exchanging messages, and its connections follow a three-phase lifecycle.

On the other hand, it's important to keep in mind that scaling an app with WebSockets horizontally requires extra steps, including adding more pieces to the architecture of the app and configuring them. Those extra steps should be considered in the design process of the app because, otherwise, they could cause problems in the future.

Finally, regarding my use case, the rehabilitation game consists of capturing human movement and using it to control the running dinosaur game from Chrome. In that context, WebSocket’s open connection allows the backend and frontend to exchange the movements, translated into data, without having to open a new HTTPS connection each time a movement is detected. As for Node.js, its characteristics help process the instructions quickly, which makes the real-time communication possible.

Mauro López

Adventurer, maker, techno utopian. He/him.