Understanding Memory Leaks in Nodejs

Understanding JavaScript’s Memory Model, with Examples.

Chidume Nnamdi 🔥💻🎵🎮
Bits and Pieces


As soon as we start typing code, we are already introducing bugs and allocating memory without knowing it. How we manage both can make or mar our software.

In this post, we will learn, using examples, about memory leaks in Node.js and how we can solve them.

In our effort to understand what a memory leak is, let’s first understand the memory model in JavaScript. This memory model visualizes how a JavaScript program is mapped in the RAM during execution.

Let’s take a look.

Memory Model

JavaScript has three portions of memory assigned to a program during execution: Code Area, Call Stack, and Heap. These combined are known as the Address Space of the program.

Code Area: This is the area where the JS code to be executed is stored.

Call Stack: This area keeps track of the currently executing functions, performs computation, and stores local variables. Variables are pushed to and popped from the stack in LIFO order: the last one in is the first one out. Primitive (value) data types are stored here.

For example:

var corn = 95
let lion = 100

Here, the values of corn and lion are stored in the stack during execution.

Heap: This is where JavaScript reference data types like objects are allocated. Unlike the stack, heap allocations follow no LIFO order and can land anywhere in the heap. To prevent memory “holes” in the Heap, the JS engine has memory managers that keep them from occurring.

class Animal {}

// stores `new Animal()` instance on memory address 0x001232
// tiger has 0x001232 as value in stack
const tiger = new Animal()
// stores `new Object()` instance on memory address 0x000001
// `lion` has 0x000001 as value on stack
let lion = {
  strength: "Very Strong"
}

Here, lion and tiger are reference types. The objects themselves are stored in the Heap, while the variables are pushed to the stack. Their values in the stack hold the memory addresses of the objects' locations in the Heap.
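The practical consequence of this split can be seen in a minimal sketch. The names cornCopy and sameLion are made up for this illustration: copying a primitive copies the value, while copying an object variable copies only the address.

```javascript
// Primitives are copied by value: changing the copy leaves the original alone.
let corn = 95
let cornCopy = corn
cornCopy = 100

// Object variables hold a reference: both names point at the same heap object.
const lion = { strength: "Very Strong" }
const sameLion = lion
sameLion.strength = "Weak"

console.log(corn)          // 95
console.log(lion.strength) // "Weak"
```

Changing sameLion changes lion too, because both variables hold the same address.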


Memory management in JavaScript

Note: This applies to Node.js because Node.js runs on the V8 JavaScript engine. So whatever is true of memory management in JavaScript is true in Node.js.

In JavaScript, memory is divided into two: Heap and Stack.

Stack: This is the area in our JS program's address space where primitive values and pointers to objects in the heap are stored. The Stack is an organized memory space managed by the OS. It uses the FILO (First In, Last Out) principle, which is the same as the LIFO order described earlier.

The Stack is just like a pile of books stacked from bottom to top. To get a book you must pop it off the top, and you can never take a book from anywhere in the stack except the top. In this way, the stack is highly organized.

Heap: This is the memory space where objects are stored. The objects are created here and their references are stored on the Stack.

The Heap is not as orderly as the Stack. Memory spaces are allocated and deallocated in no particular pattern. It is up to the JS runtime’s memory manager to allocate and deallocate so as to prevent holes in the Heap and stale objects (objects without a reference from the Stack).

The Garbage Collector in V8 is responsible for looking out for lost object references in the Heap and deallocating the space. It uses the Mark and Sweep algorithm: it marks the objects that are still referenced and then sweeps out the un-referenced ones, which is deallocation. Let’s look at GC’s Mark and Sweep operation in more detail.

GC: Mark and Sweep

The Mark and Sweep operation starts from the root of the application. For Node.js, it is the global object and for the browser, it is the window object.

All variables stored in these roots are global variables. These global variables are marked as always present and active.

Mark and Sweep has two cycles: Mark, and Sweep.

In the Mark cycle: Global variables are marked as active. The children of these global variables are recursively inspected and everything that can be referenced is marked as active.

In the Sweep cycle, GC collects all variables not marked as active and frees their memory space.
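To make the two cycles concrete, here is a toy simulation. It is only a sketch of the idea, not how V8 is implemented: the heap is modelled as a plain object whose keys are made-up object names and whose values list the objects each one references.

```javascript
// A toy heap: each "object" lists the names of the objects it references.
// All names here are hypothetical and exist only for this illustration.
const heap = {
  globalObj: ["config", "cache"],
  config: [],
  cache: ["entry"],
  entry: [],
  orphan: ["orphanChild"], // nothing reachable points here
  orphanChild: [],
}

function markAndSweep(heap, roots) {
  // Mark cycle: walk everything reachable from the roots.
  const marked = new Set()
  const pending = [...roots]
  while (pending.length > 0) {
    const name = pending.pop()
    if (marked.has(name)) continue
    marked.add(name)
    pending.push(...heap[name])
  }
  // Sweep cycle: free every object that was not marked.
  const swept = []
  for (const name of Object.keys(heap)) {
    if (!marked.has(name)) {
      swept.push(name)
      delete heap[name]
    }
  }
  return swept
}

const swept = markAndSweep(heap, ["globalObj"])
console.log(swept) // [ 'orphan', 'orphanChild' ]
```

Note that orphanChild is swept even though orphan references it: what matters is being reachable from a root, not being referenced by something.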

We now know that JS has a GC that sweeps and frees dangling objects. Let’s now see what a memory leak is and how it can occur.

Memory Leak

A memory leak occurs when the developer declares a variable that points to a piece of memory and forgets about it. A memory leak can also occur when a variable is accidentally created in the wrong scope. This leaves the piece of memory dangling even after the intended scope is long GC’d.

So, practically, a memory leak is the developer’s fault: it comes from forgetting about a variable or from misunderstanding scoping in JS.

Let’s look at ways in which we can have a memory leak in Nodejs.

Scope

Scoping is one of the major causes of memory leaks in Node.js. Scoping in JS is generally tough to wrap your head around, and even highly seasoned devs sometimes declare variables in the wrong scope.

Memory leaks in JS can occur because of scoping rules. Let’s say we have this code:

function aFunc() {
  foo = 900
}

At a quick glance, we may think that the foo variable is created inside the scope of the aFunc function. But no: because foo is assigned without var, let, or const, JavaScript (in non-strict mode) creates it in the global scope as a property of the global object. This is an implicit global, not hoisting. So when the aFunc scope is destroyed after being called, foo still exists in the global scope. This is a memory leak.

This memory leak can introduce other problems too. If we already have a global variable or global function named foo, it will be overwritten by the foo in aFunc.

We now see that scoping variables wrongly can lead to some serious errors in our code.

And even when there is no name clash, the foo created by aFunc is left dangling in the global scope.

To mitigate these leaks, variables intended to exist inside a function or block should be clearly declared with the keyword var, let, or const. This creates them inside the function's or block's scope, so when the scope is cleared the variables inside it can be GC'd (garbage collected).
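Strict mode is a useful safety net here, because it turns the accidental global into an error. The sketch below uses two made-up functions, leaky and scoped, to contrast the two cases.

```javascript
function leaky() {
  "use strict"
  foo = 900 // ReferenceError in strict mode: foo was never declared
}

function scoped() {
  let foo = 900 // lives only inside this function
  return foo
}

let caught = null
try {
  leaky()
} catch (err) {
  caught = err
}

console.log(caught instanceof ReferenceError) // true
console.log(scoped())                          // 900
console.log(typeof globalThis.foo)             // "undefined", nothing leaked
```

ES modules and class bodies are always strict, so the leak from the earlier snippet can only happen in non-strict scripts.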

Global Variable

We learned that in JS’s GC Mark-and-Sweep algorithm, global variables are never collected because they are always marked as active. This poses a huge risk of memory leaks if the dev assigns global variables and forgets about them, i.e. never uses them again. Those blocks of memory would be uselessly occupied and would eventually bloat the memory space of the entire program, making it eat up our machine’s resources and eventually slow down.

var aGlobalVar = 900
var anotherGlobalVar = 9000
var aBigGlobalVar = new Array(1000)
var aVeryBigGlobalVar = new Array(1000000)

We have four global variables there. aBigGlobalVar and its brother, aVeryBigGlobalVar, consume a lot of memory space, while aGlobalVar and anotherGlobalVar do not consume nearly as much. aBigGlobalVar sets up an array with space for 1000 elements; aVeryBigGlobalVar sets up a million spaces in its array.

If aBigGlobalVar and aVeryBigGlobalVar are never used and forgotten, that means array space for 1,001,000 elements is occupied and unused. They are never garbage collected because they are global variables.

One way to solve this problem is to nullify your global variables after use by assigning null to them. This lets the GC free the memory the old value occupied, making your program less of a RAM-eater.

Others are:

  • Use global variables sparingly
  • Don’t store heavy values in global variables
  • Always remember to clean out your no-longer-wanted global variables.
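As a minimal sketch of the nullify approach, assume a hypothetical global called bigLookupTable that is only needed for one computation:

```javascript
// Hypothetical large global that is only needed for one computation.
let bigLookupTable = new Array(1000000).fill(0)

function sumTable() {
  return bigLookupTable.reduce((total, n) => total + n, 0)
}

const total = sumTable()

// Done with it: drop the reference so the GC is free to reclaim the array.
bigLookupTable = null

console.log(total) // 0
```

Assigning null does not free the memory on the spot. It only removes the last reference, so the array becomes eligible for collection on a later GC run.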

Caching

Caching is one of the ways we can leak memory in JS.

Caching comes in very handy when we want to speed up our program, especially programs that involve huge mathematical computations that need a lot of CPU power. It prevents the re-calculation of values, especially the results of pure functions. Now, caching trades memory for speed. It stores computed values in memory and returns them immediately on demand without recalculating them.

This in-memory caching can grow out of hand and introduce memory leaks if the stored values are never used again. They will sit there dormant, taking up precious space, because the cache still references them and so they cannot be collected.

The best way to mitigate the risk of memory leaks in caching is to regularly evict the never-used values from the cache.
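One common way to do that is to cap the cache size and evict the least recently used entry when the cap is exceeded. The BoundedCache class below is a hypothetical sketch of that idea built on Map (which iterates in insertion order), not a production-ready cache.

```javascript
// A small least-recently-used (LRU) cache built on Map.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries
    this.entries = new Map()
  }

  get(key) {
    if (!this.entries.has(key)) return undefined
    // Re-insert so this key becomes the most recently used.
    const value = this.entries.get(key)
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key, value) {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value
      this.entries.delete(oldestKey)
    }
  }
}

const cache = new BoundedCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")    // touch "a", so "b" is now the oldest
cache.set("c", 3) // evicts "b"
console.log([...cache.entries.keys()]) // [ 'a', 'c' ]
```

With a cap in place, the cache can never hold more than maxEntries values, no matter how long the program runs.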

DOM references

You may not know it, but holding DOM references can also create memory leaks. We can get a reference to a DOM element using any of the DOM APIs:

  • getElementById
  • getElementsByClassName
  • getElementsByTagName

These APIs retrieve the DOM reference of the specified element (or elements), which is then stored in a variable. If the element is eventually destroyed or removed from the DOM, the removed element’s DOM reference is kept alive by the variable it was stored in earlier. In other words, the element was removed, yet it still lives in memory.

<body>
  <p id="para1">P1</p>
  <p id="para2">P2</p>
</body>

// DOM reference of "p" element with "id" "para1" is stored in the p1 variable
var p1 = document.getElementById("para1")
// Let's remove "para1" from DOM
document.body.removeChild(p1)
// "para1" is removed from DOM but yet still exists in memory.
// Test
p1.innerText
// P1
p1
// <p id="para1">

See? A block of memory is still occupied by the element despite its removal from the DOM. Some developers will still think that the “para1” element is gone. We can even append the “para1” element back to the body using the p1 variable:

document.body.appendChild(p1)

This example might not have a huge impact, but imagine removing a deeply nested DOM tree and still having its DOM reference in memory. A large amount of precious memory stays occupied by that reference, which causes the OS to hand more memory to the application and leaves us with a large memory footprint.

One way to solve this issue is to nullify DOM references kept in variables once the elements are removed. This way the memory they occupied can be freed, giving our program a small memory footprint.

Event listeners

Event listeners like “onclick” can introduce memory leaks in our application. Event listeners mostly have impure function handlers that depend on variables in their parent scope. This dependency keeps those variables marked as active by the GC, so they are never collected while the listener is registered.

Example:

var aGlobalVar = 900
var onClick = (evt) => {
  var result = aGlobalVar * Date.now()
}
buttonElement.addEventListener("click", onClick)

See, this “click” event handler is dependent on the aGlobalVar variable. aGlobalVar will never be collected as long as the event listener uses it, so on the GC’s Mark-and-Sweep cycle the variable lives on. This becomes a memory leak if the “click” listener stays registered on the button element but is never used, or is completely forgotten. aGlobalVar then occupies precious memory space.

One way to avoid this is to clean up our event listeners when we no longer need them.

buttonElement.removeEventListener("click", onClick)

This releases the handler’s hold on aGlobalVar, so the GC can collect it once nothing else references it.
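The add and remove pairing can be tried outside the browser too. The sketch below assumes a runtime where EventTarget and Event are globals (browsers and recent Node.js versions) and uses a plain EventTarget as a stand-in for a real button element.

```javascript
// A plain EventTarget stands in for a real button element.
const button = new EventTarget()
let clicks = 0
const onClick = () => {
  clicks += 1
}

button.addEventListener("click", onClick)
button.dispatchEvent(new Event("click")) // handler runs

// Pass the same function reference that was registered.
button.removeEventListener("click", onClick)
button.dispatchEvent(new Event("click")) // handler no longer runs

console.log(clicks) // 1
```

Keeping the handler in a named variable is what makes removal possible: removeEventListener needs the same function reference, so an inline anonymous function cannot be removed this way.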

Closures

Closures are the harbinger of memory leaks. Their nature makes them so. Closures are so smart that they can remember the variables in their parent scope and use them in their own scope. That’s clever. But it also means the parent scope’s variables are not released when the parent function returns.

I would say things don’t die quickly in JavaScript. Kill it and it’s still alive in memory.

When the parent function of a closure returns, the closure keeps a reference to the parent scope in memory, so callers can still reach the old variables from the closure’s parent scope.

Example:

function noClosure() {
  var cFoo = 900
  function closureFunc() {
    return cFoo
  }
  return closureFunc
}
var closure = noClosure()
closure()
// 900

The cFoo variable will still be kept in memory by closureFunc and returned.

Now, what happens if some of the parent scope’s variables are not used by the closure?

function noClosure() {
  var cFoo = 900
  var cBaz = new Array(100000)
  function closureFunc() {
    return cFoo
  }
  return closureFunc
}
var closure = noClosure()
closure()
// 900

See, we have two variables in the parent scope, cFoo and cBaz. Only cFoo is used by closureFunc, and if the engine keeps the whole parent scope alive, the big ol’ cBaz is kept in memory too! That’s a huge memory leak. (Modern engines such as V8 typically retain only the variables that some inner function actually references, so cBaz is most at risk when another closure created in the same scope uses it.)

Also, every time noClosure is called, a new 100000-slot array is allocated. This results in fast-dwindling resources for our process and would eventually slow down execution, because the OS would struggle to allocate more memory to our program.

We should be careful about how we create closures so as to disallow memory leaks.
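One careful pattern is to pull out only the value the closure needs before returning it, so the closure never references the large object at all. Here is a small sketch of that, with made-up names makeReader, bigData, and readTotal:

```javascript
function makeReader() {
  const bigData = new Array(100000).fill(1)
  // Extract the one value the closure needs...
  const total = bigData.reduce((sum, n) => sum + n, 0)
  // ...so the returned function references `total`, not `bigData`.
  return function readTotal() {
    return total
  }
}

const readTotal = makeReader()
console.log(readTotal()) // 100000
```

Since no inner function references bigData, nothing holds on to the array after makeReader returns, and the GC is free to collect it.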

Timers

Timers such as setInterval and setTimeout are, surprisingly, another source of memory leaks in JavaScript.

Uncleared timers can run forever and hold on to resources that should have been collected.

See this code:

var aGlobalVar = 900
setInterval(() => {
  aGlobalVar + Date.now()
  // ...
}, 1000)

We have a timer set to run every 1s. Its callback depends on the aGlobalVar variable. Now, this variable can’t be garbage collected because the timer depends on it. This introduces a memory leak if the timer is forgotten and remains uncleared: the timer will run forever and aGlobalVar will never be collected.

A solution to this is to clear the timer with the clearInterval API.

var aGlobalVar = 900
var timer = setInterval(() => {
  aGlobalVar + Date.now()
  // ...
}, 1000)
// We are done, then clear the "timer"
clearInterval(timer)

This releases the timer’s hold on aGlobalVar, so it can now be GC’d once nothing else references it. If setTimeout is used, we will use clearTimeout to clear out the timer.

Note: setTimeout runs once at the specified time and doesn’t run again, unlike setInterval. Until it fires, though, its handler remains active and in memory, so its outside dependencies are not released for GC. Using clearTimeout on a pending timeout we no longer need deactivates the handler and frees its dependencies early.
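An easy way to make sure a timer is never forgotten is to hand back a stop function at the moment the timer is created. The startPolling helper below is a hypothetical sketch of that pattern:

```javascript
// Wraps setInterval so every timer comes with its own way to stop it.
function startPolling(task, intervalMs) {
  let running = true
  const timer = setInterval(task, intervalMs)
  return {
    isRunning: () => running,
    stop() {
      clearInterval(timer) // releases the callback and whatever it references
      running = false
    },
  }
}

const poller = startPolling(() => {
  // periodic work would go here
}, 1000)

// We are done, so stop the timer.
poller.stop()
console.log(poller.isRunning()) // false
```

Whoever starts the polling gets the stop function at the same time, which makes it much harder to leave the interval running by accident.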

Conclusion

Memory leaks are not to be taken lightly. No user wants RAM-eating programs, and users will readily discard a program that slows down their machine. So we must take great care to make sure we don’t exhaust the RAM on our users’ machines, and the measures in this post are a starting point.

If you have any questions regarding this or anything I should add, correct or remove, feel free to comment, email, or, DM me.


JS | Blockchain dev | Author of “Understanding JavaScript” and “Array Methods in JavaScript” - https://app.gumroad.com/chidumennamdi 📕