Optimizing Serverless in AWS

Tips to improve Serverless web app performance on the AWS Cloud Platform

Chameera Dulanga
Bits and Pieces



Serverless has become one of the hottest topics in the development world, and there are many reasons for its popularity.

One of the core reasons is that Serverless takes care of underlying complexities such as scaling, security, and availability, while allowing you to focus on your code.

However, optimizing a Serverless application requires specific expertise. Besides, rapid innovation in the Serverless space means many of the methods you will find online are already obsolete. In this article, I will discuss modern approaches to optimizing Serverless web apps for better performance.

1. Avoid Lambda Cold Starts with Provisioned Concurrency

When developing APIs with Serverless services in AWS, we use API Gateway and Lambda to run our backend code, and the Lambda functions start upon receiving events from API Gateway.

Lambda has become very cost-effective due to this on-demand invocation model. On the other hand, the time taken to start a function, known as a cold start, directly impacts application performance.

AWS provides two methods to overcome this problem, and you can choose one based on your requirements and cost.

Using Lambda Ping

The first solution is to use a CloudWatch Events rule with a schedule to ping the Lambda function regularly (every 5 or 10 minutes).

CloudWatch Event Rule
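If you prefer to create the rule programmatically rather than in the console, a sketch using the AWS SDK for JavaScript (v3) might look like the following. The rule name and function ARN are placeholders for your own resources.

```typescript
import {
  EventBridgeClient,
  PutRuleCommand,
  PutTargetsCommand,
} from "@aws-sdk/client-eventbridge";

const events = new EventBridgeClient({ region: "us-east-1" });

async function createWarmupRule() {
  // Create a scheduled rule that fires every 5 minutes.
  await events.send(
    new PutRuleCommand({
      Name: "warm-my-api-handler",
      ScheduleExpression: "rate(5 minutes)",
    })
  );

  // Point the rule at the Lambda function (ARN is a placeholder).
  await events.send(
    new PutTargetsCommand({
      Rule: "warm-my-api-handler",
      Targets: [
        {
          Id: "warm-target",
          Arn: "arn:aws:lambda:us-east-1:123456789012:function:my-api-handler",
        },
      ],
    })
  );
  // Note: the function also needs a resource-based policy allowing
  // events.amazonaws.com to invoke it (Lambda AddPermission).
}

createWarmupRule().catch(console.error);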

Usually, a Lambda execution environment gets about 10 to 15 minutes of idle time; if the function isn't invoked during that time, the execution environment is removed. So, with this approach, we can keep the execution environment up and running continuously.

It's essential to keep an exit path in your code that detects the scheduled event and returns immediately, keeping the cost of these warm-up invocations low.
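A minimal handler sketch for that exit path might look like this; it assumes the warm-up ping arrives as a scheduled event (these carry the source "aws.events"), and the response bodies are placeholders.

```typescript
// Sketch: exit early when the event is the scheduled warm-up ping.
// CloudWatch/EventBridge scheduled events arrive with source "aws.events".
export const handler = async (event: any) => {
  if (event.source === "aws.events") {
    // Warm-up ping: return immediately so the invocation stays cheap.
    return;
  }

  // Normal API Gateway request handling goes here.
  return {
    statusCode: 200,
    body: JSON.stringify({ message: "Hello from Lambda" }),
  };
};
```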

Using Lambda Provisioned Concurrency

The second solution is to use Lambda Provisioned Concurrency to create pre-warmed execution environments. You are free to decide how many pre-warmed environments you need.

Once you define the number of pre-warmed environments, Lambda downloads your function code with its dependencies and initializes the environments ahead of time. So, whenever a request arrives from API Gateway, the Lambda service uses one of the pre-warmed environments instead of provisioning a new execution environment each time.
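Configuring this with the AWS SDK for JavaScript (v3) might look like the sketch below. The function name and alias are placeholders, and Provisioned Concurrency must target a published version or alias rather than $LATEST.

```typescript
import {
  LambdaClient,
  PutProvisionedConcurrencyConfigCommand,
} from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({ region: "us-east-1" });

// "my-api-handler" and the "live" alias are placeholders for your function.
lambda
  .send(
    new PutProvisionedConcurrencyConfigCommand({
      FunctionName: "my-api-handler",
      Qualifier: "live", // a published version or alias, not $LATEST
      ProvisionedConcurrentExecutions: 5, // pre-warmed environments to keep
    })
  )
  .catch(console.error);
```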

Make sure you apply Provisioned Concurrency only to frequently invoked Lambda functions, since it incurs an additional hourly charge on top of the usual invocation costs.

2. Use Amazon RDS Proxy to Manage Database Connections

Traditionally, web apps use a limited number of database connections, which are shared among multiple requests through a connection pool.

Serverless applications can't follow the same mechanism, since Lambda functions act as isolated units and don't share data. Therefore, each Lambda function creates its own database connection.

Connecting Multiple Lambda Functions with RDS

On the other hand, Lambda functions auto-scale under heavy load, and sometimes there can be hundreds of concurrent function instances.

In such cases, your RDS instance gets overwhelmed opening and closing a large number of DB connections, and your application's performance degrades drastically.

You can avoid these situations by using Amazon RDS Proxy. It sits between your Lambda functions and your RDS instance as a managed connection pool, efficiently managing the database connections.

Connecting Multiple Lambda Functions with RDS using Amazon RDS Proxy

RDS Proxy can handle many DB connection requests by queuing or rejecting them. Although this queuing mechanism increases request durations, the added latency is nothing compared to a connection failure.

Apart from that, RDS Proxy lets you switch between databases in an emergency while preserving active connections, and lets you authenticate using IAM roles instead of keeping DB credentials in your code.
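A minimal sketch of IAM-based authentication through the proxy, using the AWS SDK's RDS signer together with the mysql2 client; the proxy endpoint, username, and database name are placeholders.

```typescript
import { Signer } from "@aws-sdk/rds-signer";
import mysql from "mysql2/promise";

// Proxy endpoint, username, and database below are placeholders.
const PROXY_HOST = "my-proxy.proxy-abc123.us-east-1.rds.amazonaws.com";

const signer = new Signer({
  hostname: PROXY_HOST,
  port: 3306,
  username: "lambda_user",
  region: "us-east-1",
});

export const handler = async () => {
  // A short-lived IAM auth token replaces a stored database password.
  const token = await signer.getAuthToken();

  const connection = await mysql.createConnection({
    host: PROXY_HOST,
    user: "lambda_user",
    database: "appdb",
    password: token,
    ssl: "Amazon RDS", // IAM authentication requires TLS
  });

  const [rows] = await connection.query("SELECT NOW() AS now");
  await connection.end();
  return rows;
};
```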

3. Use the Right Lambda Memory Size

The memory size of a Lambda function ranges from 128 MB to 3 GB. You shouldn't pick a size arbitrarily, because CPU allocation, execution time, and cost all depend on the memory size you choose.

So, how can you decide the best memory configuration?

You can start with the smallest memory footprint, run the function, and check the CloudWatch logs to see how much memory was consumed. Then you can gradually increase the memory as needed. For this demonstration, I've started with 1024 MB of memory.
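If you'd rather adjust the setting programmatically between test runs than through the console, a sketch like this could work; the function name is a placeholder.

```typescript
import {
  LambdaClient,
  UpdateFunctionConfigurationCommand,
} from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({ region: "us-east-1" });

// "my-api-handler" is a placeholder; change MemorySize between test runs.
lambda
  .send(
    new UpdateFunctionConfigurationCommand({
      FunctionName: "my-api-handler",
      MemorySize: 1024, // in MB; CPU share scales with this value
    })
  )
  .catch(console.error);
```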

CloudWatch Logs

As you can see in the above logs, only 124 MB of the allocated 1024 MB was used, so we can reduce the memory allocation.

However, this isn't entirely straightforward, since Lambda memory size is directly tied to CPU allocation, execution time, and function scalability.

Instead of tuning this manually, you can use a specialized tool to find the memory allocation that gives optimized execution time at minimum cost.

AWS Lambda Power Tuning is a third-party tool (built on AWS Step Functions) that executes your function with varying memory configurations and measures the execution time.

The tool takes the ARN of your Lambda function as input, automates activities like sending HTTP requests, making SDK calls, and triggering cold starts in your AWS account, and analyzes the logs to recommend the best configuration.
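Once the Power Tuning state machine is deployed, kicking off a tuning run might look like the sketch below. Both ARNs are placeholders, and the input fields follow the tool's documented format.

```typescript
import { SFNClient, StartExecutionCommand } from "@aws-sdk/client-sfn";

const sfn = new SFNClient({ region: "us-east-1" });

// Both ARNs below are placeholders for your deployed resources.
sfn
  .send(
    new StartExecutionCommand({
      stateMachineArn:
        "arn:aws:states:us-east-1:123456789012:stateMachine:powerTuningStateMachine",
      input: JSON.stringify({
        lambdaARN:
          "arn:aws:lambda:us-east-1:123456789012:function:my-api-handler",
        powerValues: [128, 256, 512, 1024, 1536, 3008], // memory sizes to test
        num: 50, // invocations per memory configuration
        payload: {}, // test event sent to the function
      }),
    })
  )
  .catch(console.error);
```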

Generated Visualization from Lambda Power Tuning (Source: GitHub)

You can also generate visualizations with the Power Tuning tool. According to the above graph (from a different example), the most efficient and cheapest memory allocation is 1.5 GB.


4. Use AWS Hyperplane to Improve Network Performance

By default, Lambda functions run in AWS-managed VPCs (visible to the user as no VPC assigned). However, there are use cases where you need to place a Lambda function inside your own VPC to access resources within that network.

When you place a Lambda function inside a VPC, each function is assigned an Elastic Network Interface (ENI) to communicate with other resources. Using an ENI has several drawbacks:

  • Lambda concurrency is limited, since there is a maximum number of ENIs available.
  • ENI creation adds extra time to each Lambda invocation.

AWS addressed these drawbacks at the end of 2019 by introducing an internal capability called AWS Hyperplane, a network virtualization platform that creates connections between the Lambda service's VPC and the user's VPC.

How Hyperplane NAT works

With AWS Hyperplane, ENIs are created at function creation time, so the delay is avoided at invocation time.

Similarly, the ENI limit is no longer a problem, because ENIs can now be shared among multiple execution environments. The best part is that this functionality is built in and designed to reduce cold start times, so we don't have to do anything extra.

Conclusion

The enhancements discussed here are generally applicable to any Serverless web app project, and they will give your app a significant performance boost.

However, these aren't the only practices available. You can also look at caching your Serverless frontend files, serving them through a CDN, and so on, to further optimize your Serverless app.

I hope this article was useful. If you have any questions, ask in the comments below.
