Running AWS Lambda Functions in a VPC and Accessing RDS

by Emre Yilmaz
Oct 23, 2017

AWS • Serverless • Databases • AWS Lambda • Amazon RDS
Istanbul


AWS Lambda lets us run code without maintaining servers and pay only for the resources allocated while the code runs. In most cases, we do not need to run our Lambda function in a VPC, and as a best practice it is recommended not to do so in these situations. Examples include sending emails using Amazon SES or accessing a DynamoDB table.

However, to access an RDS database that is not publicly accessible from your Lambda function, you have to run the function in the same VPC or in a VPC that has a peering connection to the VPC of your RDS instance. In this blog post, I would like to discuss running AWS Lambda functions in a VPC and accessing an RDS MySQL database.

What is AWS Lambda?

As this is my first post about AWS Lambda, let me give a brief introduction. AWS Lambda is a compute service provided by AWS that lets us develop small functions in supported programming languages and run them in response to specified events, without maintaining servers and while paying only for the resources allocated when the code runs.

The benefits are clear. First of all, we do not need to provision or maintain servers to run our functions. In fact, every Lambda function runs in its own container and has predefined resources such as memory and maximum running time.

Secondly, it is cost effective. For example, let’s assume that we need an API. We can develop our API using Ruby on Rails or another framework, but then we need to provision EC2 instances that are always running, even when nobody is accessing them. Alternatively, we can create an API using Amazon API Gateway, which triggers AWS Lambda functions for each HTTP request, and pay only when the API is called.

Third, as in our API Gateway example, AWS Lambda is integrated with many AWS services. We can trigger our Lambda functions when an API call is made, an object is uploaded to our S3 bucket, an item is created in our DynamoDB table, and so on.

AWS Lambda has some limits. At the time of writing, the maximum running time for a Lambda function is 5 minutes, and AWS simply stops its execution when it exceeds this time. We define how much memory should be allocated for each run, up to a maximum of 1536 MB. Hence, functions should be small by design and do only one job.

Despite these differences from an EC2 instance, an AWS Lambda function behaves much like an EC2 instance while the code runs. We attach an IAM role to our function to give it access to other AWS resources, as we do for our EC2 instances. Accessing a resource in a VPC is no different from an EC2 instance accessing the same resource: we define security groups, select the subnets to run the code in, and the resource we try to access must grant access to our function.

Accessing an RDS Instance from an AWS Lambda function

In this section, I will walk through an example of running a Lambda function in a VPC and accessing an RDS instance that runs in one of our private subnets. We will configure our Lambda function to run in the private subnets of our VPC as well.

Prerequisites

I will not dive into creating a VPC with public and private subnets or creating an RDS instance. Therefore, we assume the following prerequisites:

We have a VPC with 2 or 3 private subnets and 2 or 3 public subnets, depending on the number of availability zones in the AWS region we choose.
We have an Amazon RDS MySQL instance whose DB subnet group consists of our private subnets. In this RDS instance, we have a database and a table that we want to access.
All traffic between the private subnets in our VPC is allowed by their network ACLs.

Creating a Security Group for our AWS Lambda Function

You can use an existing VPC security group for your Lambda function. However, I recommend creating a separate security group in your VPC to keep its rules isolated from the others. Therefore, we go to the VPC dashboard and create a security group with no ingress rules, because there will be no incoming connections to our Lambda function. We give it a name like my-lambda-sg and note this name.

Make sure to select the VPC that your RDS instance was launched into. We will use this security group when we create our Lambda function.

Creating a security group for your AWS Lambda function
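If you prefer scripting to the console, the same step can be sketched with boto3. This is only an illustration: it assumes boto3 is installed and AWS credentials are configured, and the VPC ID you pass in and the description text are placeholders of mine, not values from this post.

```python
def build_security_group_request(vpc_id, name="my-lambda-sg"):
    """Build the arguments for the EC2 create_security_group call.

    We add no ingress rules afterwards, because nothing connects
    to the Lambda function itself.
    """
    return {
        "GroupName": name,
        "Description": "Security group for the Lambda function",  # placeholder text
        "VpcId": vpc_id,
    }


def create_lambda_security_group(vpc_id):
    """Create the group and return its ID (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    ec2 = boto3.client("ec2")
    response = ec2.create_security_group(**build_security_group_request(vpc_id))
    return response["GroupId"]
```

The returned group ID is what we would reference later in the RDS security group and in the Lambda network settings.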

Granting access to Lambda function’s Security Group in RDS instance’s Security Group

To grant the members of our Lambda function’s security group access to the RDS instance, we need to define an ingress rule for the database port in the RDS instance’s security group. Here, be sure to select your Lambda function’s security group in the Source field.

Granting access to AWS Lambda function Security Group in RDS instance Security Group
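As a rough sketch, the ingress rule can also be added with boto3. I assume here that the database listens on the default MySQL port 3306 and that you already know both security group IDs; the function names are my own.

```python
def build_mysql_ingress_rule(rds_sg_id, lambda_sg_id, port=3306):
    """Arguments for authorize_security_group_ingress on the RDS group.

    The source is the Lambda function's security group rather than a
    CIDR range, which mirrors the Source field in the console.
    """
    return {
        "GroupId": rds_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "UserIdGroupPairs": [{"GroupId": lambda_sg_id}],
            }
        ],
    }


def allow_lambda_to_reach_rds(rds_sg_id, lambda_sg_id):
    """Apply the rule (needs AWS credentials)."""
    import boto3  # lazy import keeps the builder usable without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        **build_mysql_ingress_rule(rds_sg_id, lambda_sg_id)
    )
```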

Create an IAM role with necessary VPC permissions

When configured to run in a VPC, an AWS Lambda function needs to create an Elastic Network Interface. Hence, our Lambda function needs an IAM role that has these basic VPC permissions. AWS has defined a managed policy for this, so we will create an IAM service role for AWS Lambda and attach this policy.

We go to the IAM Management Console and click Create Role. In the AWS Service tab, we select Lambda by clicking on it and then click Next: Permissions.

Creating an IAM role for the AWS Lambda function

In the Attach permissions policies section, we search for AWSLambdaVPC, select AWSLambdaVPCAccessExecutionRole and click Next: Review.

Attaching VPC permissions in AWS Lambda function's IAM role

We give the role a name such as lambda-vpc-execution and create it. This role will grant our Lambda function the necessary permissions for VPC access and CloudWatch Logs, which is where AWS Lambda writes its logs.
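For reference, here is a minimal sketch of the same role created with boto3. It assumes boto3 and credentials are available; the helper names are mine. The trust policy is what lets the Lambda service assume the role, and the console adds it for us when we pick Lambda as the service.

```python
import json

# ARN of the AWS managed policy selected in the console step
VPC_ACCESS_POLICY_ARN = (
    "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
)


def build_lambda_trust_policy():
    """Trust policy that allows the Lambda service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_lambda_vpc_role(role_name="lambda-vpc-execution"):
    """Create the role, attach the managed policy, return the role ARN."""
    import boto3  # lazy import; only needed when talking to AWS

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(build_lambda_trust_policy()),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=VPC_ACCESS_POLICY_ARN)
    return role["Role"]["Arn"]
```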

You can also grant other permissions such as S3 bucket access if you need to do so. For now, we are fine with this setting.

AWS Lambda function code example

For this post, I developed a small Lambda function using Python that returns all records from a table in a database in our RDS instance.

The app.py file code is as follows:

import logging
import pymysql
import json
import os

# Logger settings - CloudWatch
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

# Connect to the RDS instance
logger.info("Connecting to MySQL database")
conn = pymysql.connect(
    host=os.environ['db_host'],
    port=int(os.environ['db_port']),
    user=os.environ['db_user'],
    passwd=os.environ['db_password'],
    db=os.environ['db_name'],
    connect_timeout=3,
)
logger.info("SUCCESS: Connection to MySQL database succeeded")

def handler(event, context):

    logger.info("Received event: " + json.dumps(event, indent=2))

    with conn.cursor() as cur:
        select_statement = "select * from `{}`;".format( os.environ['table_name'] )
        logger.debug(select_statement)
        cur.execute(select_statement)
        result = cur.fetchall()
        # default=str keeps logging from failing on dates, decimals, etc.
        logger.debug("Result: " + json.dumps(result, indent=2, default=str))

    return result

Here, we applied some best practices. First of all, we get the database connection settings from environment variables that will be defined on our Lambda function. We should do this wherever possible, both to keep the code reusable and to avoid hard-coding our database credentials, which is a security risk.
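One optional refinement is to validate the environment variables up front so that a missing setting produces a clear error instead of a KeyError. The helper below is a sketch of that idea and is not part of the function above; load_db_settings is a name I made up.

```python
import os

# The variable names used by app.py in this post
REQUIRED_VARS = (
    "db_host", "db_port", "db_user", "db_password", "db_name", "table_name",
)


def load_db_settings(environ=None):
    """Return the settings as a dict, or raise if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    settings = {name: environ[name] for name in REQUIRED_VARS}
    settings["db_port"] = int(settings["db_port"])  # the port arrives as a string
    return settings
```

Passing a plain dict as environ makes the helper easy to try locally without touching the real environment.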

We have a function named handler, which is the entry point that runs when our Lambda function is triggered. We can name it something else, but then we have to tell Lambda the name of the handler function when creating the Lambda function.

Besides these, we put the connection code outside of the handler function. Therefore, subsequent runs in the same container will not need to re-establish the MySQL connection each time.
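A variation on this idea is to open the connection lazily and check it before each reuse, since the server may close a connection that sat idle for a long time. The sketch below is my own and keeps the database driver out of the picture: connect is any zero-argument callable that opens a connection, for example a small wrapper around pymysql.connect. PyMySQL connections provide ping(reconnect=True), which is what the reuse branch relies on.

```python
_connection = None  # survives between invocations while the container is warm


def get_connection(connect):
    """Return the cached connection, creating it on first use.

    `connect` is a hypothetical zero-argument factory supplied by the
    caller; it is only called when no connection is cached yet.
    """
    global _connection
    if _connection is None:
        _connection = connect()
    else:
        # Reopen the connection if the server closed it in the meantime
        _connection.ping(reconnect=True)
    return _connection
```

Inside the handler we would then call get_connection(...) instead of using a module-level conn directly.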

Packaging our Lambda function

We should package our Lambda function code before creating the function. As you can see, we import the logging, pymysql, os and json modules. Lambda's Python runtime includes logging, os and json by default, but not pymysql, so we need to include it in our package.

To do this, we install the pymysql module into the directory on our workstation where our Lambda function code resides.

An example for Unix-based systems is below.

 $ pip install pymysql -t .
Collecting pymysql
  Using cached PyMySQL-0.7.11-py2.py3-none-any.whl
Installing collected packages: pymysql
Successfully installed pymysql-0.7.11

This will install the pymysql module and create the pymysql and PyMySQL-0.7.11.dist-info folders in the same directory. Your distribution version number may differ, but the process is the same.

Finally, we create a zip package that includes all files and folders in our Lambda function folder. Make sure that you do not zip the enclosing folder itself; instead, select all the files inside it and package them.
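To make the "do not zip the enclosing folder" point concrete, here is a small sketch using Python's standard zipfile module. The function name and the paths in the usage note are placeholders of mine. Each file is stored under its path relative to the function folder, so app.py ends up at the root of the archive.

```python
import os
import zipfile


def zip_folder_contents(folder, zip_path):
    """Zip everything inside `folder` so files sit at the archive root."""
    folder = os.path.abspath(folder)
    zip_path = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                full_path = os.path.join(root, name)
                if full_path == zip_path:
                    continue  # do not add the archive to itself
                # Store the path relative to the folder, not the folder itself
                archive.write(full_path, os.path.relpath(full_path, folder))
```

For example, zip_folder_contents("my-function", "package.zip") would produce an archive containing app.py and pymysql/ at the top level.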

Creating the Lambda function and uploading the code

In this section, I will show you how to create a Lambda function using the AWS console and upload the code. Then, in the next sections, I will discuss VPC and security group settings.

We go to the Lambda Management Console and click Create Function. We click Author from scratch, since we will not use a blueprint. Then, we give our function a name, select Choose an existing role in the Role list and select the role we created for our Lambda function in the previous sections. After that, we click Create function.

AWS Lambda Function Creation on AWS Lambda Console

In the Configuration tab, we should set the Handler accordingly. In our case, we set it to app.handler, because our Python script’s file name is app.py and the function name in this script is handler.

We select Python 3.6 from the Runtime list and Upload a .ZIP file from the Code entry type list. Finally, we click Upload to select the zip archive that we created in the previous section.

Uploading the Python package to the AWS Lambda function

AWS Lambda Environment Variables

Although it is not mandatory, recall that as a best practice we read the database connection parameters from environment variables. To define them, we expand the Environment variables section and enter the db_host, db_name, db_port, db_user, db_password and table_name variable names along with their values.

You can skip this section if you hardcoded your database connection parameters in your Lambda function, although I do not recommend doing so.

Network and VPC Settings

Here is the core of this blog post. We will define the VPC settings for our Lambda function. When we created our RDS instance, we provided a VPC and selected a subnet group that the instance would be launched into. We will configure our Lambda function to run in these same private subnets so that it has access to the RDS instance.

To achieve this, we expand the Network section and select our VPC in the VPC list. Then we select all private subnets in the Subnets list.

Finally, we select the security group we created for our Lambda function from the Security Groups list and click Save to complete the creation process.
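The same network settings can be applied with boto3 through the VpcConfig parameter of update_function_configuration. The sketch below assumes boto3 and credentials are available; the function name and the IDs you pass in are placeholders.

```python
def build_vpc_config(subnet_ids, security_group_ids):
    """Build the VpcConfig structure expected by the Lambda API."""
    if not subnet_ids or not security_group_ids:
        raise ValueError(
            "at least one subnet and one security group are required"
        )
    return {
        "SubnetIds": list(subnet_ids),              # our private subnets
        "SecurityGroupIds": list(security_group_ids),  # e.g. my-lambda-sg's ID
    }


def attach_function_to_vpc(function_name, subnet_ids, security_group_ids):
    """Apply the VPC settings to an existing function (needs credentials)."""
    import boto3  # lazy import; only needed when talking to AWS

    client = boto3.client("lambda")
    client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig=build_vpc_config(subnet_ids, security_group_ids),
    )
```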

Testing our AWS Lambda function

To test the function, we click the Test button, which opens the Configure test event window. We simply give the event a name like ‘myTest’, enter an empty input {} as the test data and click Create.

Creating a test event for the AWS Lambda function running in a VPC

After the test event is created, we click the Test button again and our Lambda function should run successfully as below.

Test result for the AWS Lambda function running in a VPC

You can define different test events and conduct multiple tests for your Lambda function.
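The function can also be invoked outside the console. As a hedged sketch, the snippet below sends the same empty event with boto3 and decodes the response. It assumes credentials are configured; the helper names are mine. When the function raises an error, the response contains a FunctionError key, which the parser turns into an exception.

```python
import json


def parse_invoke_response(response):
    """Decode the result of Lambda invoke, raising if the function failed."""
    payload = json.loads(response["Payload"].read())
    if "FunctionError" in response:
        raise RuntimeError("Lambda function failed: {}".format(payload))
    return payload


def invoke_with_empty_event(function_name):
    """Invoke the function synchronously with {} as the event."""
    import boto3  # lazy import; only needed when talking to AWS

    client = boto3.client("lambda")
    response = client.invoke(FunctionName=function_name, Payload=b"{}")
    return parse_invoke_response(response)
```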

Conclusion

Running an AWS Lambda function in a VPC and accessing an RDS database from it involves several steps, such as defining security groups and creating IAM roles. In this post, I tried to explain these steps with an example that accesses an RDS database.

If your Lambda function’s MySQL connection times out, check your VPC settings and verify that your RDS instance’s subnets have correct network ACLs and that its security group has the correct rules to grant access to your Lambda function. If you know VPCs well, running Lambda in a VPC is no different from running an EC2 instance in a VPC.

I hope this was useful for you. In future posts, I am planning to dive into providing Internet access to AWS Lambda functions in a VPC.
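When debugging such timeouts, it can help to separate network reachability from MySQL itself. The following sketch, which is not part of the function in this post, only checks whether a TCP connection to the database host and port can be opened. Run temporarily inside the Lambda function, a False result would point to security groups, network ACLs or subnet selection rather than to credentials.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures
        return False
```

Inside the handler we could, for instance, log can_reach(os.environ['db_host'], int(os.environ['db_port'])) before attempting the MySQL connection.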

Thanks for reading!