Recently, I experimented with connecting to MongoDB databases from AWS Lambda functions using Python. In today’s post, I will share my experiences with you and take some notes about these experiments for future reference. We will install MongoDB on an EC2 instance and develop simple Python functions to access it. Let’s start!
About MongoDB on AWS
Unlike some relational databases such as MySQL or PostgreSQL, MongoDB is not offered by AWS as a managed service yet. Although AWS provides some whitepapers and examples on how you can deploy MongoDB on Amazon EC2 instances, they prioritise DynamoDB over it as a NoSQL database. Therefore, you need to have a MongoDB server installed and running on an Amazon EC2 instance.
Installing MongoDB on Amazon Linux 2
First of all, let me note that this post is not about how to deploy MongoDB on EC2 according to best practices. I will only prepare a MongoDB database to experiment with and connect to from AWS Lambda functions. With that noted, let’s continue and launch an Amazon EC2 instance running the Amazon Linux 2 operating system.
Before installing MongoDB, we need to add the MongoDB repository so that we can install version 3.6.
Then, add these values below and save.
And install MongoDB.
By default, MongoDB only accepts requests from the local instance, which is very similar to a MySQL server. To allow connections from outside, such as from AWS Lambda, we should remove this constraint. We will protect connections to our MongoDB instance using VPC security groups.
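In the /etc/mongod.conf file, you will see a bindIp: 127.0.0.1 setting, which should be updated to bindIp: 0.0.0.0. This makes MongoDB listen on all network interfaces, so it can accept connections from outside. I tried commenting the line out instead, but connections still failed, because since version 3.6 mongod binds only to localhost by default.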
After making this change start MongoDB:
After completing these steps, your MongoDB server will be ready to accept connections. However, we should create a security group that allows AWS Lambda but denies other connections on port 27017, which is the default port of MongoDB.
Security Groups for AWS Lambda and MongoDB
If you recall my post about accessing RDS from AWS Lambda, you may remember that we need to define security groups and subnets for our AWS Lambda function. This is because our MongoDB instance is inside a VPC.
Now, create two security groups as below:
- A security group for AWS Lambda functions with no inbound rule. Let’s call it lambda-sg.
- A security group for MongoDB instances allowing inbound connections on custom TCP port 27017 from the Lambda security group (lambda-sg). Let’s call it mongodb-sg.
While deploying your AWS Lambda functions, attach lambda-sg to them and select all subnets in the same VPC as your MongoDB instance. You should then be able to connect to the MongoDB instance from your AWS Lambda function on port 27017.
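The two groups can also be created programmatically. The following is a minimal sketch, assuming a boto3 EC2 client is passed in as ec2. The function names and the vpc_id argument are illustrative and are not part of the setup described above.

```python
def mongodb_ingress_rule(lambda_sg_id, port=27017):
    """Build an inbound rule allowing TCP `port` only from the Lambda security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Referencing a group id instead of a CIDR range limits access
        # to resources that have that security group attached.
        "UserIdGroupPairs": [{"GroupId": lambda_sg_id}],
    }


def create_security_groups(ec2, vpc_id):
    """Create lambda-sg and mongodb-sg in the given VPC and return their ids.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # A new security group has no inbound rules, which is what lambda-sg needs.
    lambda_sg = ec2.create_security_group(
        GroupName="lambda-sg",
        Description="AWS Lambda functions, no inbound rules",
        VpcId=vpc_id,
    )["GroupId"]
    mongodb_sg = ec2.create_security_group(
        GroupName="mongodb-sg",
        Description="MongoDB instances, reachable from lambda-sg",
        VpcId=vpc_id,
    )["GroupId"]
    ec2.authorize_security_group_ingress(
        GroupId=mongodb_sg,
        IpPermissions=[mongodb_ingress_rule(lambda_sg)],
    )
    return lambda_sg, mongodb_sg
```

Keeping the rule in its own small function makes it easy to check the port and source group before anything is sent to AWS.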
AWS Lambda functions
We created our MongoDB database and now, let’s implement simple AWS Lambda functions which create, update, delete and retrieve users from our database. In these functions, we will use the PyMongo module, which provides tools to work with MongoDB databases.
Deployment notes
Deploying the Lambda functions in this example is no different from deploying any function that accesses VPC resources. You should attach an IAM role with the AWSLambdaVPCAccessExecutionRole managed policy, along with the appropriate security groups and subnets, to your functions. If you need more information, this previous blog post might be useful for you.
AWS Lambda function for creating users
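A minimal sketch of such a function is given here, assuming PyMongo is packaged with the Lambda function. The database name test_db, the get_users_collection helper, the optional users argument of the handler and the body of json_unknown_type_handler are assumptions made for illustration.

```python
import json
import os

_users = None  # cached so that warm invocations reuse the same connection


def json_unknown_type_handler(value):
    """Fallback for json.dumps, called only for values it cannot serialize.

    str() turns an ObjectId into its 24-character hex string.
    """
    return str(value)


def get_users_collection():
    """Connect to MongoDB on first use and return the users collection."""
    global _users
    if _users is None:
        # Imported here so that this module can be loaded without PyMongo,
        # for example in local tests that pass in their own collection.
        from pymongo import MongoClient

        # DB_HOST holds the private IP of the MongoDB instance.
        client = MongoClient(host=os.environ["DB_HOST"], port=27017)
        db = client.test_db  # hypothetical database name
        _users = db.users  # same as db["users"]
    return _users


def handler(event, context, users=None):
    # Lambda calls handler(event, context); `users` exists only for testing.
    if users is None:
        users = get_users_collection()

    user = {
        "first_name": event["first_name"],
        "last_name": event["last_name"],
        "email": event["email"],
    }
    user_id = users.insert_one(user).inserted_id

    # Read the document back to confirm that it was stored.
    created = users.find_one({"_id": user_id})

    # Round-trip through JSON so that _id is returned as a plain string.
    return json.loads(json.dumps(created, default=json_unknown_type_handler))
```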
As you can see, it is a simple function which assumes that first_name, last_name and email attributes are all provided to event when calling the function.
MongoDB creates databases and collections lazily. If they do not exist when a document is created, it creates them at that stage. In the code, we construct a user document, provide it to the collection’s insert_one method and get the created id from the inserted_id attribute of the response. In the end, we retrieve the created document from the database using this id to verify that it was created and include it in the response.
Python’s json module is unable to serialize ObjectId values, which is the type of a MongoDB document’s _id attribute. To solve this issue, we need to use a helper method (json_unknown_type_handler) when returning the created document in JSON format.
In the part initializing the collections, you see users = db.users. Here, users on the right is the collection name, and this expression is equivalent to users = db["users"]. It is simply used to access the collection named users in MongoDB.
A note on database host IP address
We get the MongoDB host IP from the DB_HOST environment variable. You should set this environment variable to the private IP of your MongoDB instance while deploying your AWS Lambda function.
Why should you use the private IP instead of the public one? Well, if you provide the public IP address, the AWS Lambda function has to traverse the Internet to access your instance, and it will time out because we did nothing to configure its Internet access. It is best to use the private IP address.
AWS Lambda function for updating users
This function uses the email of the user to find the document in the MongoDB database and updates it with the first_name or last_name values provided. I will only include the handler method below as the other parts are the same as in the create function.
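As a rough sketch of the update logic, assuming users is the PyMongo collection obtained as in the create function, the core of the handler could look like this. The helper name update_user is hypothetical, and the handler would call it with the collection and the incoming event.

```python
def update_user(users, event):
    """Update first_name and/or last_name of the user with the given email."""
    # Only the name fields present in the event are changed.
    changes = {
        key: event[key]
        for key in ("first_name", "last_name")
        if key in event
    }
    # MongoDB rejects an empty $set, so skip the update when nothing changed.
    if changes:
        users.update_one(
            {"email": event["email"]},  # filter
            {"$set": changes},          # attributes to update
            upsert=False,               # do not insert if no document matches
        )

    # Return the document as stored after the update (None if not found).
    return users.find_one({"email": event["email"]})
```

The returned document still contains an ObjectId in _id, so it needs the same JSON handling as in the create function before it is sent back to the caller.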
Here, I kept the function simple and did not implement any validations, but you should definitely add them if you are developing for production.
In the function, the update_one method does the trick. It takes three parameters:
- A filter in JSON format to find the document. Here we filtered using the email, but we can provide the _id value as well. Then the filter would use the _id key, with the document’s ObjectId as its value, instead of the email key.
- The attributes to be updated and their values in a JSON object under the $set operator. For example, if you provide only first_name with the value Emre, this becomes {"$set": {"first_name": "Emre"}}.
- Whether an insertion will occur if the document is not found in the database. I chose not to do it using upsert=False.
In the end, the function retrieves the updated document and returns it.
AWS Lambda function for retrieving a single user
In this function, I will use the email of the user to retrieve it from the database. It is a short, simple function. Again, I only included the handler part of the Lambda function.
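A minimal sketch of the lookup, again assuming users is the PyMongo collection from the create function. The helper name get_user is hypothetical.

```python
def get_user(users, event):
    """Return the user with the given email, or None if there is no match."""
    user = users.find_one({"email": event["email"]})
    if user is not None:
        # Convert the ObjectId to a string so the result is JSON-friendly.
        user["_id"] = str(user["_id"])
    return user
```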
This function uses the find_one method of the collection. Actually, we already used it in the previous functions. It takes a filter in JSON, for which we used the email attribute.
AWS Lambda function for deleting a single user
To delete the user, I will use their email. It is a short, simple function and I am including all of it, as it only deletes the user and returns a response for a successful operation.
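The deletion logic can be sketched as follows, assuming users is the PyMongo collection from the create function. The helper name delete_user and the shape of the returned response are assumptions.

```python
def delete_user(users, event):
    """Delete the user with the given email and report whether one was removed."""
    result = users.delete_one({"email": event["email"]})
    # deleted_count is 0 when no document matched the filter.
    return {"deleted": result.deleted_count == 1}
```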
The delete_one method is used for deletion and it is very similar to find_one. It takes a filter in JSON and again, you can use _id instead of email.
Additional notes for Python and MongoDB
Listing documents in a collection
Retrieving all documents from MongoDB using Python is simple.
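A short sketch, assuming users is a PyMongo collection. The helper name list_users is hypothetical.

```python
def list_users(users):
    """Return all documents in the collection as a list."""
    # find() without arguments matches every document and returns a cursor,
    # which list() consumes completely.
    return list(users.find())
```

For large collections, iterating over the cursor with a for loop avoids loading every document into memory at once.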
To filter the documents, just pass a filter to the find method.
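For example, a filter on first_name could be sketched like this, assuming users is a PyMongo collection. The helper name print_users_named is hypothetical.

```python
def print_users_named(users, first_name):
    """Print and return every user document with the given first name."""
    matches = list(users.find({"first_name": first_name}))
    for user in matches:
        print(user)
    return matches
```

Calling print_users_named(users, "Emre") passes {"first_name": "Emre"} as the filter.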
This code retrieves and prints all user documents having Emre as the first name.
Adding a new attribute
Unlike relational databases, NoSQL databases are very convenient if your documents have different numbers of attributes. You can easily add a new field while updating the document. For example, let’s say that our document does not have an attribute named link and we would like to store link information for some users. All we need to do is to add the new attribute and its value under the $set operator in the update_one method.
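A minimal sketch of this, assuming users is a PyMongo collection and the user is identified by email. The helper name set_user_link is hypothetical.

```python
def set_user_link(users, email, link):
    """Add (or overwrite) the link attribute of the user with the given email."""
    # $set creates the field if the document does not have it yet.
    result = users.update_one({"email": email}, {"$set": {"link": link}})
    return result.modified_count
```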
Removing an attribute from a document
Adding a new attribute was simple, right? So is removing an existing one. Now, let’s say that we need to remove the link attribute previously added. In this case we use the $unset operator and, as the name suggests, it unsets a set attribute. We provide the attribute name with an empty value and it’s gone.
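The removal can be sketched in the same way, assuming users is a PyMongo collection. The helper name remove_user_link is hypothetical.

```python
def remove_user_link(users, email):
    """Remove the link attribute from the user with the given email."""
    # The value given to $unset is ignored; an empty string is the convention.
    result = users.update_one({"email": email}, {"$unset": {"link": ""}})
    return result.modified_count
```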
Connecting to a MongoDB instance using authentication
In my examples, I mostly used unauthenticated connections and protected the connection between AWS Lambda functions and the MongoDB instance using VPC security groups. However, it is good practice to secure your environment in all layers. So, in case you activate authentication on your MongoDB instance, the only thing you need to do is to edit your client’s initialization line.
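One way to build such a connection is sketched here, assuming PyMongo is available and the credentials are passed in a MongoDB connection URI. The helper names build_mongo_uri and get_client are hypothetical.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(host, username, password, port=27017):
    """Build a MongoDB connection URI with percent-encoded credentials."""
    # Encoding keeps characters such as '@' or ':' in the credentials
    # from breaking the URI.
    return "mongodb://{}:{}@{}:{}".format(
        quote_plus(username), quote_plus(password), host, port
    )


def get_client():
    """Create an authenticated client from the Lambda environment variables."""
    from pymongo import MongoClient  # imported lazily; requires PyMongo

    uri = build_mongo_uri(
        os.environ["DB_HOST"],
        os.environ["DB_USERNAME"],
        os.environ["DB_PASSWORD"],
    )
    return MongoClient(uri)
```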
This connection retrieves username and password values from the DB_USERNAME and DB_PASSWORD environment variables respectively.
Conclusion
It was enjoyable for me to set up a MongoDB database on an EC2 instance in minutes and try Python’s pymongo module within AWS Lambda. Nowadays, NoSQL databases are very popular and MongoDB is among them.
While deploying these Lambda functions, I used the AWS Serverless Application Model to simplify and automate the process, and I plan to discuss it soon along with SAM CLI.
I hope it was a useful post and you enjoyed it, too.
Thanks for reading!