Provisioning Custom CloudFormation Resources Using AWS Lambda

  • by Emre Yilmaz
  • Dec 15, 2017
  • AWS, Serverless, DevOps
  • Istanbul

AWS CloudFormation is the foundation of operational excellence on AWS. We define our infrastructure in JSON or YAML templates and can test it as much as we need before deploying to production. It is simply the infrastructure as code concept.

However, some new resource types may not be supported by CloudFormation at the time they are launched. As of today, one example is the Elastic GPU resource. The solution is to define a custom CloudFormation resource and attach it to a Lambda function which launches these resources. The Lambda function should also be in the same template, and I will describe the process in this blog post.

CloudFormation Custom Resources and Lambda

While provisioning our resources using CloudFormation templates we may need some custom logic. Or we may need a resource type which is not supported by CloudFormation yet. This is where we use custom resources in our CloudFormation templates.

A custom resource is a resource whose Type is AWS::CloudFormation::CustomResource or Custom::<our-custom-resource-name>, such as Custom::MyEC2. Another requirement is that it must have a ServiceToken property whose value is the ARN of a Lambda function or an SNS topic. During stack creation, deletion, and update, CloudFormation sends requests to this service. In our case, it will trigger our Lambda function with some standard request inputs like RequestType and ResponseURL, plus any custom inputs we may define, such as InstanceType or KeyPair.
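For illustration, here is a rough sketch of the event CloudFormation delivers to the function on stack creation. All IDs, ARNs, and URLs below are made-up placeholders, not values from a real stack:

```python
# A sketch of the event dictionary CloudFormation sends on a Create request.
# Every ID, ARN and URL here is a made-up placeholder.
sample_event = {
    "RequestType": "Create",          # or "Update" / "Delete"
    "ResponseURL": "https://cloudformation-custom-resource-response-useast1.s3.amazonaws.com/presigned-url",
    "StackId": "arn:aws:cloudformation:us-east-1:123456789012:stack/my-stack/guid",
    "RequestId": "unique-request-id",
    "ResourceType": "Custom::MyEC2",
    "LogicalResourceId": "CustomServer",
    # Our custom parameters from the template arrive here:
    "ResourceProperties": {
        "ServiceToken": "arn:aws:lambda:us-east-1:123456789012:function:my-function",
        "InstanceType": "t2.medium",
        "KeyName": "my-keypair"
    }
}
```

The standard keys (RequestType, ResponseURL, StackId, RequestId, LogicalResourceId) are always present; everything under ResourceProperties is whatever we declared on the custom resource in the template.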

When the Lambda function runs, it performs the logic we coded; then it should send a response to the pre-signed ResponseURL with some required attributes and any custom data that we want to use later in our template.

This is how it works. Let's continue with an example that launches an EC2 instance with an Elastic GPU attached.

Provisioning EC2 Example

Lambda Function

First of all, we need to create a Lambda function. CloudFormation provides an input structure that we can use in our function. As you can see below, our function uses Boto3 to launch, describe, and terminate the instances we create in this stack.

We use the RequestType key to understand whether this function was called for a create or delete action. If CloudFormation triggered this function with Create in RequestType, we simply create a new EC2 instance with an Elastic GPU attached, using the ResourceProperties dictionary values that CloudFormation passed to the Lambda function in the event. We will define these values in our template; for now, the important thing is that CloudFormation sends our custom parameters in ResourceProperties.

We also created a function to send success or failure responses to CloudFormation using the StackId, RequestId, and LogicalResourceId values supplied by CloudFormation in the Lambda event dictionary. Besides, we use the log_stream_name attribute of the context object to construct a PhysicalResourceId; this is a common pattern. As you can see, CloudFormation supplies all of the required values to the Lambda function. The only remaining work is to make an HTTP PUT request to the pre-signed URL with either a SUCCESS or FAILED status. A missing or empty status is also treated as a failure.

One thing to note is response_data, which is provided in the Data attribute of the response our Lambda function returns to CloudFormation. In this example, we do not use it at all. However, I added it here to show that we can pass custom key-value pairs back and use them in another resource in our template. For example, we could return an "InstanceId" and then reference it in our template like below.

{ "Fn::GetAtt": [ "CustomResourceName", "InstanceId" ] }

Anyway, we do not need this here. Let's see our final Lambda function:

import boto3
import json
import logging
import requests

# Logger settings - CloudWatch
# Set level to DEBUG for debugging, INFO for general usage.
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

client = boto3.client('ec2')

def handler(event, context):
    logger.info("Received event: " + json.dumps(event, indent=2))
    response_data = {}

    try:
        if event["RequestType"] == "Create":

            # Create the EC2 instance and attach an Elastic GPU
            # Tag with StackId to find during deletion
            response = client.run_instances(
                ImageId=event["ResourceProperties"]["ImageId"],
                InstanceType=event["ResourceProperties"]["InstanceType"],
                KeyName=event["ResourceProperties"]["KeyName"],
                MaxCount=1,
                MinCount=1,
                SecurityGroupIds=[
                    event["ResourceProperties"]["SecurityGroupId"]
                ],
                ElasticGpuSpecification=[
                    {
                        'Type': event["ResourceProperties"]["GPUType"]
                    },
                ],
                TagSpecifications=[
                    {
                        'ResourceType': 'instance',
                        'Tags': [
                            {
                                'Key': 'StackId',
                                'Value': event["StackId"]
                            }
                        ]
                    },
                ]
            )
            logger.debug(response)

            response_data["InstanceId"] = response['Instances'][0]['InstanceId']

        elif event["RequestType"] == "Delete":

            # find created instances using StackId
            response = client.describe_instances(
                Filters=[
                    {
                        'Name': 'tag:StackId',
                        'Values': [
                            event["StackId"]
                        ]
                    }
                ]
            )
            logger.debug(response)

            # Collect instance ids across all reservations, not just the first one
            instance_ids = [
                i["InstanceId"]
                for r in response["Reservations"]
                for i in r["Instances"]
            ]
            logger.debug(instance_ids)

            if instance_ids:
                # terminate instances the stack created
                response = client.terminate_instances(
                    InstanceIds=instance_ids
                )
                logger.debug(response)

        # Send SUCCESS response to CloudFormation to notify that resource creation is successful
        send_response( "SUCCESS", event, context, response_data)


    except Exception as e:
        logger.error("An error occurred: {}".format(e))

        # CloudFormation only accepts the literal statuses SUCCESS and FAILED
        send_response("FAILED", event, context, response_data)



def send_response(status, event, context, data):
    headers = {
        "Content-Type": ""
    }
    request_body = {
        "Status": status,
        # Reason is required when Status is FAILED
        "Reason": "See CloudWatch log stream: {}".format(context.log_stream_name),
        "PhysicalResourceId": context.log_stream_name,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data
    }
    logger.debug(request_body)

    response = requests.put( event["ResponseURL"], headers=headers, data=json.dumps(request_body) )
    logger.info("Response status code: {}".format( response.status_code ))

After development, we need to package this function into a zip file and upload it to an S3 bucket. We will reference this bucket in our CloudFormation template. Please let me remind you that you should install the requests package into the same directory where your Lambda function's app.py resides, then select all files and folders and compress them into a zip package.
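As a sketch, the zipping step can also be scripted. The directory names below are my own examples, and you would still run pip install requests -t <src_dir> beforehand so the dependency sits alongside app.py:

```python
import pathlib
import zipfile

def package_lambda(src_dir, zip_path):
    """Zip every file under src_dir at the archive root.

    Lambda expects app.py (and the bundled requests package) at the
    top level of the archive, so we store paths relative to src_dir
    rather than the full paths on disk.
    """
    src = pathlib.Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(src))
```

After producing the archive, upload it to the S3 bucket you will reference from the template, for example with the AWS CLI's aws s3 cp command.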

CloudFormation Template

CloudFormation uses templates to provision our resources. These templates can be JSON or YAML; the only required top-level section is Resources, though attributes like AWSTemplateFormatVersion are commonly included. I will not dive into the details here; please check the AWS documentation which I included in the References section below.

Our template will provision these resources:

  • A custom resource that triggers the Lambda function
  • An AWS Lambda function that our custom resource will trigger
  • An IAM role for our Lambda function with the necessary CloudWatch and EC2 permissions

The relevant part of the Resources section looks like this:
  "CustomServer": {
    "Type": "Custom::EC2InstanceWithElasticGPU",
    "Properties": {
      "ServiceToken": { "Fn::GetAtt" : ["LambdaFunction", "Arn"] },
      "Region": { "Ref": "AWS::Region" },
      "ImageId": { "Ref": "ServerImageId" },
      "KeyName": { "Ref": "KeyName" },
      "InstanceType": { "Ref": "ServerInstanceType" },
      "SecurityGroupId": { "Ref": "ServerSecurityGroup" },
      "GPUType": { "Ref": "GPUType" }
    }
  },

  "LambdaFunction": {
    "Type": "AWS::Lambda::Function",
    "Properties": {
      "Code": {
          "S3Bucket": { "Ref": "LambdaS3Bucket"},
          "S3Key": { "Ref": "LambdaS3Key"}
      },
      "Handler": "app.handler",
      "Role": { "Fn::GetAtt" : ["LambdaExecutionRole", "Arn"] },
      "Runtime": "python3.6",
      "Timeout": "30"
    }
  },

  "LambdaExecutionRole": {
    "Type": "AWS::IAM::Role",
    "Properties": {
      "AssumeRolePolicyDocument": {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": ["lambda.amazonaws.com"]},
            "Action": ["sts:AssumeRole"]
        }]
      },
      "Path": "/",
      "Policies": [
        {
        "PolicyName": "root",
        "PolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [{
              "Effect": "Allow",
              "Action": ["logs:CreateLogGroup","logs:CreateLogStream","logs:PutLogEvents"],
              "Resource": "arn:aws:logs:*:*:*"
          },
          {
              "Effect": "Allow",
              "Action": ["ec2:*"],
              "Resource": "*"
          }]
        }
      }]
    }
  }

You can supply the LambdaS3Bucket, LambdaS3Key, ServerImageId, KeyName, ServerInstanceType, GPUType, and ServerSecurityGroup values in the Parameters section of your template.
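For completeness, the Parameters section could look like the fragment below. The defaults are only examples of mine, not required values; if you define ServerSecurityGroup as a resource (as shown further down), you only need the remaining parameters, otherwise add it here as well.

```json
"Parameters": {
  "LambdaS3Bucket": { "Type": "String" },
  "LambdaS3Key": { "Type": "String", "Default": "function.zip" },
  "ServerImageId": { "Type": "AWS::EC2::Image::Id" },
  "KeyName": { "Type": "AWS::EC2::KeyPair::KeyName" },
  "ServerInstanceType": { "Type": "String", "Default": "t2.medium" },
  "GPUType": { "Type": "String", "Default": "eg1.medium" },
  "VPC": { "Type": "AWS::EC2::VPC::Id" }
}
```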

When you create an Elastic GPU using the AWS CLI or Boto3 as in this post, AWS attaches the same security group IDs to the Elastic Network Interface (ENI) of the Elastic GPU as to the EC2 instance. Hence your security group should include a rule allowing port 2007 access between members of the same security group.

You can also create a ServerSecurityGroup resource to grant RDP (port 3389) access, plus a self-referencing ingress rule as below. Again, you can supply the VPC value in the Parameters section of the template during stack creation.

  "ServerSecurityGroup": {
    "Type": "AWS::EC2::SecurityGroup",
    "Properties": {
      "VpcId": { "Ref": "VPC" },
      "GroupDescription": "Security group for windows servers",
      "SecurityGroupIngress": [
        {
          "CidrIp" : "0.0.0.0/0",
          "FromPort": "3389",
          "ToPort": "3389",
          "IpProtocol": "tcp"
        }
      ],
      "SecurityGroupEgress": [],
      "Tags": [
        {
          "Key": "Name",
          "Value": { "Fn::Sub": "${AWS::StackName}-sg-windows-server" }
        }
      ]
    }
  },
  "ServerSecurityGroupIngressCircular": {
    "Type": "AWS::EC2::SecurityGroupIngress",
    "Properties": {
      "GroupId": { "Ref": "ServerSecurityGroup" },
      "SourceSecurityGroupId": { "Ref": "ServerSecurityGroup" },
      "FromPort": "2007",
      "ToPort": "2007",
      "IpProtocol": "tcp"
    }
  }

Deployment result

When we make an RDP connection to our instance, we can see that Elastic GPU was attached successfully to our Windows EC2 Instance.

Elastic GPU on Windows EC2 Instance

Conclusion

The infrastructure as code concept is one of the most important fundamentals of operational excellence on AWS. With custom resources and AWS Lambda, we have the ability to provision any resource we like, as long as the AWS SDKs support it.

Thanks for reading!

References

...

CEO @ Shikisoft

AWS Certified Solutions Architect & DevOps Engineer - Professional
