Wednesday, February 11, 2015

AWS CodeDeploy: An In-Depth First Look

At its 2014 re:Invent conference in Las Vegas, Amazon announced CodeDeploy, a tool designed to simplify the process of deploying applications to groups of servers, sometimes numbering in the hundreds. The primary objective of CodeDeploy is to make deployments consistent, repeatable, and integrated with existing AWS services (you can complain about vendor lock-in now, but AWS is doing a great job of providing value for that lock-in).

I took a few hours to set up CodeDeploy and documented the issues I ran into. This post is the result of a few hours of playing with the service and trying to get it running on an Ubuntu 12.04 server (despite 14.04 being the only "officially" supported version).

First Impressions

At first glance, CodeDeploy really seems like a game changer: it's built by Amazon, integrated with their services, and a convenient way to do rolling, all-at-once, or grouped deployments. Once I got past the first few issues, it felt like a solid product. Of course, given its recent release, it lacks much online support and discussion, which left me manually digging through error logs and support forums when I hit dependency issues. While the documentation is pretty decent, there are currently only about thirty questions in the AWS forums about CodeDeploy. The biggest issue I found was that I had to add an alternative source for ruby2.0 on Ubuntu 12.04 and install it myself before continuing - but that was not the fault of CodeDeploy.

IAM Setup

CodeDeploy requires a moderate amount of setup to get working properly. The most error-prone aspect is creating the appropriate IAM roles for both the CodeDeploy service and the instances. First, I created the CodeDeploy IAM role with the following policy:

{
    "PolicyName" : "AWSCodeDeployPolicy",
    "PolicyDocument" : {
        "Statement": [
            {
                "Action": [
                    "autoscaling:PutLifecycleHook",
                    "autoscaling:DeleteLifecycleHook",
                    "autoscaling:RecordLifecycleActionHeartbeat",
                    "autoscaling:CompleteLifecycleAction",
                    "autoscaling:DescribeAutoScalingGroups",
                    "autoscaling:PutInstanceInStandby",
                    "autoscaling:PutInstanceInService",
                    "ec2:Describe*"
                ],
                "Effect": "Allow",
                "Resource": "*"
            }
        ]
    }
}

This allows CodeDeploy to access the tags and Auto Scaling groups it needs in order to create applications and deployment configurations.

Next, I created the instance IAM role. It is important to remember that the CodeDeploy service needs access to the autoscaling and EC2 resources listed above, while the instance itself only needs access to the S3 bucket containing the CodeDeploy agent and whatever bucket holds your final compressed file.

{
    "Effect" : "Allow",
    "Action" : [
        "s3:Get*",
        "s3:List*"
    ],
    "Resource" : [
        "arn:aws:s3:::aws-codedeploy-us-east-1/*",
        "arn:aws:s3:::your-bucket/path/*"
    ]
}

Here's a good place to tell you what I did wrong. Being security conscious, I thought I could get away with giving the instance role only GetObject permissions; my existing deployment strategy needs nothing more to pull the file from S3. However, CodeDeploy apparently lists the file and its ACL before downloading, which results in an error without the additional permissions. Lesson learned.
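If you want to sanity-check the role before wiring up CodeDeploy, a quick test from the instance itself helps (a sketch; the bucket, path, and file names are placeholders for your own):

# run on the instance; both commands should succeed using only the instance role
aws s3 ls s3://your-bucket/path/ --region us-east-1
aws s3 cp s3://your-bucket/path/your-app.zip /tmp/your-app.zip --region us-east-1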

The CodeDeploy Agent

The next step was to get the agent installed on the Ubuntu Server instance. Amazon provides its own "Amazon Linux" if you're looking for an officially AWS-supported AMI, but I'm much more familiar with Debian-based distros, so I chose to stick with Ubuntu. When you launch your instance, make sure you either give it a descriptive tag or place it in an auto-scaling group.

Installing the CodeDeploy agent on Ubuntu 12.04 proved to be a bit more difficult than the documented steps for 14.04 suggest (again, not the fault of CodeDeploy, just my own need to use an older version). According to AWS, all you have to do is run:

sudo apt-get update
sudo apt-get install awscli
sudo apt-get install ruby2.0
cd /home/ubuntu
sudo aws s3 cp s3://bucket-name/latest/install . --region region-name
sudo chmod +x ./install
sudo ./install auto


However, if you try that, you'll notice it fails at the third line with:

E: Couldn't find any package by regex 'ruby2.0'

There is a yet-unanswered forum post about this here.

I decided to get Ruby installed another way. After installing it via rvm and rerunning the install script, it failed again, this time with:

"Dependency not satisfiable: ruby2.0"

So, I finally installed Ruby by adding an alternative source from Brightbox as documented here. In case that's ever not available, here were the steps:

sudo apt-get install software-properties-common
sudo apt-add-repository ppa:brightbox/ruby-ng
sudo apt-get update
sudo apt-get install ruby2.0

Finally, I ran the install script yet again and it worked!
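To confirm the agent actually came up after the install script finished, you can check the service status:

sudo service codedeploy-agent status
# should report that the AWS CodeDeploy agent is running (exact wording may vary)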

Preparing the Application

The application I wanted to deploy was a simple Node.js web app. It runs on the server under "forever," a tool that keeps the process running in the background. To prepare it for CodeDeploy, I had to add an appspec.yml file and two scripts: a start and a stop script.

The appspec.yml file looked very simple:

version: 0.0
os: linux
files:
  - source: /
    destination: /usr/local/projects/source
hooks:
  AfterInstall:
    - location: deployment/stop.sh
      runas: root
  ApplicationStart:
    - location: deployment/start.sh
      runas: root

Keep in mind that the YAML file is very particular about spacing; there's an entire section devoted to it in the AWS docs.

Next, I added the start and stop scripts to the deployment directory of the project. Obviously they can be much more complex than this, but I'm trying to keep it relatively simple:

start.sh:

#!/bin/sh
forever start /usr/local/projects/source/server.js --flags --here;

stop.sh:

#!/bin/sh
forever stopall

Like I said, super simple, but it works. The basic premise is that CodeDeploy executes each file you provide during the corresponding lifecycle event, as defined by the appspec file. Besides "AfterInstall" and "ApplicationStart," there are also "ApplicationStop," "BeforeInstall," and "ValidateService." AWS provides explanations here, but keep in mind that "Install" purely means copying files to the right directories. In my example, "AfterInstall" means that CodeDeploy will wait until the files have been copied before stopping the previously running instance.

Once all of this has been done, create a compressed file of your choice (zip, tar, and tar.gz are supported on Linux; zip for Windows). Put the file in the same S3 bucket that you gave your instance permission to read earlier.
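For reference, here's roughly how I bundle and upload from the project root (a sketch; the bucket, key, and region are placeholders - use whatever you granted the instance role access to):

# zip the project so appspec.yml and the deployment scripts sit at the archive root
zip -r node-app.zip . -x "*.git*"
aws s3 cp node-app.zip s3://your-bucket/path/node-app.zip --region us-east-1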

CodeDeploy Console

Within the AWS console, you can now set up your application. To do this, head to the CodeDeploy page and create a new application. Provide a name and a deployment group name. The console doesn't make this clear, but the distinction is that an application can have multiple deployment groups. For example, you could have an app called "node-app" and create a "node-app-a" deployment group, then later create a "node-app-b" group, which would help with A-B style deployments.


In the tags section, enter either the autoscaling group or the tags you created earlier. If everything is successful, you should see the instance count increase.

The next section, Deployment Configuration, allows you to determine how you want your apps deployed. This is not really relevant when you only have one server, but it becomes very helpful when you have several. If you choose "one at a time," AWS will go to each server, attempt to deploy your app, and stop if any server fails along the way. With "all at once" or "half at a time," CodeDeploy deploys in parallel accordingly - a much faster, but also much more dangerous, option.


The service role should be the role created earlier with the necessary permissions. This role can be re-used for every application, as the permissions are the same regardless.

Finally, the application can be created. The next page is a bit confusing because it does not contain any action buttons. Instead, it tells you to use the command line to upload an application. Instead of doing that, head back to the main CodeDeploy page and click on "Deployments."

On this page, select your application from the list, then select the group name, paste the full S3 URL to your source into the box, and select your deployment method. Then, click deploy.



You can then see the results of the deployment.
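The same information is available from the CLI if you'd rather not click around (a sketch; the application and group names match the examples above, and the deployment ID is a placeholder returned by the deploy call):

aws deploy list-deployments --application-name node-app --deployment-group-name node-app-a --region us-east-1
aws deploy get-deployment --deployment-id d-EXAMPLE123 --region us-east-1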


Potential Issues

Besides the ruby dependency issue I mentioned above, I also ran into a very ambiguous error message:

UnknownError: Not Opened for Reading

This message really didn't tell me what was happening. After logging into the instances and checking the /opt/codedeploy-agent/deployment-root directory, I found that all of the downloaded source files contained an S3 XML error response instead of the actual content, which pointed me back to permissions. Be sure to grant the instance role all of the permissions listed above or you may run into the same problem.
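When you hit something this vague, the agent's logs and the deployment root on the instance are the fastest way to see what actually landed on disk (the paths below are the agent defaults on my Ubuntu install):

# inspect what the agent downloaded for each deployment
sudo ls -R /opt/codedeploy-agent/deployment-root/
# and what it logged while doing so
sudo less /var/log/aws/codedeploy-agent/codedeploy-agent.log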

Other Options and Thoughts

Besides deploying from an S3 object, you can also tie into GitHub. While this could work, I prefer to have a 100% working artifact before actually deploying to an instance. I still use Jenkins to pull my changes from GitHub, install dependencies (node modules for my apps), run tests, zip everything up, and put it on S3. Once that's done, I can launch a new deployment from the console, or even have Jenkins use the AWS CLI to launch a deployment pointed at the file it just uploaded. While I could certainly install node modules as part of the pre-install hooks on the instance itself, that is more error-prone and slower as well.
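That last step is a single CLI call (a sketch; the application, group, bucket, and key names are placeholders matching the examples above):

aws deploy create-deployment \
    --application-name node-app \
    --deployment-group-name node-app-a \
    --deployment-config-name CodeDeployDefault.OneAtATime \
    --s3-location bucket=your-bucket,key=path/node-app.zip,bundleType=zip \
    --region us-east-1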

UPDATE: AWS has also informed me that there is an open-source Jenkins CodeDeploy plugin available. I've installed the plugin and it works quite nicely; you can easily specify the application name, deployment group, and deployment policy from within Jenkins. Then, it executes as a build step with the same exit codes as Jenkins. Essentially, you can push to GitHub, copy to Jenkins, run tests, then execute a CodeDeploy deployment all as a result of one push (assuming you have the appropriate webhooks).

Overall, CodeDeploy worked very well once I got it running. I was able to deploy my app multiple times in a row without issues and even tested out the "one at a time" feature with an autoscaling group. Everything worked as expected. While I don't think CodeDeploy will be a complete replacement for a tool like Jenkins or other CI suites, it does remove the last few steps and make them more tightly integrated with AWS. I highly recommend you try CodeDeploy out, but definitely do it in a test environment first until you have the process down to a science.

Wednesday, January 28, 2015

NYC Blizzard 2015

I am going to break with my traditional technology-based posts to share some images I took during the "blizzard" this past Monday. NYC only wound up getting about eight inches of snow - a lot less than expected.

Thursday, January 15, 2015

Using IAM Roles and S3 to Securely Load Application Credentials

Many applications require certain information to be provided to them at run-time in order to utilize additional services. For example, applications that connect to a database require the database connection URL, a port, a username, and a password. Developers frequently utilize environment variables, which are set on the machine on which the application is running, to provide these credentials to the underlying code. Some developers, against all recommendations, will hard-code the credentials into an application, which then gets checked into git, distributed, etc.

Ideally, the credentials required by an application should not be hard-coded at all, or even accessible to processes outside of the one running the application itself. To achieve this, the application must determine what additional credentials it needs and load them prior to starting its main command.

Many applications that run on Amazon's Web Services platform have the added advantage of being hosted on EC2 instances that can assume specific IAM roles. An IAM role is essentially a definition of access rights that are provided to a particular AWS resource (an EC2 instance in this case). AWS takes care of generating temporary credentials for that instance, rotating them, and ensuring they are provided only to the assigned instance. Additionally, the AWS command line tools and various AWS language-specific SDKs will detect that they are being run on an instance using an IAM role and automatically load the necessary credentials.

As developers, we can take advantage of IAM roles to provide access to credentials that are stored in a private S3 bucket. When the application loads, it will use its IAM role to download the credentials and load them into the environment variables of the process. Then, wherever they are needed, they can simply be called by accessing the environment variable that has been defined.

As an example of this setup, here are the steps I would take to run a Node.js web server that requires some database credentials:

1. Create an S3 bucket called "organization-unique-name-credentials".

2. If you plan to have multiple applications, create a new folder for each within the bucket: "organization-unique-name-credentials/web-app," "organization-unique-name-credentials/app-two," etc. Ensure the proper access rights to each for your existing AWS users.

3. Set encryption on the bucket (you can use either AWS' key or your own).

4. Create a file called credentials.json that looks like this:

{
    "DB_URL" : "some-database-connection-string.com",
    "DB_PORT" : 3306,
    "DB_USER" : "app_user",
    "DB_PASS" : "securepass"
}

5. Upload the file to the right S3 bucket and folder (be sure to enable encryption or the upload will fail, assuming you required it for the bucket)

6. Create an IAM role for your instance. In the IAM console, click "Roles," then create a new role, enter a name, select "EC2 Service Role," and give it the following policy (add any other rights the app may need if it accesses other AWS resources):

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::organization-unique-name-credentials/web-app/credentials.json"
      ]
    }
  ]
}

7. Launch your EC2 instance, selecting the role you just created.

8. In your code, do the following (node pseudo-code):

var AWS = require('aws-sdk');
AWS.config.region = 'us-east-1';
var s3 = new AWS.S3();

var params = {
    Bucket: 'organization-unique-name-credentials',
    Key: 'web-app/credentials.json'
};

s3.getObject(params, function(err, data) {
    if (err) {
        console.log(err);
    } else {
        data = JSON.parse(data.Body.toString());
        for (var i in data) {
            console.log('Setting environment variable: ' + i);
            process.env[i] = data[i];
        }

        // Load database via db.conn({user: process.env['DB_USER'], password: process.env['DB_PASS']}); etc...
    }
});

9. Run the app, and you will notice that the environment variables are downloaded from S3 and are set before the database connection is attempted.

If you're using Node.js, I made a module that does exactly this: https://www.npmjs.com/package/secure-credentials

If you're not using Node.js, this technique can be applied to any language that AWS has an SDK for. It isn't 100% hacker-proof (if someone managed to log into your instance as root, he or she could still modify the source code to display the credentials), but combined with other AWS security mechanisms such as security groups, VPCs with proper network ACLs, etc., it can certainly help. Additionally, it keeps credentials out of the source code.

One final note: if you're running this app locally to test, your machine will obviously not have an EC2 IAM role. When testing locally, it's okay to use AWS keys and secrets, but be sure to keep them in a separate file that is excluded with .gitignore.
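One way to handle that is a small local-only file that exports the standard AWS environment variables the SDKs pick up automatically (a sketch; "local-env.sh" is a hypothetical file name and the keys shown are fakes):

# keep the local credentials file out of version control
echo "local-env.sh" >> .gitignore

cat > local-env.sh <<'EOF'
export AWS_ACCESS_KEY_ID=AKIAEXAMPLEKEYID
export AWS_SECRET_ACCESS_KEY=example-secret-access-key
EOF

# load the keys into the shell, then start the app as usual
. ./local-env.sh && node server.js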

Tuesday, January 13, 2015

AWS Cross-Account IAM Roles in CloudFormation

The AWS documentation is relatively sparse when it comes to creating specific IAM role types using CloudFormation. It describes the process of setting up standard roles, attaching roles to instances, etc. but doesn't mention that all of the other role types can also be created using CloudFormation.

For example, when you log into the AWS console and click on "IAM," you see a number of different roles you can create:

AWS Service Roles
Role for Cross-Account Access
Role for Identity Provider Access

However, these role types are merely different adaptations of the same concept. In the following steps, I'll show how to create a Cross-Account Role using CloudFormation.

1. Add the following to the "Resources" section of your CloudFormation template:

"CrossAccountRole" : {
"Type" : "AWS::IAM::Role",
"Properties" : {
"AssumeRolePolicyDocument" : {
"Statement" : [
{
"Effect" : "Allow",
"Principal" : {
"AWS": "arn:aws:iam::ACCOUNT_NUMBER_HERE:root"
},
"Action" : [
"sts:AssumeRole"
]
}
]
}
}
},

2. Add another resource for the policy:

"CrossAccountPolicy" : {
"Type" : "AWS::IAM::Policy",
"Properties" : {
"PolicyName" : "IAMInstancePolicy",
"PolicyDocument" : {
"Statement" : [
{
"Effect" : "Allow",
"Action" : [
"*"
],
"Resource" : [
"*"
]
}
]
},
"Roles" : [
{ "Ref" : "CrossAccountRole" }
]
}
},

3. Adjust the account number and resources as needed:

This policy gives admin access to the account you specify as the principal. To restrict permissions, change the Statement section of the policy document as desired.
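To verify the role from the trusted account, you can try assuming it with the CLI (a sketch; the account number and role name are placeholders - CloudFormation generates the physical role name, which you can find in the stack's resources):

aws sts assume-role \
    --role-arn arn:aws:iam::TARGET_ACCOUNT_NUMBER:role/GENERATED-ROLE-NAME \
    --role-session-name cross-account-test
# returns temporary AccessKeyId, SecretAccessKey, and SessionToken values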

Monday, January 5, 2015

Quickly Find What is Using Disk Space on Linux

Here's a quick command to find out what folder is consuming the most space on Linux. This will also sort the results to show the most space-consuming folders at the bottom:

du -sh * | sort -h

Run this in any directory to find its most space-consuming subdirectories.
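If you don't yet know which top-level directory is the culprit, a variation of the same command works system-wide (sudo avoids permission errors on directories your user can't read):

sudo du -sh /* 2>/dev/null | sort -h | tail -n 10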

Thursday, October 30, 2014

Node.js "Error: too many parameters at queryparse"

If you've been using Express' body-parser and attempting to process large data submissions (i.e., forms with more than 1,000 elements or sub-elements), you may have run into the following error:

Error: too many parameters
    at queryparse (/project/node_modules/body-parser/lib/types/urlencoded.js:120:17)
    at parse (/project/node_modules/body-parser/lib/types/urlencoded.js:64:9)

This is because the urlencoded parser allows only 1,000 parameters by default. If you have a large form or just an abnormally large JSON submission, you'll need to increase this limit by doing the following:

var bodyParser = require('body-parser');

app.use(bodyParser.urlencoded({
    extended: false,
    parameterLimit: 10000,
    limit: 1024 * 1024 * 10
}));
app.use(bodyParser.json({
    extended: false,
    parameterLimit: 10000,
    limit: 1024 * 1024 * 10
}));


This will allow you to provide up to 10,000 parameters (increase as needed) and 10 MB of data (also adjustable).

Tuesday, October 14, 2014

How to Disable SSLv3 on AWS Elastic Load Balancers

In a blog post today, Google announced a vulnerability in SSLv3 that could allow attackers to intercept data previously assumed to be secure. Luckily, only a very small portion of the web (mostly IE6 users on Windows XP) still relies on SSLv3, so it can, for the most part, be safely disabled to mitigate the risk from this issue.

http://googleonlinesecurity.blogspot.com/2014/10/this-poodle-bites-exploiting-ssl-30.html

UPDATE 10/15: As Andrew and Julio point out in the comments below, AWS has since updated their default cipher security policies; steps 5 and 6 below have been updated accordingly.

To modify the ciphers on AWS ELBs, follow these steps:

1) Log into the AWS console and click on "Load Balancers."
2) Find the load balancer that handles your site's traffic (you shouldn't need to worry about internal VPC LBs, etc.)
3) Click the "Listeners" tab
4) Find the HTTPS/443 listener and click "Edit" under the cipher column
5) Change the policy to "ELBSecurityPolicy-2014-10", which disables SSLv3 for you. (Before AWS published this predefined policy, the alternative was to change the option to "Custom" and uncheck the SSLv3 cipher manually.)
6) Save.

This should be sufficient to mitigate this risk with the information that is currently known.
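If you have many load balancers, the same change can be scripted with the AWS CLI (a sketch; "my-elb" and the policy name are placeholders, and this assumes the classic ELB API that the predefined security policies belong to):

# create a new SSL negotiation policy based on the predefined reference policy
aws elb create-load-balancer-policy \
    --load-balancer-name my-elb \
    --policy-name SSLNegotiation-2014-10 \
    --policy-type-name SSLNegotiationPolicyType \
    --policy-attributes AttributeName=Reference-Security-Policy,AttributeValue=ELBSecurityPolicy-2014-10

# attach it to the HTTPS listener on port 443
aws elb set-load-balancer-policies-of-listener \
    --load-balancer-name my-elb \
    --load-balancer-port 443 \
    --policy-names SSLNegotiation-2014-10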