Docker containers have now become mainstream. Containers provide advantages, such as consistency in how applications are built, that make them attractive for bundling and deploying web applications in the cloud. Such has been the momentum around containers that it is being proclaimed that containers and Container-as-a-Service (CaaS) offerings will supersede Platform-as-a-Service (PaaS) systems.
We decided to evaluate this claim for ourselves through experimentation and analysis of AWS Elastic Beanstalk (a PaaS system) and Amazon EC2 Container Service (ECS – a CaaS system). Read on for the details of the experiment and our learnings from it, including the answer to the question – Is it time to replace PaaS with containers and CaaS?
AWS Elastic Beanstalk is Amazon’s Platform-as-a-Service offering. Its promise is that application developers need only write application code and Beanstalk will take care of provisioning the infrastructure resources required to run the application. It was released in 2011 and is available in all AWS regions.
AWS ECS is Amazon’s Container-as-a-Service offering. Its promise is that application developers need only provide the application in the form of Docker containers and it will take care of managing the infrastructure resources required to run such containers. ECS was released in 2014. Unlike Elastic Beanstalk, ECS is not yet available in the following regions: Sao Paulo, GovCloud, Seoul, and Mumbai (as of July 10, 2017).
Documentation of both services is quite comprehensive. Both are free to use; you pay only for the AWS resources (such as EC2 instances) that get provisioned for running your application.
For the experiment we used two Python Flask web applications – one stateless and one stateful. The stateful application required a MySQL database for managing application state. For the database, we wanted to use Amazon’s native database service (Relational Database Service – RDS) with the MySQL engine. The stateful application was written with a standard Python MySQL library (mysql-connector) to interact with the database. We used the us-west-2 AWS region for the experiment.
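As a rough illustration of how such a stateful application can stay independent of where its database runs, the sketch below builds the connection arguments for mysql-connector from environment variables. The variable names are an assumption: they mirror the RDS_* variables that Elastic Beanstalk sets for an RDS instance attached to an environment, and on ECS you would have to define them yourself in the task definition. The helper name db_config_from_env is hypothetical, not taken from the applications used in the experiment.

```python
import os

# Assumed variable names, mirroring the RDS_* variables Elastic Beanstalk
# provides for an attached RDS instance. Mapped to mysql-connector argument names.
ENV_TO_CONNECT_ARG = {
    "RDS_HOSTNAME": "host",
    "RDS_PORT": "port",
    "RDS_USERNAME": "user",
    "RDS_PASSWORD": "password",
    "RDS_DB_NAME": "database",
}


def db_config_from_env(env=None):
    """Build keyword arguments for mysql.connector.connect() from the environment."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_CONNECT_ARG if name not in env]
    if missing:
        raise RuntimeError("missing database settings: " + ", ".join(sorted(missing)))
    config = {arg: env[name] for name, arg in ENV_TO_CONNECT_ARG.items()}
    config["port"] = int(config["port"])  # environment values are always strings
    return config


# In the application the result would be used roughly like this
# (needs the mysql-connector package and a reachable database):
#   import mysql.connector
#   connection = mysql.connector.connect(**db_config_from_env())
```

Keeping the lookup in one small function means the same container image can run against a local MySQL server, a MySQL container, or RDS without code changes.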
Elastic Beanstalk has support for deploying Docker-based applications. We decided to use the simplest Beanstalk Docker configuration that would make for a fair comparison. This criterion led us to choose the Preconfigured Docker platform configuration (Python 3.4 with uWSGI 2 (Docker) version 2.7.0) for our Beanstalk experiments. In ECS we used a cluster consisting of a single EC2 instance. We ran the experiment with each application deployed separately, primarily to keep the application port configuration on the ECS cluster as simple as possible: this allowed us to use static port mapping instead of dynamic port mapping for containers on the ECS cluster. We used AWS EC2 Container Registry (ECR) for storing application container images. To access the required AWS services we used a combination of the AWS CLI, Elastic Beanstalk CLI, ECS CLI, and the AWS Python SDK (boto3).
We conducted the experiment on a Lenovo ThinkPad with an Intel Core i5-3320M CPU @ 2.60GHz x 4, 16 GB of memory, and a standard ISP Internet connection. The OS on the machine was Ubuntu 14.04. The Docker version that we used was 1.6.2, build 7c8fca2. This version, while old, has been stable for our needs. That said, we also ran a few of the Elastic Beanstalk experiments on another machine (an Apple laptop) with Docker version 17.03.1-ce, build c6d412e, which was the latest version at that time. We did not see any differences in the final results between these two Docker versions.
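To make the static versus dynamic port mapping distinction concrete, here is a minimal sketch of the container definition entry that an ECS task definition carries. The image URI, container name, memory value, and ports are hypothetical placeholders, not the values used in the experiment. The port 5000 is only the Flask development default.

```python
def container_definition(name, image, container_port, host_port):
    """Return one entry for the containerDefinitions list of an ECS task definition."""
    return {
        "name": name,
        "image": image,
        "memory": 128,  # MiB, arbitrary value for illustration
        "essential": True,
        "portMappings": [
            {
                "containerPort": container_port,
                "hostPort": host_port,
                "protocol": "tcp",
            }
        ],
    }


# Hypothetical ECR image URI (account id and repository name are made up).
IMAGE = "123456789012.dkr.ecr.us-west-2.amazonaws.com/stateless-app:latest"

# Static mapping: the container is always reachable on port 80 of the EC2
# instance, so only one such container can run per instance.
static = container_definition("web", IMAGE, 5000, 80)

# Dynamic mapping: a hostPort of 0 lets ECS pick a free ephemeral port on the
# instance, which in practice calls for a load balancer to route traffic.
dynamic = container_definition("web", IMAGE, 5000, 0)

# Either dict could then be passed to boto3, for example:
#   boto3.client("ecs").register_task_definition(
#       family="stateless-app", containerDefinitions=[static])
```

With one application per single-instance cluster, the static form is sufficient, which is why it keeps the setup simple.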
The experiment was driven by the goal of answering the following questions:
- Apart from the two services (Elastic Beanstalk and ECS), knowledge of which other AWS services is required to deploy web applications on AWS? For such services, what is the extent of the knowledge required?
- What kind of support is available in each service for local testing of an application before deployment to AWS?
- From an application deployment perspective, what is the level of deployment consistency offered by each service?
- Knowledge of other AWS services: For a stateless application, no knowledge of other AWS services is required when using Elastic Beanstalk. With ECS, on the other hand, you may also need to know about AWS Elastic Load Balancer (ELB) and AWS Identity and Access Management (IAM). When a web application is deployed on ECS, the URL of the application is essentially the IP address or DNS name of the EC2 instance in the cluster on which the container is running. If you want to run multiple containers, you cannot use the EC2 instance’s IP address or DNS name as the application URL. You have to provision an AWS ELB instance (either v1 or v2) and set it up with an appropriate target group that routes traffic to the containers. You also have to grant the ECS service an appropriate IAM role that allows it to update the load balancer with the addresses of running containers. The application URL in this case is the DNS name of the load balancer. For a stateful application, deeper knowledge of various AWS services is required when using ECS than when using Beanstalk. With Elastic Beanstalk, all you have to do is provide a command-line flag to create an RDS instance at the time of Beanstalk environment creation. This flag directs Beanstalk not only to create an RDS instance, but also to set it up so that it can be accessed securely from applications in that environment. ECS has no built-in support for deploying RDS, so you need to know how to provision an RDS instance and how to make application containers connect to it securely. This requires understanding additional AWS services and concepts such as AWS Virtual Private Cloud (VPC), security groups, subnets, and CIDR blocks. Essentially, you have to provision the RDS instance in the same VPC as the ECS cluster, and you need to set up appropriate security groups for RDS that allow traffic from the CIDR block of the VPC.
Not only that, if your application code uses the AWS SDK to access RDS, then you need to grant appropriate IAM roles to the application tasks (running containers) in ECS so that they can access RDS. The same is true for any other service that your application accesses through the AWS SDK, such as S3 or DynamoDB.
- Support for local testing: One of the key things in developing web applications is a fast testing loop. Application developers want to make changes to the code and test them locally on their own machine. Applications targeting either Elastic Beanstalk or ECS can be tested locally, but the two services differ in ease of use. The Beanstalk CLI provides a specialized command (eb local) that builds and runs application containers locally. This feature only works for applications that are built as containers (that is, applications for which a Dockerfile is defined in the application directory and which will be deployed using one of the Docker platform configurations on Beanstalk); it is not available for non-containerized applications in Elastic Beanstalk. ECS offers no local testing command of its own, but you can use docker commands directly (docker build, docker run) to test the application locally. The stateful application needs a MySQL database. For local testing there are three options for provisioning one: run a MySQL server locally, run a MySQL container (provisioned using docker-compose), or provision an RDS instance and make it accessible to the local application. All three options exist with both Beanstalk and ECS. You need to provision the MySQL instance first and then pass its connection parameters through environment variables in the Dockerfile.
- Deployment consistency: According to the DevOps principles of Continuous Integration/Continuous Delivery, it is important that the version of the application that is eventually deployed is the same one that was tested in a continuous integration system such as Jenkins. Elastic Beanstalk and ECS differ significantly with respect to deploying pre-built application artifacts. ECS supports this requirement by essentially not doing anything with regard to building application containers. The container needs to be built outside ECS; a CI workflow is a perfect place to build and test such a container and then push it to ECR. From an audit tracking perspective this is very attractive – we know that the deployed container image is exactly the same as the one that was built and tested in a controlled environment. With Elastic Beanstalk the situation is not that simple. Only the Multicontainer Docker platform configuration supports custom Docker images, similar to the ECS workflow. In the other Docker platform configurations (Preconfigured Docker and Single container Docker), support for custom images is limited to base images from which the application container is built, and Beanstalk still needs to build the final application container image. Thus, when using these platform configurations it is not possible to be confident that the application that eventually runs in Beanstalk is constructed from the code that was tested in your CI system.
- Performance: The performance of the two services is comparable. With ECS, one would expect subsequent deployments of a given application to be faster than the first deployment because of locally cached container images. But we did not find that to be the case. Caching does help speed up container build time, but we observed that application deployment is dominated by the time taken to push the container image to ECR, which did not seem to change much between the first and subsequent deployments. Provisioning an RDS instance takes the same amount of time whether it is done directly or triggered through Elastic Beanstalk environment creation.
- CLI and SDK completeness: For ECS, the features available in the SDK seem to be incomplete compared to those available in the CLI. For instance, the ECS CLI has a command “ecs-cli up” which creates the ECS cluster and adds a specified number of EC2 instances to it. The corresponding boto3 API call, “create_cluster”, just creates the cluster without adding any EC2 instances to it. There is another call, “register_container_instance”, that supposedly registers an EC2 instance with a cluster. However, this API is marked as internal, so it is not clear how to create a cluster with a specified number of EC2 instances using the SDK.
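On the last point, one route that may close the gap is to launch the EC2 instances separately and let the ECS agent on each instance join the cluster at boot. This is offered as an untested sketch, not as something the experiment covered. It assumes an ECS-optimized AMI, whose agent reads the cluster name from /etc/ecs/ecs.config, and an instance profile that allows the agent to register with ECS. The AMI id, profile name, instance type, and helper names below are hypothetical.

```python
def ecs_instance_user_data(cluster_name):
    """Boot script telling the ECS agent on the instance which cluster to join."""
    return "#!/bin/bash\necho ECS_CLUSTER={} >> /etc/ecs/ecs.config\n".format(cluster_name)


def run_instances_request(cluster_name, ami_id, instance_profile, count,
                          instance_type="t2.micro"):
    """Build keyword arguments for the boto3 EC2 client's run_instances call."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": ami_id,  # assumed to be an ECS-optimized AMI
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        "IamInstanceProfile": {"Name": instance_profile},
        "UserData": ecs_instance_user_data(cluster_name),
    }


# Usage, not executed here because it needs AWS credentials and creates
# billable resources:
#   import boto3
#   boto3.client("ecs").create_cluster(clusterName="demo-cluster")
#   request = run_instances_request(
#       "demo-cluster", "<ecs-optimized-ami-id>", "ecsInstanceRole", 2)
#   boto3.client("ec2").run_instances(**request)
```

Even if this works as expected, it supports the observation above: the SDK path requires knowledge of EC2, AMIs, and IAM that “ecs-cli up” hides.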
Through this experiment we learnt that the two systems excel in different aspects. Elastic Beanstalk excels at abstracting most of the infrastructure resources required by an application. This is not true of ECS: application developers need to know about other AWS services besides ECS, such as ELB, IAM, RDS, and VPC. On the other hand, ECS’s main focus is running externally built application containers. With Elastic Beanstalk this is possible only in the Multicontainer Docker platform configuration, and given the other Docker platform configurations available in Beanstalk, it can get confusing to figure out which configuration is appropriate for your needs.
Containers will continue to be attractive to application developers and operations teams due to their build-once, run-anywhere characteristic. But containers on their own do not address all the aspects that application developers need to consider when deploying their applications to the cloud. It is prudent to keep in mind that containers and CaaS systems are still at the infrastructure layer, and that deep infrastructure-level understanding is required to correctly deploy and configure containerized applications in the cloud when using such systems.