Basic Things About ECS & K8s
AWS ECS, Kubernetes & Their Components
📦 AWS ECS Basic Information
AWS ECS is the Docker-compatible container orchestration solution from Amazon. It allows us to run containerized applications on EC2 instances and to scale both the applications and the underlying instances.
An ECS cluster consists of container instances and the tasks that run on them as Docker containers, among many other components.
AWS Services Commonly Used with ECS:
- Elastic Load Balancer: This component can route traffic to containers. 3 kinds of load balancing are available: application, network, and classic.
- Elastic Block Store: This service provides persistent block storage for ECS tasks (workloads running in containers).
- CloudWatch: This service collects metrics from ECS. Based on CloudWatch metrics, ECS services can be scaled up or down.
- Virtual Private Cloud: An ECS cluster runs within a VPC. A VPC can have one or more subnets.
- CloudTrail: This service can log ECS API calls. Details captured include type of request made to Amazon ECS, source IP address, user details, etc.
- Elastic File System (EFS): An EFS file system can be mounted as a volume across the instances running under an ECS cluster, for example to keep logs in one place.
Note: An Amazon EFS File System can only have mount targets in one VPC at a time.
ECS Components:
ECS, which is provided by Amazon as a service, is composed of multiple built-in components which enable us to create clusters, tasks, and services:
- State Engine: A container environment can consist of many EC2 container instances and containers. With hundreds or thousands of containers, it is necessary to keep track of the availability of instances to serve new requests based on CPU, memory, load balancing, and other characteristics. The state engine is designed to keep track of available hosts, running containers, and other functions of a cluster manager.
- Schedulers: These components use information from the state engine to place containers on the most suitable EC2 container instances. The batch job scheduler is used for tasks that run for a short period of time. The service scheduler is used for long-running apps. It can automatically register new tasks with an ELB.
- Cluster: This is a logical placement boundary for a set of EC2 container instances within an AWS region. A cluster can span multiple availability zones (AZs), and can be scaled up/down dynamically. A dev/test environment may have 2 clusters: 1 each for development and test.
- Tasks: A task is a unit of work. Task definitions, written in JSON, specify containers that should be co-located (on an EC2 container instance). Though tasks usually consist of a single container, they can also contain multiple containers.
- Services: This component specifies how many tasks should be running across a given cluster. You can interact with services using their API, and use the service scheduler for task placement.
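To make the Tasks entry above more concrete, here is a minimal sketch of a task definition built as a Python dictionary and serialized to JSON. The family name, container names, and images are made-up placeholders; the field names (`family`, `containerDefinitions`, `portMappings`, and so on) follow the ECS task definition format, but treat the values as assumptions rather than a recommended setup.

```python
import json

# Hypothetical task definition: names and images are placeholders.
task_definition = {
    "family": "web-app",
    "containerDefinitions": [
        {
            # Main container of the task.
            "name": "web",
            "image": "nginx:latest",
            "memory": 256,
            "essential": True,
            "portMappings": [{"containerPort": 80, "hostPort": 80}],
        },
        {
            # Second container co-located on the same container instance.
            "name": "log-forwarder",
            "image": "example/log-forwarder:latest",
            "memory": 128,
            "essential": False,
        },
    ],
}

# Task definitions are written in JSON, so serialize the dictionary.
task_definition_json = json.dumps(task_definition, indent=2)
print(task_definition_json)
```

Both containers in this sketch belong to one task, so they are always placed together on the same container instance.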
Note: ECS only manages ECS container workloads, which results in vendor lock-in. ECS was designed to run containers on EC2, and its control plane is available only on AWS. Later additions such as AWS Fargate and ECS Anywhere extend where tasks can run, but task definitions and services remain ECS-specific and do not carry over to other clouds such as Google Cloud Platform and Microsoft Azure. The advantage, of course, is the ability to work with all the other AWS services like Elastic Load Balancers, CloudTrail, CloudWatch, etc.
☸️ Kubernetes Basic Information
Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.
Kubernetes was introduced in 2014 as an open-source project that draws on Google's experience with its internal orchestrator, Borg. 2017 saw an increase in Kubernetes adoption by enterprises, and by 2018, it had become widely adopted across diverse businesses, from software developers to airline companies.
One of the reasons why Kubernetes gained popularity so fast is its open-source architecture and an incredible number of manuals, articles, and support provided by its loyal community.
There are a number of components associated with a Kubernetes cluster. The master node places container workloads in user pods on worker nodes or itself.
Kubernetes Components:
- Etcd: This component is a distributed key-value store that holds the cluster's configuration and state data. The Kubernetes master's API Server reads and writes this data through etcd's client API.
- API Server: This component is the management hub for the Kubernetes master node. It facilitates communication between the various components, thereby maintaining cluster health.
- Controller Manager: This component runs the controllers that continuously compare the cluster's current state with the desired state and make changes, such as scaling workloads up or down, until the two match.
- Scheduler: This component places the workload on the appropriate node.
- Kubelet: This component receives pod specifications from the API Server and manages pods running in the host.
- Worker Node(s): A node is a worker machine, physical or virtual, and this is where containers are hosted. Nodes were also known as minions in the past. A node can run multiple PODs depending upon the instance configuration.
- Cluster: A cluster is a set of nodes grouped together. This way, even if one node fails, your application is still accessible from the other nodes. Moreover, having multiple nodes helps in sharing load as well.
- Master Node: The master is another node with Kubernetes installed on it, configured as a Master. The master watches over the nodes in the cluster and is responsible for the actual orchestration of containers on the worker nodes.
- Container Runtime Engine: The container runtime is the underlying software that is used to run containers. In our case, it happens to be Docker.
- Kubelet: Kubelet is the agent that runs on each node in the cluster. The agent is responsible for making sure that the containers are running on the nodes as expected.
- Kube-Proxy: An additional component on the Node is the kube-proxy. It takes care of networking within Kubernetes.
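The idea of a controller comparing desired state with current state, as described in the component list above, can be sketched as a small function. This is a toy model, not how the Controller Manager is actually implemented; the function name `reconcile` and the action strings are assumptions made for illustration.

```python
def reconcile(desired_replicas, running_pods):
    """Return the actions that move the observed state toward the desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods are running: ask for the missing ones.
        return [f"create pod #{i + 1}" for i in range(diff)]
    if diff < 0:
        # Too many pods are running: remove the surplus.
        # Deleting from the end of the list is an arbitrary choice here.
        return [f"delete {name}" for name in running_pods[diff:]]
    # Observed state already matches the desired state.
    return []


print(reconcile(3, ["web-a"]))
print(reconcile(1, ["web-a", "web-b", "web-c"]))
```

A real controller runs this kind of comparison in a loop, reacting every time the observed state changes.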
Containers Within a Pod Can Interact in Various Ways:
- Network: Containers can access any listening ports on containers within the same pod, even if those ports are not exposed outside the pod.
- Shared Storage Volumes: Containers in the same pod can be given the same mounted storage volumes, which allows them to interact with the same files.
- SharedProcessNamespace: Process namespace sharing can be enabled by setting shareProcessNamespace in the pod spec. This allows containers within the pod to interact with, and signal, one another's processes.
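As a rough illustration of the sharing options above, the following sketch builds a pod manifest with two containers that mount the same volume and share a process namespace. The pod name, container names, volume name, and shell commands are placeholders chosen for this example; the manifest is built as a Python dictionary and printed as JSON.

```python
import json

# Hypothetical pod: all names and commands are placeholders.
shared_mount = {"name": "shared-data", "mountPath": "/data"}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "shared-demo"},
    "spec": {
        # Lets the containers see and signal one another's processes.
        "shareProcessNamespace": True,
        # An emptyDir volume lives as long as the pod does.
        "volumes": [{"name": "shared-data", "emptyDir": {}}],
        "containers": [
            {
                "name": "writer",
                "image": "busybox",
                "command": ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"],
                "volumeMounts": [shared_mount],
            },
            {
                "name": "reader",
                "image": "busybox",
                "command": ["sh", "-c", "touch /data/out.log; tail -f /data/out.log"],
                "volumeMounts": [shared_mount],
            },
        ],
    },
}

print(json.dumps(pod, indent=2))
```

Because both containers mount `shared-data` at `/data`, whatever the writer appends to the file is visible to the reader.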
📚 Common Kubernetes Terms
- Pods: Kubernetes deploys and schedules containers in groups called pods. Containers in a pod run on the same node and share resources such as filesystems, kernel namespaces, and an IP address.
- Deployments: These building blocks can be used to create and manage a group of pods. Deployments can be used with a service tier for scaling horizontally or ensuring availability.
- Services: An abstraction layer which provides network access to a dynamic, logical set of pods. These are endpoints that can be addressed by name and can be connected to pods using label selectors. The service automatically load-balances requests across its pods (often described as round-robin, though the exact distribution depends on how kube-proxy is configured). Kubernetes will set up a DNS server for the cluster that watches for new services and allows them to be addressed by name. Services are the "external face" of your container workloads.
- Labels: These are key-value pairs attached to objects. They can be used to search and update multiple objects as a single set.
Example: Think of labels as properties attached to each item, such as its class, kind, and color.
- Selectors: Labels and Selectors are a standard method to group things together.
Example: Let's say you have a set of different species. A user wants to filter them by criteria such as color and type, using labels, and retrieve only the matching items. This is where Selectors come into the picture.
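The species example above can be sketched in a few lines. This toy version only covers equality-based matching (every key in the selector must be present on the object with the same value); the object names and label values are invented for illustration.

```python
def matches(labels, selector):
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the names of all objects whose labels satisfy the selector."""
    return [obj["name"] for obj in objects if matches(obj["labels"], selector)]


# Hypothetical labeled items.
animals = [
    {"name": "parrot", "labels": {"class": "bird", "color": "green"}},
    {"name": "frog", "labels": {"class": "amphibian", "color": "green"}},
    {"name": "crow", "labels": {"class": "bird", "color": "black"}},
]

print(select(animals, {"color": "green"}))
print(select(animals, {"class": "bird", "color": "green"}))
```

Services and Deployments use the same idea to find the pods that belong to them.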
🔑 Important Terminologies in Kubernetes
Ingress
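Ingress is actually NOT a type of service. Instead, it sits in front of multiple services and acts as a smart router or entry point into your cluster. It is a separate networking resource and is not part of Network Policies, which control which traffic pods are allowed to send and receive.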
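To show what "sitting in front of multiple services" can look like, here is a sketch of an Ingress manifest that routes two URL paths to two services. The host, paths, and service names are placeholders; the helper `routes` is a hypothetical function that just flattens the manifest into a routing table for inspection.

```python
# Hypothetical Ingress: host, paths, and service names are placeholders.
def path_rule(path, service_name, port=80):
    return {
        "path": path,
        "pathType": "Prefix",
        "backend": {"service": {"name": service_name, "port": {"number": port}}},
    }


ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": "example-ingress"},
    "spec": {
        "rules": [
            {
                "host": "example.com",
                "http": {
                    "paths": [
                        path_rule("/shop", "shop-service"),
                        path_rule("/blog", "blog-service"),
                    ]
                },
            }
        ]
    },
}


def routes(manifest):
    """Flatten an Ingress manifest into a {host + path: service name} table."""
    table = {}
    for rule in manifest["spec"]["rules"]:
        for entry in rule["http"]["paths"]:
            table[rule["host"] + entry["path"]] = entry["backend"]["service"]["name"]
    return table


print(routes(ingress))
```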
ReplicaSets
ReplicaSets: A ReplicaSet is one of the Kubernetes controllers. It makes sure that a specified number of pod replicas is running. (A controller in Kubernetes is what takes care of tasks to make sure the observed state of the cluster matches the desired state).
Secrets
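Secrets: Secrets are used to store sensitive information, like passwords or keys. They are similar to ConfigMaps, except that their values are stored base64-encoded. Note that base64 is an encoding, not hashing or encryption, so anyone who can read the Secret can decode it.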
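A quick way to see how Secret values are stored is to encode one by hand. The sketch below base64-encodes a made-up password, places it in the `data` field of a Secret manifest, and decodes it again. The secret name and password are placeholders.

```python
import base64

# Placeholder value, not a real credential.
password = "s3cr3t-pass"

# Values under "data" in a Secret are base64-encoded strings.
encoded = base64.b64encode(password.encode("utf-8")).decode("ascii")

secret = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "db-credentials"},
    "type": "Opaque",
    "data": {"password": encoded},
}

# Decoding needs no key, which shows that base64 is reversible encoding only.
decoded = base64.b64decode(secret["data"]["password"]).decode("utf-8")
print(encoded)
print(decoded)
```

Since decoding requires no key, access to Secrets should be restricted rather than relying on the encoding for protection.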
StatefulSets
StatefulSets (different ways to deploy your application): StatefulSet is also a Controller, but unlike Deployments, it doesn't create a ReplicaSet; rather, it creates the Pods itself with a unique naming convention. For example, if you create a StatefulSet with name "counter", it will create a pod with name "counter-0", and for multiple replicas of a StatefulSet, the names will increment like counter-0, counter-1, counter-2, etc. Every replica of a StatefulSet has its own state, and each pod gets its own PVC (Persistent Volume Claim) when the StatefulSet defines a volume claim template. So a StatefulSet with 3 replicas will create 3 pods, each having its own Volume, for a total of 3 PVCs.
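The naming convention described above can be sketched as a small helper. It assumes the usual pattern where a PVC is named after the volume claim template, the StatefulSet, and the pod's ordinal; the template name `data` is a placeholder chosen for this example.

```python
def statefulset_names(name, replicas, claim_template="data"):
    """Return (pod name, PVC name) pairs for each replica of a StatefulSet."""
    return [
        (f"{name}-{ordinal}", f"{claim_template}-{name}-{ordinal}")
        for ordinal in range(replicas)
    ]


for pod_name, pvc_name in statefulset_names("counter", 3):
    print(pod_name, "uses", pvc_name)
```

The stable ordinal is what lets a restarted pod reattach to the same claim it used before.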
Services
Services: A service creates an abstraction layer on top of a set of Replica PODs. You can access the Service rather than accessing the PODs directly, so as PODs come and go, you get uninterrupted, dynamic access to whatever replicas are up at that time.
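Service Types:
- ClusterIP: Service is exposed only inside the cluster on an internal IP address. This is the default type.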
- NodePort: Service is exposed externally on a listening port on each node in the cluster.
- LoadBalancer: Service is exposed via a load balancer created on a cloud platform.
Note: The cluster must be set to work with a cloud provider in order to use this option.
- ExternalName: Service does not proxy traffic to pods, but simply provides DNS lookup for an external address.
This allows components within the cluster to look up external resources in the same way they look up internal ones: through services.
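As an illustration of a service type in practice, here is a sketch of a NodePort Service manifest. The service name, label, and port numbers are placeholders; the only hard constraint assumed here is that the node port falls in the default NodePort range of 30000 to 32767.

```python
import json

# Hypothetical NodePort service: names and ports are placeholders.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web-service"},
    "spec": {
        "type": "NodePort",
        # Pods carrying this label receive the traffic.
        "selector": {"app": "web"},
        "ports": [
            {
                "port": 80,         # port of the service inside the cluster
                "targetPort": 8080, # port the container listens on
                "nodePort": 30080,  # port opened on every node
            }
        ],
    },
}

print(json.dumps(service, indent=2))
```

Changing `type` to `LoadBalancer` keeps the same selector and ports but asks the cloud provider for an external load balancer.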
ConfigMaps
ConfigMaps: ConfigMaps are used to pass configuration data to applications in the form of key-value pairs, stored in plain text. When a POD is created, the ConfigMap can be injected into the POD so that the key-value pairs are available as environment variables (or as files in a mounted volume) for the application hosted inside the container in the POD.
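The injection step described above can be sketched with two manifests: a ConfigMap and a pod that pulls all of its keys in as environment variables through `envFrom`. The names, keys, and image are placeholders, and `resolved_env` is a hypothetical helper that only mimics what the container would end up seeing.

```python
# Hypothetical ConfigMap and pod: names, keys, and image are placeholders.
config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "app-config"},
    "data": {"APP_COLOR": "blue", "APP_MODE": "prod"},
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "example/app:latest",
                # Import every key of the ConfigMap as an environment variable.
                "envFrom": [{"configMapRef": {"name": "app-config"}}],
            }
        ]
    },
}


def resolved_env(container, config_maps):
    """Mimic the environment a container gets from its envFrom references."""
    env = {}
    for source in container.get("envFrom", []):
        ref_name = source["configMapRef"]["name"]
        env.update(config_maps[ref_name]["data"])
    return env


maps_by_name = {config_map["metadata"]["name"]: config_map}
print(resolved_env(pod["spec"]["containers"][0], maps_by_name))
```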
Deployments
Deployments (different ways to deploy your application): Deployment is the easiest and most used resource for deploying your application. It is a Kubernetes controller that matches the current state of your cluster to the desired state mentioned in the Deployment manifest. For example, if you create a deployment with 1 replica, it will check that the desired state of ReplicaSet is 1 and current state is 0, so it will create a ReplicaSet, which will further create the pod. If you create a deployment with name "counter", it will create a ReplicaSet with name "counter-[random-string]".
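A Deployment manifest for the "counter" example above might look like the sketch below. The label and image are placeholders. The point to notice is that the selector must match the labels in the pod template, since that is how the Deployment's ReplicaSet finds its pods.

```python
import json

# Placeholder label shared by the selector and the pod template.
labels = {"app": "counter"}

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "counter"},
    "spec": {
        # Desired state: one running replica.
        "replicas": 1,
        "selector": {"matchLabels": labels},
        "template": {
            "metadata": {"labels": labels},
            "spec": {
                "containers": [
                    {"name": "counter", "image": "example/counter:latest"}
                ]
            },
        },
    },
}

print(json.dumps(deployment, indent=2))
```

Raising `replicas` and re-applying the manifest is enough to scale out; the controller creates the missing pods.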
Containers
Containers: In our case, it's a Docker image which will run as a container.
Persistent Volumes
Persistent Volumes: A Persistent Volume is a piece of storage in a cluster-wide pool, configured for use by users deploying applications on the cluster. Users select storage from this pool using Persistent Volume Claims. Kubernetes persistent volumes exist outside of the pod lifecycle, which means that the volume remains even after the pod is deleted.
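The claim mechanism described above can be sketched with a Persistent Volume Claim and a pod that mounts it. The names, requested size, image, and mount path are placeholders chosen for illustration.

```python
# Hypothetical claim and pod: names, size, and paths are placeholders.
claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},
    "spec": {
        # Mountable read-write by a single node.
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app"},
    "spec": {
        "volumes": [
            {
                "name": "data",
                # Refers to the claim by name; the claim is bound to a volume.
                "persistentVolumeClaim": {"claimName": "app-data"},
            }
        ],
        "containers": [
            {
                "name": "app",
                "image": "example/app:latest",
                "volumeMounts": [{"name": "data", "mountPath": "/var/lib/app"}],
            }
        ],
    },
}

print(pod["spec"]["volumes"][0]["persistentVolumeClaim"]["claimName"])
```

Deleting the pod leaves the claim and its bound volume in place, so a replacement pod can mount the same data.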
Taints & Tolerations
Taints & Tolerations: Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints. Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints.
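The matching rule between taints and tolerations can be sketched as a toy check. This is a simplified model, not the scheduler's actual code: it treats only `NoSchedule` and `NoExecute` taints as blocking, and it supports the `Equal` and `Exists` operators. The taint and toleration values are invented for the example.

```python
def tolerates(toleration, taint):
    """True when a single toleration matches a single taint."""
    # An empty effect on the toleration matches any effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # Exists with no key matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("value") == taint.get("value")
    )


def can_schedule(pod_tolerations, node_taints):
    """True when every blocking taint on the node is tolerated by the pod."""
    blocking = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations) for taint in blocking
    )


gpu_taint = {"key": "gpu", "value": "true", "effect": "NoSchedule"}
gpu_toleration = {"key": "gpu", "operator": "Equal", "value": "true", "effect": "NoSchedule"}

print(can_schedule([], [gpu_taint]))
print(can_schedule([gpu_toleration], [gpu_taint]))
```

Note that a tolerating pod may still land on an untainted node, which is the "allow but do not require" behavior mentioned above.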
🌐 Network Policies
Solutions that Support Network Policies:
- Calico
- Romana
- Weave
- Cilium
- Kube-router
Solutions that Do Not Support Network Policies:
- Flannel
🔄 Common Design Patterns for Multi-Container Pods
1. Ambassador Pattern
An ambassador container, such as HAProxy, receives network traffic and forwards it to the main container.
Example: An ambassador container listens on a custom port and forwards the traffic to the main container's hard-coded port. A concrete example would be a ConfigMap storing the HAProxy config: HAProxy listens on port 80 and forwards the traffic to the main container, which is hard-coded to listen on a different, fixed port.
2. Sidecar Pattern
A sidecar container enhances the main container in some way, adding functionality to it.
Example: A sidecar periodically syncs files in a webserver container's file system from a Git repository.
3. Adapter Pattern
An adapter container transforms the output of the main container.
Example: An adapter container reads log output from the main container and transforms it.
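The adapter example above can be sketched as a small transformation function. It assumes, purely for illustration, that the main container writes lines of the form `timestamp level message`, and it converts each line to JSON. The log format and field names are assumptions, not something a particular logging tool requires.

```python
import json


def adapt(line):
    """Convert a 'timestamp level message' log line into a JSON string."""
    # Split only twice so that spaces inside the message are preserved.
    timestamp, level, message = line.strip().split(" ", 2)
    return json.dumps({"timestamp": timestamp, "level": level, "message": message})


# Hypothetical log lines produced by the main container.
raw_lines = [
    "2024-01-01T00:00:00Z INFO service started",
    "2024-01-01T00:00:05Z ERROR disk is full",
]

for raw in raw_lines:
    print(adapt(raw))
```

In a pod, the adapter container would read these lines from a shared volume and expose the transformed output to whatever system consumes it.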