This Guidance demonstrates how to provision Amazon Elastic Kubernetes Service (Amazon EKS) clusters together with the operational software they require, making it easier for customers to adopt Amazon EKS. Use this Guidance with the EKS Blueprints for Terraform Infrastructure as Code (IaC) to quickly deploy and operate containerized workloads using a GitOps approach.

Architecture Diagram

  • Main Architecture
  • New Amazon VPC (add-on to the main architecture)
  • Argo CD Add-on (add-on to the main architecture)
  • Apache Spark Add-on (add-on to the main architecture)

Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • This Guidance uses services that allow full visibility through monitoring and logging, providing businesses with reliable, stable, and dependable applications. Administrator/DevOps team users receive alerts from metrics defined in this Guidance to monitor the health of the workloads and minimize the impact from incidents. 

    The Implement Feedback Loops document describes how feedback loops are set up and how they work. 

    In the unlikely but possible event of a complete AWS Region outage, the Guidance can be redeployed to another AWS Region with minor changes to the Terraform module configuration.

    Currently, Guidance deployment and configuration change workflows are initiated manually. In the future, they can be automated to run on GitHub repository push events whenever the IaC code is updated.

    Read the Operational Excellence whitepaper 
  • The principle of least privilege is applied throughout this Guidance, limiting each resource’s access to only what is required. All resources in this Guidance, including data storage volumes, are deployed into private subnets. The EKS cluster is deployed in a separate VPC and can be accessed only through designated, protected endpoints fronted by load balancers. External access to the EKS cluster API endpoint is restricted to HTTPS traffic, and SSL certificates are attached to the ingress controllers. Pod security and network policies allow or disallow specific traffic between pods or services running in different Kubernetes namespaces. The Amazon EKS managed control plane is an important resource to protect and is secured in its own VPC. Because core system and application data is stored on Amazon EBS, its security is assured by the overall infrastructure security on AWS as well as by the EKS cluster security measures outlined in the best practices guides.

    EKS Blueprints generate IAM Roles for Service Accounts (IRSA) and policies on the EKS cluster. AWS IAM Authenticator for Kubernetes uses a webhook to validate caller identities. When the aws-auth ConfigMap is deployed, it enables the AWS IAM Authenticator to validate the IAM identity (role) that is mapped to service accounts, providing fine-grained access control to cloud native applications.

    As with any Kubernetes-based solution, it is highly recommended to apply AWS-recommended security patches to EKS clusters.

    Read the Security whitepaper 
  • To support reliable workloads, application-level add-ons are deployed by this Guidance in the EKS cluster. Kubernetes microservices provide the advantages of loosely coupled dependencies, assurance of the required number of replicas, and service-level internal and ingress-based external load balancing.

    Because private subnets are spread across different Availability Zones, if a compute node in one Availability Zone fails, the Auto Scaling group attached to the Amazon EKS cluster spins up a replacement instance in another healthy Availability Zone.

    Because component logs and metrics are key resources for troubleshooting the Guidance, Kubernetes (also called K8s) authenticator and scheduler logs from the control plane are collected in Amazon CloudWatch. Additional log event integration with external systems is available through a Fluent Bit add-on. Metric-based monitoring and alerting are available through the Prometheus add-on. If any failure events occur, alerts are delivered to administrators and/or the DevOps team through various notification channels so that issues do not go undetected.

    Read the Reliability whitepaper 
  • Amazon EKS is a native AWS service. This Guidance focuses on cost-efficient ways to deploy and configure it with selected resources so that users can achieve a reliable Kubernetes application platform with high availability and low operational costs. Optimization can be based on CPU and memory usage metrics as well as network traffic, input/output operations per second (IOPS), and other metrics. With this Guidance, users can automatically provision EKS clusters in minimal time, with customizable resource parameters adjusted for optimal performance.

    The Amazon EKS architecture spans multiple Availability Zones to achieve high availability. While some traffic will flow between subnets deployed in different Availability Zones, its latency should not have a significant impact on performance.

    Read the Performance Efficiency whitepaper 
  • Automation and scalability are cost-saving features that this Guidance provides through Terraform and node Auto Scaling groups. Centralized administration is available through the AWS Management Console, the AWS Command Line Interface (AWS CLI), and the Argo CD GitOps add-on. These features allow for early detection and correction of defects in the design process, which reduces the total cost of development effort and the risk of schedule overruns.

    The autoscaling minimum, maximum, and desired number of compute nodes are highly configurable, as are their Amazon Elastic Compute Cloud (Amazon EC2) parameters, so resources are managed efficiently. Related metrics and alerts give administrators/DevOps teams insight into AWS resource utilization.

    A significant factor in data transfer costs within EKS Kubernetes clusters is calls to Kubernetes services from external clients that pass through Application Load Balancers. The data transfer costs of those calls correspond to communication between pods running in different AWS Availability Zones.

    Read the Cost Optimization whitepaper 
  • This Guidance provisions and deploys workloads on an Amazon EKS cluster in the AWS Cloud, so there is no need to procure any physical hardware. Capacity providers and Auto Scaling groups keep virtual “hardware” provisioning to a minimum and adjust it only as much as scaling events require when workloads demand it.

    Every pod running on a Kubernetes platform, including the EKS cluster, consumes memory, CPU, I/O, and other resources. With performance-driven autoscaling enabled at the platform level (via the Cluster Autoscaler) and the application level (via the Horizontal Pod Autoscaler, HPA), resource utilization and consumption adjust automatically in both directions. EKS cluster administrators/DevOps teams can monitor resource utilization through metrics and events and perform direct configuration updates where needed.

    Data is accessed through Kubernetes services that are exposed with secure endpoints. Using Amazon EBS, CSI drivers, and related storage classes ensures loosely coupled, scalable, and efficient data access patterns.

    Read the Sustainability whitepaper 
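
The Cost Optimization and Sustainability pillars above both depend on configurable minimum, maximum, and desired node counts. As a rough illustration only, the Python sketch below shows how such bounds constrain a requested capacity. The names `NodeGroupBounds` and `clamp_desired` are hypothetical and are not part of EKS Blueprints, Terraform, or any AWS API; the actual scaling logic lives in the Auto Scaling group and the cluster autoscaler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroupBounds:
    """Hypothetical container for a node group's scaling limits."""

    min_size: int
    max_size: int

    def __post_init__(self) -> None:
        # Reject limits that could never be satisfied.
        if self.min_size < 0 or self.max_size < self.min_size:
            raise ValueError("require 0 <= min_size <= max_size")


def clamp_desired(requested: int, bounds: NodeGroupBounds) -> int:
    """Return the node count closest to `requested` that stays within bounds."""
    return max(bounds.min_size, min(requested, bounds.max_size))


if __name__ == "__main__":
    # Illustrative values only; choose limits that fit your workloads.
    bounds = NodeGroupBounds(min_size=2, max_size=6)
    for requested in (0, 4, 10):
        print(requested, "->", clamp_desired(requested, bounds))
```

A request below the minimum is raised to the minimum and a request above the maximum is capped, which is why a sensible maximum acts as a cost ceiling while the minimum preserves availability.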

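The Security pillar above mentions IAM Roles for Service Accounts (IRSA). As a hedged sketch, the Python below builds the general shape of an IAM trust policy that allows a single Kubernetes service account to assume a role through the cluster's OIDC provider. The function name `irsa_trust_policy` is hypothetical, and the account ID, OIDC provider path, namespace, and service account name are placeholders; whether EKS Blueprints emits exactly this document is an assumption.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy scoped to one Kubernetes service account.

    `oidc_provider` is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Only tokens issued for AWS STS are accepted.
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                        # Only this namespace/service account may assume the role.
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    policy = irsa_trust_policy(
        account_id="111122223333",
        oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
        namespace="default",
        service_account="my-app",
    )
    print(json.dumps(policy, indent=2))
```

Scoping the `sub` condition to one namespace and service account is what gives the fine-grained, least-privilege access the pillar describes: other pods in the cluster cannot assume the role.
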
Implementation Resources

A detailed guide is provided for you to experiment with and use within your AWS account. Each stage of building the Guidance, including deployment, usage, and cleanup, is examined to prepare it for deployment.

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.