The Ultimate AWS ECS Guide: From Setup to CI/CD Pipeline

🚀 AWS ECS Infrastructure

Complete Tutorial Guide – From Beginner to Advanced

📚 AWS ECS Tutorial Overview

This section covers the concepts, services, and components you’ll be working with. Read through this section first to understand what each resource does before you start building.

Best for: Beginners who want to understand AWS ECS infrastructure before implementation

Architecture Overview

🌐 Internet
  → Application Load Balancer – public subnets (AZ-A, AZ-B)
    → Frontend ECS Service – private subnets (port 80)
    → Backend ECS Service – private subnets (port 3001)
      → RDS PostgreSQL – private subnets (port 5432)

Traffic Flow

  • Inbound: Internet → ALB → ECS Tasks (Frontend/Backend)
  • Outbound: ECS Tasks → NAT Gateway → Internet (for pulling images, API calls)
  • Internal: ECS Tasks → RDS Database (via Security Groups)

Screenshot: Frontend application (example UI)

1. Virtual Private Cloud (VPC)

Network Foundation

Diagrams: VPC overview, subnets (public/private), route tables, NAT Gateway

📖 Understanding VPC

A Virtual Private Cloud (VPC) is your own isolated network environment within AWS. Think of it as your private data center in the cloud where you have complete control over:

  • IP Address Ranges: Define your own CIDR blocks (e.g., 10.0.0.0/16)
  • Subnets: Divide your network into public (internet-facing) and private (internal) subnets
  • Routing: Control how traffic flows between subnets and to the internet
  • Security: Implement network-level security with security groups and network ACLs

Without a VPC, your resources would be exposed to the public internet with no network isolation. The VPC keeps your ECS tasks, RDS database, and load balancer in a protected network, only accessible through controlled entry points.

What Gets Created:

  • VPC with DNS hostnames enabled
  • Internet Gateway (for public subnets)
  • Public Subnets (for ALB and NAT Gateways)
  • Private Subnets (for ECS tasks and RDS)
  • NAT Gateways (for outbound internet access from private subnets)
  • Route Tables (directing traffic appropriately)

2. Security Groups

Virtual Firewalls

Diagram: Security groups (inbound/outbound rules)

📖 Understanding Security Groups

Security Groups act as virtual firewalls that control inbound and outbound traffic for your AWS resources. Here’s what makes them useful:

  • Stateful: If you allow inbound traffic, the response is automatically allowed outbound
  • Resource-Level: Attached to network interfaces (EC2, RDS, ALB, ECS tasks)
  • Rule-Based: Define rules based on protocol, port, and source/destination
  • Default Deny: All traffic is denied by default unless explicitly allowed

Security groups are your first line of defense. They ensure only authorized traffic reaches your resources – think of them as bouncers at a club, checking IDs before letting anyone in.

You’ll create three security groups:

  • ALB-SG: Allows HTTP/HTTPS from internet (0.0.0.0/0)
  • ECS-SG: Allows ports 80 and 3001 from ALB-SG only
  • RDS-SG: Allows PostgreSQL (port 5432) from ECS-SG only

3. IAM Roles

Access Control & Permissions

Diagrams: ECS Task Execution Role, Frontend Task Role, Backend Task Role

📖 Understanding IAM Roles

IAM (Identity and Access Management) Roles are AWS identities with permission policies that define what actions can be performed. For ECS, you’ll work with two types:

  • Task Execution Role: Used by ECS infrastructure to pull Docker images from ECR, write logs to CloudWatch, and read secrets from Parameter Store. This is required – without it, ECS can’t pull your images.
  • Task Role: Used by your application code running inside containers to access AWS services (S3, DynamoDB, etc.). This is optional but recommended if your app needs AWS API access.

Roles follow the principle of least privilege – each component only gets the permissions it needs. This keeps your infrastructure secure and makes troubleshooting easier.

You’ll create three roles:

  • ecsTaskExecutionRole: For ECS to pull images, write logs, read parameters
  • frontendTaskRole: For frontend application (optional CloudWatch permissions)
  • backendTaskRole: For backend application (RDS, S3 access if needed)

4. Elastic Container Registry (ECR)

Docker Image Storage

Diagram: ECR repositories (frontend-repo / backend-repo)

📖 Understanding ECR

Amazon Elastic Container Registry (ECR) is a fully managed Docker container registry that stores, manages, and deploys Docker container images. It’s integrated with ECS, so pushing images and having ECS pull them is seamless.

Think of ECR as a private Docker Hub for your AWS account. When you build your application, you push the image to ECR, and ECS pulls it automatically when starting tasks. This keeps your images secure and close to where they’re used.

You’ll create two repositories:

  • frontend-repo: Stores frontend Docker images
  • backend-repo: Stores backend Docker images

5. RDS PostgreSQL Database

Managed Database Service

📖 Understanding RDS

Amazon Relational Database Service (RDS) is a managed database service that makes it easy to set up, operate, and scale relational databases in the cloud. Instead of managing database servers yourself, RDS handles:

  • Provisioning: Automated database setup
  • Backups: Automated daily backups with point-in-time recovery
  • Scaling: Easy vertical scaling (instance size) and read replicas
  • Maintenance: Automated patching and updates

Your backend application needs a database to store data. RDS provides a managed PostgreSQL database that’s secure, scalable, and reliable – no need to worry about database server management.

6. Amazon ECS (Elastic Container Service)

Container Orchestration Platform

Diagram: ECS Cluster (services and tasks)

📖 Understanding Amazon ECS

Amazon Elastic Container Service (ECS) is a fully managed container orchestration service that makes it easy to deploy, manage, and scale containerized applications on AWS. It’s AWS’s native container orchestration platform, designed specifically for AWS infrastructure.

Core Components:

  • Cluster: A logical grouping of tasks or services. Think of it as a container for your containerized applications.
  • Task Definition: A blueprint that describes how to run your container (CPU, memory, image, ports, environment variables, etc.)
  • Task: A running instance of a task definition. This is your actual containerized application running.
  • Service: Maintains a desired number of running tasks simultaneously. Handles load balancing, auto-scaling, and health monitoring.
  • Container: A lightweight, portable package containing your application and its dependencies.

Launch Types:

  • Fargate (Serverless): No EC2 instances to manage. AWS handles all infrastructure. Pay only for running containers.
  • EC2: You manage EC2 instances. More control but more management overhead.

What Makes ECS Powerful:

  • Fully Managed: AWS handles infrastructure provisioning, scaling, and patching
  • Integrated: Works seamlessly with ALB, CloudWatch, ECR, IAM, and other AWS services
  • Scalable: Automatically scales based on demand or manual configuration
  • Secure: Built-in security features, IAM integration, VPC networking
  • Cost-Effective: Pay only for what you use (Fargate) or optimize EC2 costs

🔄 ECS vs Other AWS Services – Comparison Table

Service | Use Case | Infrastructure | Scaling | Best For
ECS Fargate | Containerized applications | Serverless (AWS managed) | Automatic/Manual | Microservices, web apps, APIs
ECS EC2 | Containerized applications | You manage EC2 instances | Automatic/Manual | When you need instance-level control
EKS (Kubernetes) | Kubernetes workloads | Managed Kubernetes control plane | Kubernetes autoscaling | Complex orchestration, multi-cloud
Lambda | Event-driven functions | Fully serverless | Automatic (per request) | Short-lived functions, event processing
EC2 | Traditional applications | You manage everything | Manual | Legacy apps, full control needed
App Runner | Containerized web apps | Fully managed | Automatic | Simple deployments, CI/CD built-in

🎯 When to Use ECS?

Use ECS Fargate when:

  • ✅ You want to run containerized applications without managing servers
  • ✅ You need automatic scaling and load balancing
  • ✅ You want to focus on application code, not infrastructure
  • ✅ You need high availability and fault tolerance
  • ✅ You want seamless integration with other AWS services
  • ✅ You’re building microservices architectures
  • ✅ You need to run long-running applications (not just functions)
  • ✅ You want predictable pricing based on CPU/memory

Don’t use ECS when:

  • ❌ You need sub-second cold starts (use Lambda instead)
  • ❌ You need Kubernetes-specific features (use EKS instead)
  • ❌ You’re running very simple, single-container apps (consider App Runner)
  • ❌ You need instance-level OS access and customization (use EC2)
  • ❌ Your workload is purely event-driven with no long-running processes (consider Lambda)

🚀 Deployment Strategies in ECS

ECS supports multiple deployment strategies to ensure zero-downtime updates:

1. Rolling Update (Default)
  • How it works: Gradually replaces old tasks with new ones
  • Process: Starts new tasks → Waits for health checks → Shifts traffic → Stops old tasks
  • Configuration: Minimum healthy percent: 100%, Maximum percent: 200%
  • Best for: Most applications, standard deployments
  • Downtime: Zero downtime if configured correctly
2. Blue/Green Deployment
  • How it works: Creates a completely new set of tasks alongside old ones
  • Process: Deploy new version → Test thoroughly → Switch traffic → Keep old version for rollback
  • Configuration: Use separate services or task definitions
  • Best for: Critical applications, when you need instant rollback
  • Downtime: Zero downtime
3. Canary Deployment
  • How it works: Gradually shifts traffic from old to new version
  • Process: Deploy new version → Route 10% traffic → Monitor → Gradually increase to 100%
  • Configuration: Use ALB weighted target groups or separate services
  • Best for: Testing new versions with real traffic, risk reduction
  • Downtime: Zero downtime
4. Circuit Breaker Pattern
  • How it works: Automatically rolls back if deployment fails
  • Process: Deploy → Monitor health checks → Auto-rollback on failure
  • Configuration: Enable deployment circuit breaker in service
  • Best for: Automated safety, preventing bad deployments
  • Downtime: Prevents downtime from bad deployments
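
The rolling-update percentages and the circuit breaker described above are ordinary service settings. As a hedged illustration (not part of the original walkthrough), this AWS CLI call applies them to the frontend service created later in this guide; swap in your own cluster and service names:

# Apply rolling-update limits plus automatic rollback on failed deployments
aws ecs update-service \
  --cluster ecs-production-cluster \
  --service frontend-service \
  --deployment-configuration \
    'maximumPercent=200,minimumHealthyPercent=100,deploymentCircuitBreaker={enable=true,rollback=true}'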

🔒 Security Considerations in ECS

ECS provides multiple layers of security:

1. Network Security
  • VPC Isolation: Run tasks in private subnets, isolated from internet
  • Security Groups: Control inbound/outbound traffic at task level
  • Network ACLs: Additional layer of network security (optional)
  • No Public IPs: Tasks in private subnets don’t need public IPs
2. IAM Security
  • Task Execution Role: Minimal permissions for ECS to pull images, write logs
  • Task Role: Application-level permissions (principle of least privilege)
  • Service-Linked Roles: For ECS service operations
  • No Hardcoded Credentials: Use IAM roles, not access keys
3. Container Security
  • Image Scanning: ECR automatically scans images for vulnerabilities
  • Secrets Management: Inject secrets from Parameter Store or Secrets Manager rather than hardcoding plaintext values in task definitions
  • Least Privilege: Run containers with minimal required permissions
  • Image Source: Only pull images from trusted registries (ECR)
4. Runtime Security
  • Logging: All container logs go to CloudWatch (audit trail)
  • Monitoring: CloudWatch metrics for security events
  • Encryption: Encrypt data in transit (TLS) and at rest (EBS encryption)
  • Compliance: ECS supports HIPAA, PCI-DSS, SOC compliance
5. Best Practices
  • ✅ Use private subnets for tasks
  • ✅ Enable VPC Flow Logs for network monitoring
  • ✅ Regularly update container images with security patches
  • ✅ Use secrets management (Parameter Store/Secrets Manager)
  • ✅ Enable CloudTrail for API call auditing
  • ✅ Implement least privilege IAM policies
  • ✅ Use ECR image scanning
  • ✅ Enable encryption for EBS volumes

❓ Technical Q&A – Common ECS Questions

Q1: What’s the difference between ECS Fargate and ECS EC2?

A: Fargate is serverless – AWS manages all infrastructure. EC2 requires you to manage EC2 instances. Fargate is simpler but EC2 gives more control and can be more cost-effective at scale.

Q2: How does ECS handle container failures?

A: ECS automatically restarts failed containers. Services monitor task health and replace unhealthy tasks. Health checks ensure only healthy tasks receive traffic.

Q3: Can I run multiple containers in one task?

A: Yes! A task definition can define multiple containers that share networking and storage. They’re always scheduled together on the same infrastructure.

Q4: How does ECS scale applications?

A: ECS can scale based on CloudWatch metrics (CPU, memory, request count). You configure target tracking policies, and ECS automatically adjusts the number of tasks.
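
As an example, a target-tracking policy can be attached from the CLI. This is a sketch using the cluster and service names from this guide; the 60% CPU target and the 2–10 task range are illustrative values:

# Make the backend service scalable between 2 and 10 tasks
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/ecs-production-cluster/backend-service \
  --min-capacity 2 --max-capacity 10

# Keep average CPU utilization around 60%
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/ecs-production-cluster/backend-service \
  --policy-name backend-cpu-60 \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration \
    'TargetValue=60.0,PredefinedMetricSpecification={PredefinedMetricType=ECSServiceAverageCPUUtilization}'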

Q5: What happens during a deployment?

A: ECS starts new tasks with the updated image, waits for them to pass health checks, gradually shifts traffic from old to new tasks, then stops old tasks. This ensures zero downtime.

Q6: How do containers communicate with each other?

A: Containers in the same task share networking and can communicate via localhost. Containers in different tasks communicate via service discovery (DNS names) or through the load balancer.

Q7: Can I use ECS with Kubernetes?

A: No, ECS is AWS’s own orchestration platform. If you need Kubernetes, use EKS (Elastic Kubernetes Service). However, both can coexist in the same AWS account.

Q8: How much does ECS cost?

A: ECS itself is free. You pay for:

  • Fargate: CPU and memory resources used
  • EC2: EC2 instance costs
  • ALB: Load balancer hours and data transfer
  • ECR: Storage for container images
  • CloudWatch: Logs and metrics

Q9: Can I run stateful applications on ECS?

A: Yes, but it’s not recommended. ECS is designed for stateless applications. For stateful apps, use EBS volumes for persistent storage, or better yet, use managed services like RDS for databases.

Q10: How do I update my application without downtime?

A: Use rolling updates with proper configuration (minimum healthy percent: 100%, maximum percent: 200%). ECS will start new tasks, wait for health checks, shift traffic, then stop old tasks automatically.

Q11: What’s the difference between a task and a service?

A: A task is a single running container instance. A service maintains a desired number of tasks, handles load balancing, auto-scaling, and ensures tasks are always running.

Q12: Can I use Docker Compose with ECS?

A: Yes. Docker’s Compose CLI offers an ECS integration that can convert Compose files into ECS task definitions and services. However, native ECS task definitions offer more AWS-specific features.

Q13: How do I handle secrets in ECS?

A: Use AWS Systems Manager Parameter Store or Secrets Manager. Reference secrets in task definitions, and ECS will inject them as environment variables at runtime. Never hardcode secrets!

Q14: What happens if a task runs out of memory?

A: ECS will stop the task and start a new one. The service will maintain the desired count. Monitor CloudWatch metrics to adjust memory allocation in task definitions.

Q15: Can I run Windows containers on ECS?

A: Yes, ECS supports both Linux and Windows containers. However, Windows containers require EC2 launch type (not Fargate) and Windows EC2 instances.

✅ Key Takeaways

  • ECS is AWS’s native container orchestration platform
  • Fargate provides serverless containers – no infrastructure management
  • Great for microservices, web applications, and APIs
  • Supports multiple deployment strategies for zero-downtime updates
  • Built-in security features with VPC, IAM, and encryption
  • Integrates seamlessly with other AWS services

7. ECS Services

Running Applications

Diagram: ECS Services (frontend-service / backend-service)

📖 Understanding ECS Services

ECS Service runs and maintains a specified number of instances of a task definition simultaneously. It’s the component that keeps your application running.

What services handle:

  • Desired Count: Maintains specified number of running tasks
  • Load Balancing: Integrates with ALB to distribute traffic
  • Auto Scaling: Can scale based on demand
  • Rolling Updates: Zero-downtime deployments

Services ensure your application is always running. If a task crashes, the service automatically starts a new one – you don’t have to babysit it.

8. ECS Task Definitions

Application Blueprints

📖 Understanding Task Definitions

ECS Task Definition is a blueprint for your application. It describes everything ECS needs to know to run your containers:

  • Container Images: Which Docker images to use
  • Resources: CPU and memory allocation
  • Ports: Which ports to expose
  • Environment Variables: Configuration values
  • Secrets: Secure values from Parameter Store
  • Logging: Where to send container logs

Task definitions tell ECS exactly how to run your containers. You create the definition once, and ECS uses it to launch tasks. Think of it as a recipe – follow it, and you get the same result every time.

📦 Example: Frontend Task Definition (High-Level)

This is a simplified view of what your frontend-task definition looks like in JSON form. It shows the most important parts you’ll care about:

{
  "family": "frontend-task",
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "requiresCompatibilities": ["FARGATE"],
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "frontend",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/frontend-repo:latest",
      "portMappings": [
        { "containerPort": 80, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "NEXT_PUBLIC_API_URL", "value": "http://ALB-DNS-NAME/api" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/frontend",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}

How to read this:

  • family: Logical name of the task definition (frontend-task)
  • containerDefinitions: List of containers that run together (here only frontend)
  • image: Points to the Docker image in ECR (updated by CI/CD)
  • portMappings: Exposes port 80 from the container to the load balancer
  • environment: Plain configuration values (non-secret)
  • logConfiguration: Sends logs to CloudWatch Logs group /ecs/frontend

This example matches what you’ll actually create in Tab 2 (Step 9: Create ECS Task Definitions), but here it’s shown in a simplified, conceptual way.
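
If you keep this JSON in a file, you can also register it and roll it out from the CLI – a minimal sketch, assuming the file is named task-definition.json and the frontend service from Step 11 already exists:

# Register a new revision of the task definition from the JSON file
aws ecs register-task-definition --cli-input-json file://task-definition.json

# Point the service at the latest revision and let ECS perform a rolling update
aws ecs update-service \
  --cluster ecs-production-cluster \
  --service frontend-service \
  --task-definition frontend-task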

9. Application Load Balancer (ALB)

Traffic Distribution & Routing

Diagram: ALB (listeners, rules, target groups)

📖 Understanding Application Load Balancer

Application Load Balancer (ALB) is a Layer 7 (application layer) load balancer that distributes incoming HTTP/HTTPS traffic across multiple targets (your ECS tasks) in multiple Availability Zones.

What ALB does:

  • Layer 7 Routing: Routes based on HTTP/HTTPS content (URL path, host header)
  • Path-Based Routing: Routes /api/* to backend, / to frontend
  • Health Checks: Continuously monitors target health
  • High Availability: Automatically distributes across multiple AZs

How ALB Works:

  1. User requests come to ALB DNS name
  2. ALB examines the request path
  3. If path starts with /api/*, routes to backend target group
  4. If path is anything else, routes to frontend target group
  5. ALB performs health checks on targets
  6. Only healthy targets receive traffic

10. CloudWatch Log Groups

Centralized Logging

📖 Understanding CloudWatch Logs

Amazon CloudWatch Logs is a service for monitoring, storing, and accessing log files from AWS resources and applications.

You need to see what’s happening in your containers. CloudWatch Logs provides visibility into application behavior, errors, and performance – essential for debugging and monitoring.

You’ll create two log groups:

  • /ecs/frontend – Frontend application logs
  • /ecs/backend – Backend application logs

11. Systems Manager Parameter Store

Secure Configuration Storage

Diagram: Parameter Store (runtime configuration & secrets)

📖 Understanding Parameter Store

AWS Systems Manager Parameter Store provides secure, hierarchical storage for configuration data and secrets.

Store database credentials, API URLs, and other configuration securely here. ECS tasks can access these values without hardcoding secrets in your code – much safer than putting passwords in environment variables or code.

Two types of parameters:

  • String: Plain text values (e.g., URLs, ports)
  • SecureString: Encrypted values (e.g., passwords, API keys)
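
As a hedged example of how those two parameter types are created from the CLI (the parameter names here are illustrative; they just need to sit under the /ecs/ prefix that the execution role’s inline policy allows):

# Plain configuration value
aws ssm put-parameter --name /ecs/frontend/NEXT_PUBLIC_API_URL \
  --type String --value "http://ALB-DNS-NAME/api"

# Encrypted secret (uses the AWS-managed SSM key unless you pass --key-id)
aws ssm put-parameter --name /ecs/backend/DB_PASSWORD \
  --type SecureString --value "REPLACE_WITH_DB_PASSWORD"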

✅ You’ve Learned the Basics!

Now that you understand what each resource does, proceed to Tab 2: Step-by-Step Setup to learn how to create them.

💡 Pro Tip: Bookmark this page – you’ll want to reference these concepts as you build. Understanding the “why” makes troubleshooting much easier later.

📋 CI/CD Overview

This section covers setting up automated deployment using GitHub Actions. Once configured, every push to your main branch will automatically build Docker images, push them to ECR, and update your ECS services.

Prerequisites: All infrastructure resources from Steps 1-11 must be completed.

📦 Application Repositories

You need to set up CI/CD for both application repositories: the frontend and the backend.

Note: Each repository requires its own GitHub Actions workflow file and secrets configuration. The workflow process is the same for both, but with different repository names, ECR repositories, and ECS services.

CI/CD Pipeline Architecture

GitHub Repository (code push) → GitHub Actions (build & deploy) → ECR (push image) → ECS Service (update & deploy)

Diagrams: CI/CD (Frontend), CI/CD (Backend)

12. GitHub Actions CI/CD Setup

Automated Deployment Pipeline

📖 Understanding GitHub Actions

GitHub Actions is a CI/CD platform that automates your software workflows. When you push code to GitHub, Actions can:

  • Build: Compile your application and create Docker images
  • Test: Run automated tests
  • Deploy: Push images to ECR and update ECS services

Manual deployments are error-prone and time-consuming. CI/CD ensures consistent, automated deployments every time you push code – no more “it works on my machine” issues.

Step 1: Configure GitHub Secrets

1

Navigate to Repository Settings

  • Go to your GitHub repository
  • Click “Settings” tab
  • Click “Secrets and variables” → “Actions”
2

Create AWS Credentials Secret

Click “New repository secret” and add:

  • Name: AWS_ACCESS_KEY_ID
  • Value: Your AWS access key ID
  • Click “Add secret”
⚠️ Security: Never commit AWS credentials to your repository. Always use GitHub Secrets.
3

Add Remaining AWS Secrets

Add these AWS credential secrets (one at a time):

Secret Name | Description
AWS_SECRET_ACCESS_KEY | Your AWS secret access key
AWS_REGION | AWS region, e.g., us-east-1
💡 Note: You don’t need to add ECR_REPOSITORY, ECS_SERVICE, or ECS_CLUSTER as secrets. These values are hardcoded in the workflow file’s env section, which is the recommended approach.
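
If you prefer the command line, the GitHub CLI can set the same secrets. A sketch, assuming the repository names used later in this guide; the values shown are placeholders (avoid leaving real keys in your shell history):

# Repeat for the backend repository
gh secret set AWS_ACCESS_KEY_ID --repo m-saad-siddique/frontend --body "AKIA..."
gh secret set AWS_SECRET_ACCESS_KEY --repo m-saad-siddique/frontend --body "REPLACE_ME"
gh secret set AWS_REGION --repo m-saad-siddique/frontend --body "us-east-1"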

🔄 How the GitHub Actions Workflow Works

The CI/CD pipeline automates the entire deployment process. Here’s how it works step by step:

  1. Trigger: When you push code to the main branch, GitHub Actions automatically detects the change and starts the workflow.
  2. Checkout Code: The workflow checks out your repository code to the GitHub Actions runner (a virtual machine).
  3. Configure AWS Credentials: Uses the AWS credentials stored in GitHub Secrets to authenticate with your AWS account.
  4. Login to ECR: Authenticates Docker with Amazon ECR so it can push images.
  5. Build Docker Image: Builds your application into a Docker image using the Dockerfile in your repository.
  6. Tag Image: Tags the image with the Git commit SHA (unique identifier) for version tracking.
  7. Push to ECR: Uploads the Docker image to your ECR repository.
  8. Download Task Definition: Retrieves the current ECS task definition from AWS.
  9. Update Task Definition: Updates the task definition with the new image URI (points to the newly pushed image).
  10. Deploy to ECS: Updates the ECS service with the new task definition, triggering a rolling deployment.
  11. Wait for Stability: Waits for the new tasks to become healthy before completing.

Rolling Deployment Process:

  • ECS starts new tasks with the updated image
  • New tasks register with the target group and pass health checks
  • ALB gradually shifts traffic from old tasks to new tasks
  • Old tasks are stopped once new tasks are healthy
  • This ensures zero-downtime deployments

For Frontend Repository: The workflow builds the Next.js application, pushes to frontend-repo in ECR, and updates frontend-service in ECS.

For Backend Repository: The workflow builds the Node.js/TypeScript backend, pushes to backend-repo in ECR, and updates backend-service in ECS.

Step 2: Create GitHub Actions Workflow

1

Create Workflow Directory

  • In your repository root, create: .github/workflows/
  • Create file: .github/workflows/deploy.yml
2

Frontend Workflow Example

For the Frontend Repository (github.com/m-saad-siddique/frontend), add this content to deploy.yml:

name: Deploy Frontend to ECS

on:
  push:
    branches: [ main ]

env:
  AWS_REGION: us-east-1
  ECR_REPOSITORY: frontend-repo
  ECS_SERVICE: frontend-service
  ECS_CLUSTER: ecs-production-cluster
  ECS_TASK_DEFINITION: frontend-task
  CONTAINER_NAME: frontend

jobs:
  deploy:
    name: Deploy
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build, tag, and push image to Amazon ECR
        id: build-image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG .
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> $GITHUB_OUTPUT

      - name: Download task definition
        run: |
          aws ecs describe-task-definition \
            --task-definition $ECS_TASK_DEFINITION \
            --query taskDefinition > task-definition.json

      - name: Fill in the new image ID in the Amazon ECS task definition
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: ${{ env.CONTAINER_NAME }}
          image: ${{ steps.build-image.outputs.image }}

      - name: Deploy Amazon ECS task definition
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: ${{ env.ECS_SERVICE }}
          cluster: ${{ env.ECS_CLUSTER }}
          wait-for-service-stability: true
3

Backend Workflow

For the Backend Repository (github.com/m-saad-siddique/backend), use the same workflow structure but update these environment variables:

Environment Variable | Frontend Value | Backend Value
ECR_REPOSITORY | frontend-repo | backend-repo
ECS_SERVICE | frontend-service | backend-service
ECS_TASK_DEFINITION | frontend-task | backend-task
CONTAINER_NAME | frontend | backend-app
💡 Tip: Also update the workflow name in the YAML file. Change name: Deploy Frontend to ECS to name: Deploy Backend to ECS for clarity.
4

Create Task Definition File

  • Create task-definition.json in repository root
  • Copy the task definition JSON from ECS Console
  • GitHub Actions will update the image URI automatically

Step 3: Test the Pipeline

1

Commit and Push

  • Commit the workflow file and task definition
  • Push to main branch
2

Monitor Deployment

  • Go to GitHub → “Actions” tab
  • Watch the workflow run
  • Check ECS Console to see service updating
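
You can also watch the rollout from the CLI instead of the console – a small sketch using the names from this guide:

# Show the state of the current deployments for the frontend service
aws ecs describe-services \
  --cluster ecs-production-cluster \
  --services frontend-service \
  --query 'services[0].deployments[].{status:status,desired:desiredCount,running:runningCount,rollout:rolloutState}' \
  --output table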

📋 Workflow Execution Summary

When you push code to the Frontend Repository:

  1. GitHub Actions triggers the workflow
  2. Builds Next.js frontend application into Docker image
  3. Tags image with commit SHA (e.g., frontend-repo:abc123)
  4. Pushes image to frontend-repo in ECR
  5. Downloads current frontend-task task definition
  6. Updates task definition with new image URI
  7. Deploys updated task definition to frontend-service
  8. ECS performs rolling update (starts new tasks, shifts traffic, stops old tasks)
  9. New frontend version is live on ALB

When you push code to the Backend Repository:

  1. GitHub Actions triggers the workflow
  2. Builds Node.js/TypeScript backend application into Docker image
  3. Tags image with commit SHA (e.g., backend-repo:xyz789)
  4. Pushes image to backend-repo in ECR
  5. Downloads current backend-task task definition
  6. Updates task definition with new image URI
  7. Deploys updated task definition to backend-service
  8. ECS performs rolling update (starts new tasks, shifts traffic, stops old tasks)
  9. New backend version is live and accessible via /api/* routes
🔄 Independent Deployments: Frontend and backend deployments are completely independent. You can deploy one without affecting the other. Each repository has its own workflow, ECR repository, and ECS service.

✅ CI/CD Setup Complete!

Your pipeline is now configured. Every push to main will automatically:

  1. Build your Docker image
  2. Push to ECR
  3. Update ECS service with new image
  4. Perform rolling deployment

Next Steps:

  • Push code to main branch in either repository
  • Monitor the deployment in GitHub Actions tab
  • Verify the new version is running in ECS Console
  • Test your application via the ALB DNS name

🛠️ Implementation Guide

This section provides detailed, step-by-step instructions for creating each AWS resource. Follow these steps in order – each resource depends on previously created resources.

Prerequisites: Make sure you’ve read Tab 1 to understand what each resource does.

Estimated Time: 2-3 hours for complete setup

🚀 Automated Provisioning with Scripts

Prefer automated provisioning? Instead of manually creating resources through the AWS Console, you can use our automated bash scripts to provision all infrastructure resources.

📦 Public Repository: The complete set of provisioning scripts is available in our public GitHub repository:

🔗 https://github.com/m-saad-siddique/ecs-resource-provision-scripts

📁 Local Scripts Directory: If you want to provision this infrastructure using scripts, use the scripts/ directory in this project. The scripts directory contains:

  • Step-by-step bash scripts that provision all resources using AWS CLI
  • Configuration file (config.sh) for customizing your deployment
  • Deploy-all script to run all provisioning steps in sequence
  • Individual scripts for each resource (VPC, IAM, ECR, RDS, ECS, ALB, etc.)

💡 Quick Start with Scripts:

  1. Navigate to the scripts/ directory in this project
  2. Configure config.sh with your AWS profile, region, and project settings
  3. Run ./deploy-all.sh to provision all resources automatically
  4. Or run individual scripts in order (01-networking.sh, 02-iam-roles.sh, etc.)

📖 For detailed script documentation: See the README.md file in the scripts/ directory for complete instructions, prerequisites, and troubleshooting.

Note: The manual step-by-step instructions below are still available if you prefer to create resources through the AWS Console. Choose the method that works best for you!

⚠️ Important: Follow these steps in the exact order presented. Each step depends on resources created in previous steps. Skipping steps will cause errors – trust me, I’ve been there.

📝 Quick Reference

Each resource below includes detailed step-by-step instructions. Complete the resources in the order listed – each section includes:

  • Navigation steps (where to click in AWS Console)
  • Configuration details (what values to enter)
  • Verification steps (how to confirm it’s working)
  • Important notes and warnings

Note: Refer to Tab 1 to understand what each resource does, then follow the detailed steps below in your AWS Console.

1. Virtual Private Cloud (VPC)

Network Foundation

Diagrams: VPC overview, subnets (public/private), route tables, NAT Gateway

📖 Understanding VPC

A Virtual Private Cloud (VPC) is your own isolated network environment within AWS. Think of it as your private data center in the cloud where you have complete control over:

  • IP Address Ranges: Define your own CIDR blocks (e.g., 10.0.0.0/16)
  • Subnets: Divide your network into public (internet-facing) and private (internal) subnets
  • Routing: Control how traffic flows between subnets and to the internet
  • Security: Implement network-level security with security groups and network ACLs

VPC provides network isolation and security. Your ECS tasks, RDS database, and load balancer all run within this isolated network, protected from the public internet except through controlled entry points.

⚠️ Pro Tip: Use the “VPC and more” option in AWS Console to automatically create subnets, NAT gateways, and route tables. This saves time and ensures proper configuration – manually creating these is tedious and error-prone.

How to Create It:

1

Navigate to VPC Console

  • Log in to AWS Management Console
  • Search for “VPC” in the top search bar
  • Click on “VPC” service
2

Create VPC

  • Click “Create VPC” button (orange button, top right)
  • Select “VPC and more” option
  • Fill in the configuration:

Name tag: ecs-production-vpc
IPv4 CIDR block: 10.0.0.0/16
IPv6 CIDR block: No IPv6 (unchecked)
Tenancy: Default
Number of Availability Zones: 2
Number of public subnets: 2
Number of private subnets: 2
NAT gateways: 1 per AZ (creates 2 NAT gateways)
VPC endpoints: None (default)

  • Click “Create VPC”
  • Wait 2-3 minutes for creation
3

Verify DNS Settings

  • Click on your VPC name to view details
  • Go to “Actions” → “Edit VPC settings”
  • Ensure “DNS hostnames” is Enabled
  • Click “Save changes”
4

Document Important Information

Write down these values (you’ll need them later):

  • VPC ID: e.g., vpc-0123456789abcdef0
  • Public Subnet IDs: Two subnet IDs (one per AZ)
  • Private Subnet IDs: Two subnet IDs (one per AZ)
  • Internet Gateway ID: e.g., igw-0123456789abcdef0
  • NAT Gateway IDs: Two NAT gateway IDs
✅ What was created: The “VPC and more” option automatically created:
  • VPC with DNS hostnames enabled
  • Internet Gateway (attached to VPC)
  • 2 Public subnets (one per availability zone)
  • 2 Private subnets (one per availability zone)
  • 2 NAT Gateways (one in each public subnet)
  • Route tables (public routes to IGW, private routes to NAT)
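
To capture the IDs listed above without clicking through the console, you can query them with the AWS CLI. This is a sketch; the Name tag filter may need adjusting, because the “VPC and more” wizard appends suffixes to the name you typed:

# Look up the VPC ID by its Name tag (adjust the tag value to what the wizard actually created)
VPC_ID=$(aws ec2 describe-vpcs \
  --filters Name=tag:Name,Values=ecs-production-vpc-vpc \
  --query 'Vpcs[0].VpcId' --output text)

# List subnets with their CIDR blocks and availability zones
aws ec2 describe-subnets --filters Name=vpc-id,Values="$VPC_ID" \
  --query 'Subnets[].{Id:SubnetId,Cidr:CidrBlock,Az:AvailabilityZone}' --output table

# List the NAT gateways created in the public subnets
aws ec2 describe-nat-gateways --filter Name=vpc-id,Values="$VPC_ID" \
  --query 'NatGateways[].NatGatewayId' --output text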

2. Security Groups

Virtual Firewalls

Diagram: Security groups (inbound/outbound rules)

📖 What are Security Groups?

Security Groups act as virtual firewalls that control inbound and outbound traffic for your AWS resources. Key characteristics:

  • Stateful: If you allow inbound traffic, the response is automatically allowed outbound
  • Resource-Level: Attached to network interfaces (EC2, RDS, ALB, ECS tasks)
  • Rule-Based: Define rules based on protocol, port, and source/destination
  • Default Deny: All traffic is denied by default unless explicitly allowed

Security groups provide the first line of defense, ensuring only authorized traffic reaches your resources. You’ll create three security groups:

  • ALB-SG: Allows HTTP/HTTPS from internet to load balancer
  • ECS-SG: Allows traffic from ALB to ECS tasks (ports 80, 3001)
  • RDS-SG: Allows PostgreSQL traffic from ECS tasks only

Create ALB Security Group:

1

Navigate to Security Groups

  • In VPC Console, click “Security Groups” in left sidebar
  • Click “Create security group”
2

Configure Basic Settings

  • Security group name: ALB-SG
  • Description: Security group for Application Load Balancer
  • VPC: Select ecs-production-vpc
3

Add Inbound Rules

Click “Add rule” twice to add:

  • Rule 1:
    • Type: HTTP
    • Source: 0.0.0.0/0
    • Description: Allow HTTP from internet
  • Rule 2:
    • Type: HTTPS
    • Source: 0.0.0.0/0
    • Description: Allow HTTPS from internet

Leave outbound rules as default (allows all)

4

Create and Document

  • Click “Create security group”
  • Note down the Security Group ID (e.g., sg-0123456789abcdef0)

Create ECS Security Group:

1

Create New Security Group

  • Click “Create security group” again
  • Name: ECS-SG
  • Description: Security group for ECS tasks
  • VPC: Select your VPC
2

Add Inbound Rules

Add two rules (click “Add rule” for each):

  • Rule 1 – Port 80:
    • Type: Custom TCP
    • Port range: 80
    • Source: Select “Custom” → Choose ALB-SG security group
    • Description: Allow port 80 from ALB
  • Rule 2 – Port 3001:
    • Type: Custom TCP
    • Port range: 3001
    • Source: Select “Custom” → Choose ALB-SG security group
    • Description: Allow port 3001 from ALB
Why reference ALB-SG? This ensures only traffic from the load balancer can reach your ECS tasks, not direct internet traffic. This is a security best practice.
3

Click “Create security group” and note the Security Group ID

Create RDS Security Group:

1

Create Security Group

  • Click “Create security group”
  • Name: RDS-SG
  • Description: Security group for RDS database
  • VPC: Select your VPC
2

Add Inbound Rule

  • Click “Add rule”
  • Type: PostgreSQL (or Custom TCP)
  • Port range: 5432
  • Source: Select “Custom” → Choose ECS-SG security group
  • Description: Allow PostgreSQL from ECS tasks
🔒 Security Note: Only ECS tasks can access the database. No direct internet access is allowed, making your database secure.
3

Click “Create security group” and note the Security Group ID
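
If you would rather script these three groups, here is a minimal AWS CLI sketch of the same rules. The VPC ID is a placeholder; substitute the one you noted in Step 1:

VPC_ID=vpc-0123456789abcdef0

ALB_SG=$(aws ec2 create-security-group --group-name ALB-SG \
  --description "Security group for Application Load Balancer" \
  --vpc-id "$VPC_ID" --query GroupId --output text)
aws ec2 authorize-security-group-ingress --group-id "$ALB_SG" \
  --protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id "$ALB_SG" \
  --protocol tcp --port 443 --cidr 0.0.0.0/0

ECS_SG=$(aws ec2 create-security-group --group-name ECS-SG \
  --description "Security group for ECS tasks" \
  --vpc-id "$VPC_ID" --query GroupId --output text)
# Only the ALB may reach the tasks on ports 80 and 3001
aws ec2 authorize-security-group-ingress --group-id "$ECS_SG" \
  --protocol tcp --port 80 --source-group "$ALB_SG"
aws ec2 authorize-security-group-ingress --group-id "$ECS_SG" \
  --protocol tcp --port 3001 --source-group "$ALB_SG"

RDS_SG=$(aws ec2 create-security-group --group-name RDS-SG \
  --description "Security group for RDS database" \
  --vpc-id "$VPC_ID" --query GroupId --output text)
# Only ECS tasks may reach PostgreSQL
aws ec2 authorize-security-group-ingress --group-id "$RDS_SG" \
  --protocol tcp --port 5432 --source-group "$ECS_SG"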

3. IAM Roles

Access Control & Permissions

Diagrams: ECS Task Execution Role, Frontend Task Role, Backend Task Role

📖 Understanding IAM Roles

IAM (Identity and Access Management) Roles are AWS identities with permission policies that define what actions can be performed. For ECS, you’ll work with two types:

  • Task Execution Role: Used by ECS infrastructure to pull Docker images from ECR, write logs to CloudWatch, and read secrets from Parameter Store. This is required – without it, ECS can’t pull your images.
  • Task Role: Used by your application code running inside containers to access AWS services (S3, DynamoDB, etc.). This is optional but recommended if your app needs AWS API access.

Roles follow the principle of least privilege – each component only gets the permissions it needs. The execution role handles infrastructure tasks, while the task role handles application-level AWS API calls.

Create ECS Task Execution Role:

1

Navigate to IAM

  • Search for “IAM” in AWS Console
  • Click “Roles” in left sidebar
  • Click “Create role”
2

Select Trusted Entity

  • Select “AWS service”
  • Under “Use case”, select “Elastic Container Service”
  • Select “Elastic Container Service Task”
  • Click “Next”
3

Attach Permissions

  • Search for and select: AmazonECSTaskExecutionRolePolicy
  • This provides permissions for:
    • Pulling images from ECR
    • Writing logs to CloudWatch
    • Basic Parameter Store access
  • Click “Next”
4

Name the Role

  • Role name: ecsTaskExecutionRole
  • Description: Role for ECS task execution - allows pulling images, writing logs, reading parameters
  • Click “Create role”
5

Add Parameter Store Permissions

  • Click on the role name ecsTaskExecutionRole
  • Click “Add permissions” → “Create inline policy”
  • Click “JSON” tab
  • Paste this policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameters",
        "ssm:GetParameter",
        "ssm:GetParametersByPath"
      ],
      "Resource": "arn:aws:ssm:*:*:parameter/ecs/*"
    },
    {
      "Effect": "Allow",
      "Action": ["kms:Decrypt"],
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "kms:ViaService": "ssm.*.amazonaws.com"
        }
      }
    }
  ]
}
  • Click “Next”
  • Policy name: ParameterStoreAccess
  • Click “Create policy”

Create Frontend Task Role:

1

Create Role

  • Go to IAM → Roles → “Create role”
  • Select “AWS service” → “Elastic Container Service” → “Elastic Container Service Task”
  • Click “Next” (no permissions needed here)
  • Role name: frontendTaskRole
  • Description: Task role for frontend application
  • Click “Create role”
Note: Task roles are optional. Add CloudWatch permissions later if your app needs to publish custom metrics.

Create Backend Task Role:

1

Create Role

  • Click “Create role” again
  • Same process: “AWS service” → “Elastic Container Service” → “Elastic Container Service Task”
  • Role name: backendTaskRole
  • Description: Task role for backend application
  • Click “Create role”
Future Enhancement: If your backend needs S3 access, add S3 permissions to this role later.
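
The same three roles can be created with the AWS CLI – a sketch, assuming a shared trust policy file; the inline Parameter Store policy from step 5 above can be attached with put-role-policy once you save it to a file:

# Trust policy letting ECS tasks assume the roles
cat > ecs-tasks-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ecs-tasks.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF

aws iam create-role --role-name ecsTaskExecutionRole \
  --assume-role-policy-document file://ecs-tasks-trust.json
aws iam attach-role-policy --role-name ecsTaskExecutionRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy

# Attach the Parameter Store inline policy (saved locally as parameter-store-access.json)
aws iam put-role-policy --role-name ecsTaskExecutionRole \
  --policy-name ParameterStoreAccess \
  --policy-document file://parameter-store-access.json

# Task roles start with no permissions; add policies later as needed
aws iam create-role --role-name frontendTaskRole \
  --assume-role-policy-document file://ecs-tasks-trust.json
aws iam create-role --role-name backendTaskRole \
  --assume-role-policy-document file://ecs-tasks-trust.json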

4. Elastic Container Registry (ECR)

Docker Image Storage

Diagram: ECR repositories (frontend-repo / backend-repo)

📖 Understanding ECR

Amazon Elastic Container Registry (ECR) is a fully managed Docker container registry that stores, manages, and deploys Docker container images. It’s integrated with ECS, so pushing images and having ECS pull them is seamless.

Think of ECR as a private Docker Hub for your AWS account. When you build your application, you push the image to ECR, and ECS pulls it automatically when starting tasks. This keeps your images secure and close to where they’re used.

Create Frontend Repository:

1

Navigate to ECR

  • Search for “ECR” in AWS Console
  • Click “Repositories” → “Create repository”
2

Configure Repository

  • Visibility: Private
  • Repository name: frontend-repo
  • Tag immutability: Mutable
  • Scan on push: Enable (for security scanning)
  • Click “Create repository”
3

Configure Lifecycle Policy

  • Click “Lifecycle policy” tab
  • Click “Create lifecycle policy”
  • Select “Keep last 10 images”
  • Click “Create policy”
This automatically deletes old images, keeping only the 10 most recent to save storage costs.
4

Note Repository URI

Copy the repository URI (e.g., 123456789012.dkr.ecr.us-east-1.amazonaws.com/frontend-repo)

Create Backend Repository:

1

Repeat the same process for backend-repo

  • Name: backend-repo
  • Enable scan on push
  • Create lifecycle policy (keep last 10)
  • Note the URI
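
The equivalent AWS CLI calls, as a sketch (the lifecycle policy JSON mirrors the console’s “keep last 10 images” rule):

aws ecr create-repository --repository-name frontend-repo \
  --image-scanning-configuration scanOnPush=true
aws ecr create-repository --repository-name backend-repo \
  --image-scanning-configuration scanOnPush=true

# Keep only the 10 most recent images in each repository
cat > keep-last-10.json <<'EOF'
{
  "rules": [{
    "rulePriority": 1,
    "description": "Keep last 10 images",
    "selection": {
      "tagStatus": "any",
      "countType": "imageCountMoreThan",
      "countNumber": 10
    },
    "action": { "type": "expire" }
  }]
}
EOF

aws ecr put-lifecycle-policy --repository-name frontend-repo \
  --lifecycle-policy-text file://keep-last-10.json
aws ecr put-lifecycle-policy --repository-name backend-repo \
  --lifecycle-policy-text file://keep-last-10.json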

5. RDS PostgreSQL Database

Managed Database Service

📖 Understanding RDS

Amazon Relational Database Service (RDS) is a managed database service that makes it easy to set up, operate, and scale relational databases in the cloud. Instead of managing database servers yourself, RDS handles:

  • Provisioning: Automated database setup
  • Backups: Automated daily backups with point-in-time recovery
  • Scaling: Easy vertical scaling (instance size) and read replicas
  • Maintenance: Automated patching and updates
  • Monitoring: CloudWatch integration for metrics

Your backend application needs a database to store data. RDS provides a managed PostgreSQL database that’s secure, scalable, and reliable – no need to worry about database server management.

Step 1: Create DB Subnet Group

1

Navigate to RDS

  • Search for “RDS” in AWS Console
  • Click “Subnet groups” in left sidebar
  • Click “Create DB subnet group”
2

Configure Subnet Group

  • Name: ecs-db-subnet-group
  • Description: Subnet group for ECS RDS database
  • VPC: Select ecs-production-vpc
  • Availability Zones: Select both AZs (us-east-1a, us-east-1b)
  • Subnets: Select both PRIVATE subnets (one from each AZ)
🔒 Critical: Only select PRIVATE subnets. Never put RDS in public subnets for security.
3

Click “Create” button

Step 2: Create PostgreSQL Database

1

Start Database Creation

  • Click “Databases” in left sidebar
  • Click “Create database”
2

Engine Configuration

  • Engine type: PostgreSQL
  • Version: Latest stable (e.g., PostgreSQL 15.x)
  • Templates: Production (or Free tier for testing)
3

Settings

  • DB instance identifier: ecs-postgres-db
  • Master username: postgres
  • Master password: Create a strong password (save securely!)
  • Confirm password: Re-enter password
4

Instance Configuration

  • DB instance class: db.t3.medium (or db.t3.micro/small for testing)
5

Storage

  • Storage type: General Purpose SSD (gp3)
  • Allocated storage: 20 GB (minimum)
  • Storage autoscaling: Enable (optional)
6

Connectivity

  • VPC: Select your VPC
  • DB subnet group: ecs-db-subnet-group
  • Public access: No (critical!)
  • VPC security group: Choose existing → Select RDS-SG
7

Database Options

  • Initial database name: fileanalyzer
8

Backup & Maintenance

  • Enable automated backups: Yes (recommended)
  • Backup retention period: 7 days
  • Enable auto minor version upgrade: Yes
9

Click “Create database” and wait 5-10 minutes

10

Document Database Endpoint

Once status is “Available”, click on database name and note:

  • Endpoint: e.g., ecs-postgres-db.xxxxx.us-east-1.rds.amazonaws.com
  • Port: 5432

You’ll need these for Parameter Store.
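
For reference, a hedged CLI sketch of the same subnet group and instance; the subnet and security group IDs are placeholders, and the password must be your own:

aws rds create-db-subnet-group \
  --db-subnet-group-name ecs-db-subnet-group \
  --db-subnet-group-description "Subnet group for ECS RDS database" \
  --subnet-ids subnet-PRIVATE-A subnet-PRIVATE-B

aws rds create-db-instance \
  --db-instance-identifier ecs-postgres-db \
  --engine postgres \
  --db-instance-class db.t3.medium \
  --allocated-storage 20 \
  --master-username postgres \
  --master-user-password 'REPLACE_WITH_STRONG_PASSWORD' \
  --db-name fileanalyzer \
  --db-subnet-group-name ecs-db-subnet-group \
  --vpc-security-group-ids sg-RDS-SG-ID \
  --no-publicly-accessible \
  --backup-retention-period 7

# Wait until the instance is available, then print its endpoint
aws rds wait db-instance-available --db-instance-identifier ecs-postgres-db
aws rds describe-db-instances --db-instance-identifier ecs-postgres-db \
  --query 'DBInstances[0].Endpoint.Address' --output text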

6. ECS Cluster

Container Orchestration Platform

Diagram: ECS Cluster (services and tasks)

📖 Understanding ECS Clusters

Amazon ECS Cluster is a logical grouping of tasks or services. Think of it as a container for your containerized applications. For AWS Fargate:

  • No Infrastructure Management: AWS manages all servers, containers, and networking – you just define what to run
  • Namespace: Groups your services and tasks together in one logical unit
  • Serverless: No EC2 instances to manage – Fargate handles everything
  • Scaling: Automatically handles scaling based on demand and service configuration
  • Integration: Works seamlessly with ALB, CloudWatch, ECR, Parameter Store, and other AWS services
  • Multi-Service Support: Can run multiple services (frontend, backend) in the same cluster

The cluster is where your ECS services run. It provides the platform for running your containerized applications. Without a cluster, you cannot run ECS services or tasks – think of it as the foundation.

Important concepts:

  • Cluster: The container that holds your services
  • Service: Maintains a desired number of running tasks (e.g., frontend-service, backend-service)
  • Task: A running instance of a task definition (your containerized application)
  • Task Definition: Blueprint that describes how to run your container (CPU, memory, image, environment variables)

🔄 ECS Creation Flow

The ECS setup process follows this order:

  1. Create Cluster (this step) – Sets up the container platform
  2. Create Task Definitions – Define how containers should run
  3. Create Services – Run and maintain tasks based on task definitions
  4. Services connect to ALB – Traffic flows from ALB to services

Note: You’ll create the cluster now, but you’ll need task definitions and services created later (Steps 9 and 11).

How to Create It:

1

Navigate to ECS Console

  • Log in to AWS Management Console
  • In the top search bar, type “ECS”
  • Click on “Elastic Container Service” from the services list
  • You’ll be taken to the ECS Dashboard
2

Access Clusters Section

  • In the left navigation menu, click “Clusters”
  • You’ll see a list of existing clusters (if any) or an empty state
  • Click the orange “Create cluster” button (top right)
3

Configure Cluster Settings

You’ll see a form with several sections. Fill in the following:

Cluster configuration:
Cluster name: ecs-production-cluster

Infrastructure:
• Select “AWS Fargate (Serverless)” radio button
• This means AWS manages all infrastructure – no EC2 instances to manage

Monitoring:
CloudWatch Container Insights: Leave unchecked (optional, adds cost)
• You can enable this later if you need advanced monitoring

💡 Why Fargate? Fargate is serverless – you don’t manage servers. AWS handles all infrastructure, patching, and scaling. Great for most applications.
4

Review and Create

  • Review your cluster configuration
  • Click the orange “Create” button at the bottom
  • A success message will appear: “Cluster created successfully”
5

Verify Cluster Creation

  • You’ll be redirected to the cluster details page
  • Cluster status should show “Active”
  • Note the cluster name: ecs-production-cluster
  • You’ll see tabs for: Services, Tasks, Metrics, Logs, etc.
✅ Cluster Created! The cluster is now ready. You’ll add services to it in Step 11. For now, proceed to create the Application Load Balancer (Step 7).
⚠️ Things to Remember:
  • The cluster itself doesn’t cost anything – you only pay for running tasks
  • You can create multiple services in the same cluster (frontend and backend share the same cluster)
  • Cluster settings can be modified later if needed
  • Double-check you’re in the correct AWS region (top right corner) – resources are region-specific
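
The console steps above boil down to a single CLI call if you prefer scripting – a quick sketch:

aws ecs create-cluster --cluster-name ecs-production-cluster

# Confirm the cluster is ACTIVE
aws ecs describe-clusters --clusters ecs-production-cluster \
  --query 'clusters[0].status' --output text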

7. Application Load Balancer (ALB)

Traffic Distribution & Routing

Diagram: ALB (listeners, rules, target groups)

📖 Understanding Application Load Balancer

Application Load Balancer (ALB) is a Layer 7 (application layer) load balancer that distributes incoming HTTP/HTTPS traffic across multiple targets (your ECS tasks) in multiple Availability Zones. Think of it as a smart traffic director.

What ALB does:

  • Layer 7 Routing: Routes based on HTTP/HTTPS content (URL path, host header, HTTP headers) – not just IP addresses
  • Path-Based Routing: Routes /api/* to backend service, / to frontend service
  • Health Checks: Continuously monitors target health and routes only to healthy targets
  • SSL Termination: Handles SSL/TLS certificates – decrypts HTTPS traffic
  • High Availability: Automatically distributes across multiple Availability Zones
  • Auto Scaling: Automatically scales to handle traffic increases
  • Sticky Sessions: Can maintain session affinity (optional)

The ALB gives you a single entry point (one DNS name) for your entire application. It distributes requests across multiple ECS tasks for better performance, and if one task fails, traffic automatically routes to healthy tasks. This enables zero-downtime deployments and path-based routing (API calls to backend, web pages to frontend).

How ALB Works:

  1. User requests come to ALB DNS name (e.g., ecs-alb-xxx.elb.amazonaws.com)
  2. ALB examines the request path
  3. If path starts with /api/*, routes to backend target group
  4. If path is anything else, routes to frontend target group
  5. ALB performs health checks on targets
  6. Only healthy targets receive traffic

🔄 ALB Creation Flow

The ALB setup process follows this order:

  1. Create ALB – Set up the load balancer infrastructure
  2. Create Target Groups – Define where traffic should go (frontend-tg, backend-tg)
  3. Configure Listeners – Set up HTTP/HTTPS listeners on ports 80/443
  4. Configure Listener Rules – Set up path-based routing rules
  5. Connect ECS Services – ECS services register with target groups (done in Step 11)

Important: Target groups are created during ALB creation, but ECS tasks will register with them later when you create ECS services.

⚠️ Prerequisites:
  • VPC with public subnets must be created (Step 1)
  • ALB-SG security group must be created (Step 2)
  • ALB must be in public subnets to receive internet traffic

Step 1: Create ALB

1

Navigate to EC2 Console

  • Log in to AWS Management Console
  • In the top search bar, type “EC2”
  • Click on “EC2” service
  • In the left navigation menu, scroll down and click “Load Balancers”
  • You’ll see a list of existing load balancers (if any) or an empty state
  • Click the orange “Create Load Balancer” button (top right)
2

Select Load Balancer Type

You’ll see three load balancer types. Select:

  • Application Load Balancer (the first option)
  • This is the correct type for HTTP/HTTPS traffic routing
  • Click the orange “Create” button under Application Load Balancer
💡 Why Application Load Balancer? ALB operates at Layer 7 (application layer), allowing path-based routing. Network Load Balancer (Layer 4) doesn’t support path-based routing.
3

Basic Configuration

Fill in the basic settings:

Basic configuration:
Name: ecs-alb

Scheme:
• Select “Internet-facing” radio button
• This allows traffic from the internet
• Internal load balancers are only for VPC-internal traffic

IP address type:
• Select “IPv4” (default)

4

Network Mapping

Configure where the ALB will be placed:

Network mapping:
VPC: Select your VPC (ecs-production-vpc)

Mappings:
• Check both availability zones (us-east-1a and us-east-1b)
• For each AZ, select the PUBLIC subnet from the dropdown
• Do NOT select private subnets

🔒 Critical: ALB MUST be in public subnets. If you place it in private subnets, it won’t receive internet traffic and your application won’t be accessible.
5

Security Groups

Configure firewall rules:

  • You’ll see a default security group selected
  • Click the “X” next to the default security group to remove it
  • Click “Add security group”
  • Select ALB-SG from the list
  • This security group allows HTTP (port 80) and HTTPS (port 443) from the internet
Why ALB-SG? This security group was created in Step 2 and allows inbound HTTP/HTTPS traffic from anywhere (0.0.0.0/0), which is necessary for an internet-facing load balancer.
6

Listeners and Routing

Configure how traffic is received and routed:

Listeners:
Protocol: HTTP
Port: 80

Default action:
• Click the dropdown and select “Create target group”
• This will open a new page to create the frontend target group
• This gets configured in the next step

Note: You can add HTTPS (port 443) later if you have an SSL certificate. For now, HTTP on port 80 works fine.

🎯 Understanding Target Groups

A Target Group is a collection of targets (ECS tasks) that receive traffic from the load balancer. Think of it as a group of servers that handle the same type of requests.

Key Points:

  • Frontend Target Group: Contains frontend ECS tasks (port 80)
  • Backend Target Group: Contains backend ECS tasks (port 3001)
  • Health Checks: ALB continuously checks if targets are healthy
  • Auto Registration: ECS tasks automatically register with target groups when services are created

Step 2: Create Frontend Target Group

1

Target Group Configuration

When you clicked “Create target group” in the ALB creation, you should now be on the target group creation page. If not, go to EC2 → Target Groups → Create target group.

Basic configuration:
Target type: Select “IP addresses”
• This is required for Fargate tasks (they don’t use EC2 instances)
• Instance type is for EC2-based ECS tasks

Target group name: frontend-tg

Protocol: HTTP
Port: 80 (matches frontend container port)

VPC: Select your VPC (ecs-production-vpc)

2

Health Check Configuration

Configure how ALB checks if targets are healthy:

Health checks:
Health check protocol: HTTP
Health check path: /
• ALB will check this path to determine if the target is healthy
• Frontend should return 200 OK for the root path

Advanced health check settings: (Click to expand)
Healthy threshold: 2 consecutive successful checks
Unhealthy threshold: 2 consecutive failed checks
Timeout: 5 seconds (max time to wait for response)
Interval: 30 seconds (time between checks)
Success codes: 200 (default)

How Health Checks Work: ALB sends HTTP GET requests to the health check path every 30 seconds. If it gets 2 successful responses in a row, the target is marked healthy. If it gets 2 failures in a row, the target is marked unhealthy and removed from rotation.
3

Register Targets (Skip for Now)

  • Click “Next” button
  • On the “Register targets” page, you’ll see an empty list
  • Do NOT register targets manually – ECS will automatically register tasks when you create the service
  • Click “Next” again to skip this step
Why Skip? When you create ECS services in Step 11, they will automatically register their tasks with the target groups. Manual registration is not needed for ECS services.
4

Review and Create

  • Review your target group configuration
  • Click “Create target group” button
  • You’ll see a success message

Step 3: Complete ALB Creation

1

Return to ALB Creation

  • Go back to ALB creation page
  • Under “Default action”, select frontend-tg
  • Click “Create load balancer”
  • Wait 2-3 minutes for creation

Step 4: Create Backend Target Group

1

Create Target Group

  • Go to EC2 → “Target Groups” → “Create target group”
  • Target type: IP addresses
  • Name: backend-tg
  • Protocol: HTTP
  • Port: 3001
  • VPC: Select your VPC
  • Health check path: /api/health
2

Click “Next” → “Create target group”

Step 5: Configure Listener Rules

1

Configure Routing Rules

  • Go to “Load Balancers” → Click ecs-alb
  • Click “Listeners” tab → Click listener (port 80)
  • Click “Manage rules”
2

Add API Route

  • Click “Add rule”
  • IF (conditions): Click “Add condition” → Select “Path” → Enter /api/*
  • THEN (actions): Click “Add action” → “Forward to” → Select backend-tg
  • Priority: 100
3

Click “Save changes”

Verify default rule forwards to frontend-tg
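The same path-based rule can be added with boto3 if you are scripting the setup. This is an illustrative sketch only; the listener and target group ARNs below are placeholders you would copy from the console (or from describe_listeners / describe_target_groups).

```python
import boto3

elbv2 = boto3.client("elbv2")

# Placeholder ARNs -- substitute the real ones from your account.
listener_arn = "arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/ecs-alb/..."
backend_tg_arn = "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/backend-tg/..."

# Forward /api/* to the backend target group; everything else falls through
# to the listener's default action (frontend-tg).
elbv2.create_rule(
    ListenerArn=listener_arn,
    Priority=100,
    Conditions=[{"Field": "path-pattern", "Values": ["/api/*"]}],
    Actions=[{"Type": "forward", "TargetGroupArn": backend_tg_arn}],
)
```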

4

Document ALB DNS Name

In ALB details, under “Basic configuration”, copy the DNS name:

e.g., ecs-alb-123456789.us-east-1.elb.amazonaws.com

You’ll need this for Parameter Store.

✅ ALB Setup Complete!

What You’ve Created:

  • ✅ Application Load Balancer (ecs-alb) in public subnets
  • ✅ Frontend Target Group (frontend-tg) for port 80
  • ✅ Backend Target Group (backend-tg) for port 3001
  • ✅ Listener on port 80 with routing rules
  • ✅ Path-based routing: /api/* → backend, all other paths → frontend

How Traffic Flows:

  1. User requests http://ALB-DNS-NAME/ → Routes to frontend-tg → Frontend ECS tasks
  2. User requests http://ALB-DNS-NAME/api/health → Routes to backend-tg → Backend ECS tasks
  3. ALB performs health checks on all targets
  4. Only healthy targets receive traffic
  5. If a target becomes unhealthy, ALB stops sending traffic to it

Next Steps: Create CloudWatch Log Groups (Step 8), then Task Definitions (Step 9), then connect ECS Services to these target groups (Step 11).

8

CloudWatch Log Groups

Centralized Logging

📖 Understanding CloudWatch Logs

Amazon CloudWatch Logs is a service for monitoring, storing, and accessing log files from AWS resources and applications. Here’s what it offers:

  • Centralized Logging: All container logs in one place
  • Search & Filter: Easy log searching and filtering
  • Retention Policies: Automatically delete old logs
  • Integration: ECS automatically sends container logs

You need to see what’s happening in your containers. CloudWatch Logs provides visibility into application behavior, errors, and performance – essential for debugging.

⚠️ CRITICAL: These log groups MUST exist before your ECS tasks launch. If a task starts and its log group is missing, the task will fail, so create them now, ahead of the task definitions in Step 9.
1

Navigate to CloudWatch

  • Search for “CloudWatch” in AWS Console
  • Click “Log groups” in left sidebar
  • Click “Create log group”
2

Create Frontend Log Group

  • Log group name: /ecs/frontend
  • Log retention: Select retention period (e.g., 30 days)
  • Click “Create”
3

Create Backend Log Group

  • Click “Create log group” again
  • Log group name: /ecs/backend
  • Log retention: 30 days
  • Click “Create”
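For readers scripting the setup, the equivalent of these two steps with boto3 (assuming credentials and region are already configured) is roughly:

```python
import boto3

logs = boto3.client("logs")

# Create both log groups and set a 30-day retention policy on each.
# create_log_group raises ResourceAlreadyExistsException if the group exists.
for name in ("/ecs/frontend", "/ecs/backend"):
    logs.create_log_group(logGroupName=name)
    logs.put_retention_policy(logGroupName=name, retentionInDays=30)
```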
9

ECS Task Definitions

Application Blueprints

📖 Understanding Task Definitions

An ECS task definition is a blueprint for your application. It describes:

  • Container Images: Which Docker images to use
  • Resources: CPU and memory allocation
  • Ports: Which ports to expose
  • Environment Variables: Configuration values
  • Secrets: Secure values from Parameter Store
  • Logging: Where to send container logs
  • IAM Roles: Permissions for the task

Task definitions tell ECS exactly how to run your containers. You create the definition once, and ECS uses it to launch tasks – think of it as a recipe that ECS follows every time.

Create Frontend Task Definition:

1

Navigate to Task Definitions

  • Go to ECS Console → “Task definitions”
  • Click “Create new task definition”
2

Basic Configuration

  • Task definition family: frontend-task
  • Launch type: Fargate
  • Operating system: Linux/X86_64
3

Task Size

  • Task CPU: 0.5 vCPU (512)
  • Task memory: 1 GB (1024)
4

IAM Roles

  • Task role: frontendTaskRole
  • Task execution role: ecsTaskExecutionRole
5

Add Container

Scroll down and click “Add container”:

  • Container name: frontend
  • Image URI: Your ECR URI + tag (e.g., 123456789012.dkr.ecr.us-east-1.amazonaws.com/frontend-repo:latest)
  • Essential container: Yes (checked)
6

Port Mappings

  • Click “Add port mapping”
  • Container port: 80
  • Protocol: TCP
7

Environment Variables – Secrets

  • Under “Environment variables” → “Secrets”
  • Click “Add secret”
  • Key: NEXT_PUBLIC_API_URL
  • Value from: Parameter Store → /ecs/frontend/API_URL
Note: You’ll create this parameter in Step 10. For now, you can use a placeholder or create it first.
8

Logging Configuration

  • Under “Logging”:
  • Log driver: awslogs
  • Log options:
    • awslogs-group: /ecs/frontend
    • awslogs-region: Your region (e.g., us-east-1)
    • awslogs-stream-prefix: ecs
9

Click “Add” → Scroll to bottom → Click “Create”
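The whole frontend task definition can also be registered in one API call. The sketch below mirrors the console steps above; the account ID, region, and ECR repository URI are placeholders, and the role names assume the IAM roles created earlier in this guide.

```python
import boto3

ecs = boto3.client("ecs")

account = "123456789012"   # placeholder account ID
region = "us-east-1"       # placeholder region

ecs.register_task_definition(
    family="frontend-task",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",              # required for Fargate
    cpu="512",                          # 0.5 vCPU
    memory="1024",                      # 1 GB
    executionRoleArn=f"arn:aws:iam::{account}:role/ecsTaskExecutionRole",
    taskRoleArn=f"arn:aws:iam::{account}:role/frontendTaskRole",
    containerDefinitions=[
        {
            "name": "frontend",
            "image": f"{account}.dkr.ecr.{region}.amazonaws.com/frontend-repo:latest",
            "essential": True,
            "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
            "secrets": [
                {"name": "NEXT_PUBLIC_API_URL", "valueFrom": "/ecs/frontend/API_URL"}
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/frontend",
                    "awslogs-region": region,
                    "awslogs-stream-prefix": "ecs",
                },
            },
        }
    ],
)
```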

Create Backend Task Definition:

1

Create New Task Definition

  • Click “Create new task definition”
  • Family: backend-task
  • Launch type: Fargate
  • Task CPU: 1 vCPU (1024)
  • Task memory: 2 GB (2048)
  • Task role: backendTaskRole
  • Task execution role: ecsTaskExecutionRole
2

Add Container

  • Click “Add container”
  • Container name: backend-app
  • Image URI: Backend ECR URI
3

Port & Environment

  • Container port: 3001
  • Under “Environment variables” → “Environment”:
  • Add: PORT=3001
4

Add Secrets from Parameter Store

Under “Secrets”, add these key → parameter mappings (click “Add secret” for each):

  • DB_HOST → /ecs/backend/DB_HOST
  • DB_PORT → /ecs/backend/DB_PORT
  • DB_NAME → /ecs/backend/DB_NAME
  • DB_USER → /ecs/backend/DB_USER (SecureString)
  • DB_PASSWORD → /ecs/backend/DB_PASSWORD (SecureString)
  • FRONTEND_URL → /ecs/backend/FRONTEND_URL
5

Logging

  • Log driver: awslogs
  • awslogs-group: /ecs/backend
  • awslogs-region: Your region
  • awslogs-stream-prefix: ecs
6

Click “Add” → “Create”
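The backend task definition differs from the frontend mainly in its size, role, port, and the environment/secrets block. A hedged fragment of the backend containerDefinitions entry (the ECR URI and region are placeholders) might look like this; the surrounding register_task_definition call otherwise mirrors the frontend example above, with family backend-task, 1 vCPU / 2 GB, and backendTaskRole.

```python
# Illustrative containerDefinitions entry for the backend container.
backend_container = {
    "name": "backend-app",
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/backend-repo:latest",  # placeholder URI
    "essential": True,
    "portMappings": [{"containerPort": 3001, "protocol": "tcp"}],
    "environment": [{"name": "PORT", "value": "3001"}],
    "secrets": [
        {"name": "DB_HOST", "valueFrom": "/ecs/backend/DB_HOST"},
        {"name": "DB_PORT", "valueFrom": "/ecs/backend/DB_PORT"},
        {"name": "DB_NAME", "valueFrom": "/ecs/backend/DB_NAME"},
        {"name": "DB_USER", "valueFrom": "/ecs/backend/DB_USER"},
        {"name": "DB_PASSWORD", "valueFrom": "/ecs/backend/DB_PASSWORD"},
        {"name": "FRONTEND_URL", "valueFrom": "/ecs/backend/FRONTEND_URL"},
    ],
    "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": "/ecs/backend",
            "awslogs-region": "us-east-1",
            "awslogs-stream-prefix": "ecs",
        },
    },
}
```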

10

Systems Manager Parameter Store

Secure Configuration Storage
Parameter Store parameters
Parameter Store (runtime configuration & secrets)

📖 Understanding Parameter Store

AWS Systems Manager Parameter Store provides secure, hierarchical storage for configuration data and secrets. Here’s what it offers:

  • Secure Storage: Encrypted storage for sensitive data
  • Types: String (plain text) or SecureString (encrypted)
  • Integration: ECS automatically injects values as environment variables
  • Versioning: Track changes to parameters
  • Access Control: IAM-based permissions

Store database credentials, API URLs, and other configuration securely here. ECS tasks can access these values without hardcoding secrets in your code – much safer than putting passwords in environment variables.
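For context, the application never talks to Parameter Store at runtime: ECS injects the values before the process starts, and the code simply reads environment variables. A minimal, illustrative sketch (your actual backend may be written in another language, but the pattern is the same):

```python
import os

# ECS injects each configured secret as an environment variable,
# so application code reads plain env vars and never needs SSM access itself.
db_config = {
    "host": os.environ["DB_HOST"],
    "port": int(os.environ["DB_PORT"]),
    "dbname": os.environ["DB_NAME"],
    "user": os.environ["DB_USER"],
    "password": os.environ["DB_PASSWORD"],
}
```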

Create Backend Parameters:

1

Navigate to Parameter Store

  • Search for “Systems Manager” in AWS Console
  • Click “Parameter Store” in left sidebar
  • Click “Create parameter”
2

Create DB_HOST Parameter

  • Name: /ecs/backend/DB_HOST
  • Description: RDS database endpoint
  • Tier: Standard
  • Type: String
  • Value: Your RDS endpoint (from Step 5)

Click “Create parameter”

3

Create Remaining Backend Parameters

Create these parameters (one at a time):

  • /ecs/backend/DB_PORT (String): 5432
  • /ecs/backend/DB_NAME (String): fileanalyzer
  • /ecs/backend/DB_USER (SecureString): your database username
  • /ecs/backend/DB_PASSWORD (SecureString): your database password
  • /ecs/backend/FRONTEND_URL (String): ALB DNS name (e.g., http://ecs-alb-xxx.elb.amazonaws.com)

Create Frontend Parameters:

1

Create API_URL Parameter

  • Click “Create parameter”
  • Name: /ecs/frontend/API_URL
  • Type: String
  • Value: ALB DNS name + /api (e.g., http://ecs-alb-xxx.elb.amazonaws.com/api)

Click “Create parameter”
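Creating parameters is also easy to script. A hedged boto3 sketch covering a few of the parameters above (the password and ALB DNS name are placeholders; SecureString values are encrypted with the default aws/ssm KMS key unless you specify another):

```python
import boto3

ssm = boto3.client("ssm")

# Plain-text configuration values from this guide.
ssm.put_parameter(Name="/ecs/backend/DB_PORT", Type="String", Value="5432")
ssm.put_parameter(Name="/ecs/backend/DB_NAME", Type="String", Value="fileanalyzer")

# Sensitive values go in as SecureString (encrypted at rest).
ssm.put_parameter(Name="/ecs/backend/DB_PASSWORD", Type="SecureString",
                  Value="your-database-password")   # placeholder

# Frontend API URL: ALB DNS name plus /api (placeholder DNS name).
ssm.put_parameter(Name="/ecs/frontend/API_URL", Type="String",
                  Value="http://ecs-alb-xxx.us-east-1.elb.amazonaws.com/api")
```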

11

ECS Services

Running Applications
ECS services (frontend and backend)
ECS Services (frontend-service / backend-service)

📖 Understanding ECS Services

An ECS service runs and maintains a specified number of instances of a task definition simultaneously. Here’s what services handle:

  • Desired Count: Maintains specified number of running tasks
  • Load Balancing: Integrates with ALB to distribute traffic
  • Auto Scaling: Can scale based on demand
  • Rolling Updates: Zero-downtime deployments
  • Health Monitoring: Restarts unhealthy tasks

Services ensure your application is always running. If a task crashes, the service automatically starts a new one. Services handle deployments and scaling – you don’t have to babysit your containers.

Create Frontend Service:

1

Navigate to Cluster

  • Go to ECS → Clusters → ecs-production-cluster
  • Click “Services” tab
  • Click “Create”
2

Configure Service

  • Launch type: Fargate
  • Task definition: frontend-task (latest revision)
  • Service name: frontend-service
  • Desired tasks: 2
3

Networking

  • VPC: Select your VPC
  • Subnets: Select both PRIVATE subnets
  • Security groups: Select ECS-SG
  • Auto-assign public IP: DISABLED
4

Load Balancing

  • Select “Application Load Balancer”
  • Load balancer name: Select ecs-alb
  • Click “Add to load balancer”
  • Production listener port: 80:HTTP
  • Target group name: Select frontend-tg
5

Deployment Configuration

  • Deployment type: Rolling update
  • Minimum healthy percent: 100
  • Maximum percent: 200
6

Deployment Circuit Breaker

  • Check “Enable rollback on failure” so a failed deployment automatically rolls back to the last working revision
7

Click “Create” and wait for service to stabilize
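For completeness, the same frontend service can be created with one boto3 call. The subnet IDs, security group ID, and target group ARN below are placeholders; the rest mirrors the console settings above, including the deployment circuit breaker with rollback.

```python
import boto3

ecs = boto3.client("ecs")

ecs.create_service(
    cluster="ecs-production-cluster",
    serviceName="frontend-service",
    taskDefinition="frontend-task",     # latest ACTIVE revision by default
    desiredCount=2,
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0aaa...", "subnet-0bbb..."],   # private subnets (placeholders)
            "securityGroups": ["sg-0ccc..."],                  # ECS-SG (placeholder)
            "assignPublicIp": "DISABLED",
        }
    },
    loadBalancers=[{
        "targetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/frontend-tg/...",
        "containerName": "frontend",
        "containerPort": 80,
    }],
    deploymentConfiguration={
        "minimumHealthyPercent": 100,
        "maximumPercent": 200,
        "deploymentCircuitBreaker": {"enable": True, "rollback": True},
    },
)
```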

Create Backend Service:

1

Create Service

Same process as frontend, but with these differences:

  • Task definition: backend-task
  • Service name: backend-service
  • Target group: backend-tg
2

Click “Create” and wait for service to stabilize

✅ Infrastructure Setup Complete!

All resources are now created. Verify deployment:

  1. Check ECS Services show “Running” status
  2. Check Target Groups show “healthy” targets
  3. Test ALB DNS name in browser
  4. Check CloudWatch Logs for application output
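If you’d rather verify from a script than a browser, a small Python check against the ALB (the DNS name is a placeholder; use the one you copied earlier) could look like:

```python
import urllib.request

# Placeholder -- use the DNS name from the ALB details page.
alb = "http://ecs-alb-123456789.us-east-1.elb.amazonaws.com"

for path in ("/", "/api/health"):
    with urllib.request.urlopen(alb + path, timeout=10) as resp:
        print(path, resp.status)   # expect 200 for both once targets are healthy
```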

✅ Infrastructure Setup Checklist

Complete these resources before proceeding to CI/CD:

  • VPC with public/private subnets, NAT Gateways, and route tables
  • Security groups (ALB-SG, ECS-SG, RDS-SG)
  • IAM roles (ecsTaskExecutionRole, frontendTaskRole, backendTaskRole)
  • RDS PostgreSQL instance in private subnets
  • ECR repositories with frontend and backend images pushed
  • ECS cluster (ecs-production-cluster)
  • ALB (ecs-alb) with frontend-tg and backend-tg target groups
  • CloudWatch log groups (/ecs/frontend, /ecs/backend)
  • Task definitions (frontend-task, backend-task)
  • Parameter Store parameters (/ecs/backend/*, /ecs/frontend/API_URL)
  • ECS services (frontend-service, backend-service) running and healthy

❓ Frequently Asked Questions & Scenarios

This section answers common questions in a simple way so both technical and non-technical people can understand the overall concepts.

Great for: Reviewing concepts, interviewing, or explaining the architecture to your team.

30 Common Questions (with Simple Explanations)

Q1–Q10

Core Concepts & Big Picture

  1. Q1: What problem does this whole ECS setup solve?
    A: It gives you a repeatable, reliable way to run your app in the cloud. Instead of running code on one fragile server, you package your app in containers, run them on ECS, and put an ALB in front so users always hit a healthy instance.
  2. Q2: What is a container in simple words?
    A: A container is like a small, portable box that holds your app and everything it needs to run (libraries, runtime). If it runs on your laptop, it will run the same way in AWS.
  3. Q3: How is ECS different from just using EC2 directly?
    A: With EC2 you manage servers yourself (patching, scaling, scheduling). With ECS, you tell AWS “run these containers like this” and ECS handles placement, health checks, and restarts for you.
  4. Q4: Why do we need an ALB in front of ECS?
    A: ECS tasks (containers) come and go. The ALB is the stable entry point that knows which tasks are healthy and where to send each request. It also supports path-based routing, HTTPS, and health checks.
  5. Q5: Why do we use Fargate instead of EC2 launch type?
    A: Fargate is “serverless for containers”. You don’t manage EC2 instances; you only define CPU/memory per task. AWS manages the underlying servers, which reduces ops work.
  6. Q6: What is the difference between an ECS Task Definition and an ECS Service?
    A: The task definition is the blueprint (what to run, image, CPU, env vars). The service is the “manager” that ensures a certain number of those tasks are running and connected to the load balancer.
  7. Q7: Why do we keep the database (RDS) in private subnets?
    A: So it’s not exposed to the internet. Only ECS tasks inside the VPC can talk to it, which is a big security win.
  8. Q8: Why do we split frontend and backend into separate services?
    A: They scale differently and have different behavior. Splitting them lets you scale, deploy, and debug each independently.
  9. Q9: What is the role of ECR in this architecture?
    A: ECR is a private Docker image registry. It stores your built images so ECS can pull them whenever it needs to start new tasks.
  10. Q10: Why do we use CI/CD instead of manual deployments?
    A: CI/CD turns deployment into a repeatable, automated process. Every push builds, tests, and deploys in the same way, reducing human error and speeding up releases.
Q11–Q20

Networking, Security & Reliability

  1. Q11: Why do we have both public and private subnets?
    A: Public subnets host internet-facing resources (like the ALB). Private subnets host sensitive workloads (ECS tasks, RDS) that should not be directly reachable from the internet.
  2. Q12: How do ECS tasks access the internet if they are in private subnets?
    A: They go out through a NAT Gateway. This lets tasks download packages or talk to external APIs without being directly reachable from the internet.
  3. Q13: What are Security Groups in this setup?
    A: Security Groups are virtual firewalls. For example, the ALB SG allows traffic from the internet on HTTP/HTTPS, and the ECS SG only allows traffic from the ALB SG.
  4. Q14: How do we ensure only ECS tasks can talk to the database?
    A: The RDS Security Group only allows connections from the ECS Security Group on the database port. No other source is allowed.
  5. Q15: What happens if one ECS task crashes?
    A: The ECS service detects that the task is unhealthy and starts a new one. The ALB stops sending traffic to the unhealthy task automatically.
  6. Q16: How do health checks protect users?
    A: The ALB pings a health check endpoint (like / or /api/health). Only tasks that respond correctly are considered healthy and receive user traffic.
  7. Q17: What is the difference between task execution role and task role?
    A: The execution role is used by ECS to pull images and write logs. The task role is used by your app code to call AWS services (like S3, Parameter Store).
  8. Q18: How do we keep secrets (like DB password) out of the code?
    A: We store them in Parameter Store as SecureString values and let ECS inject them into the task as environment variables or secrets. The code just reads env vars.
  9. Q19: How does scaling work in this design?
    A: The ECS service can be configured to scale the number of running tasks up or down based on metrics (CPU, requests). The ALB automatically balances traffic across all tasks.
  10. Q20: What is the main single point of failure to watch out for?
    A: The infrastructure itself is multi-AZ and managed, so the biggest risks are configuration-level: misconfigured health checks, database availability, and limits like connection pools. Bad configuration can still cause downtime even when every AWS service is healthy.
Q21–Q30

CI/CD, Task Definitions & Operations

  1. Q21: What does the GitHub Actions workflow actually do step by step?
    A: It checks out code, configures AWS credentials, logs in to ECR, builds a Docker image, pushes it to ECR, updates the task definition with the new image, and asks ECS to deploy it.
  2. Q22: Why do we tag images with the Git SHA?
    A: The Git SHA is a unique identifier for each commit. Tagging images with it makes deployments traceable and makes rollbacks easier.
  3. Q23: How does ECS know which image to run after a deployment?
    A: The task definition is updated with the new image URI (including tag). The ECS service always runs the latest revision of the task definition.
  4. Q24: What is a “task definition revision”?
    A: Every time you change the task definition (for example, new image or env var), ECS creates a new numbered revision. Services usually point to the latest revision.
  5. Q25: How do CloudWatch Logs fit into this?
    A: Each container sends its stdout/stderr logs to a log group (like /ecs/frontend). You can search, filter, and debug issues centrally in CloudWatch.
  6. Q26: What should I look at first when the app is “not working”?
    A: Check ECS service events (for deployment errors), ALB target health, and CloudWatch logs for the relevant task. This usually tells you if it’s a code error, health check issue, or config problem.
  7. Q27: How do I roll back to a previous version?
    A: You can either re-deploy an older image tag via CI/CD, or manually update the service to use an older task definition revision that points to the previous image (see the sketch just after this list).
  8. Q28: Why do we separate environment variables and secrets?
    A: Regular env vars hold non-sensitive config (like URLs); secrets are sensitive values stored in a secure service (Parameter Store) and injected securely. This reduces risk if logs or screenshots leak.
  9. Q29: How does this design support future growth?
    A: You can add more services (microservices), scale tasks independently, add caching layers, and extend CI/CD, all without redesigning the whole system. The building blocks are already in place.
  10. Q30: How would you explain this architecture to a non‑technical stakeholder?
    A: We have a front door (ALB) that safely receives traffic, smart workers (ECS tasks) that handle requests, a secure vault (RDS + Parameter Store) for data and secrets, and an assembly line (CI/CD) that ships new versions automatically and safely.

10 Practical Scenarios

S1–S5

Operational Troubleshooting

  1. S1: New deployment, ALB shows 502/503 errors.
    How to think: Check ALB target health → confirm health check path exists and returns 200 → verify container listens on the right port → check CloudWatch logs for startup errors.
  2. S2: Tasks keep going to Stopped state right after starting.
    How to think: Look at the “stopped reason” in ECS → inspect container logs → common issues: wrong env vars, DB not reachable, port conflicts, or app crashes on boot.
  3. S3: Database CPU is high and app feels slow.
    How to think: Check connection counts, slow queries, and indexing → consider connection pooling, query optimization, and potentially scaling RDS (larger instance or read replicas).
  4. S4: Suddenly many 5xx errors during peak traffic.
    How to think: Check ECS service metrics (CPU/memory) → if tasks are maxed out, increase desired count or add auto scaling → ensure ALB health checks and timeouts are set correctly.
  5. S5: A secret (like DB password) changed and app can’t connect.
    How to think: Update the value in Parameter Store → confirm the ECS task definition still references the correct parameter name → redeploy tasks so they pick up the new value.
S6–S10

Design & Architecture Decisions

  1. S6: You need to add a new microservice (for example, reporting API).
    Approach: Create a new ECR repo and ECS task definition, a new ECS service, and (optionally) a new path rule on the ALB (like /reports/*). Wire it into CI/CD the same way as backend.
  2. S7: You want a staging environment separate from production.
    Approach: Duplicate the stack in another VPC or another set of subnets with separate ECS cluster, RDS, and ALB. Use different DNS and separate CI/CD workflows pointing to staging resources.
  3. S8: Frontend needs to call a third‑party API from the backend only.
    Approach: Keep the API key in Parameter Store, inject into backend task, and call the third‑party service from backend code. Frontend never sees the secret.
  4. S9: You want zero‑downtime database migrations.
    Approach: Use migration tools (Prisma, Flyway, Liquibase) in a separate step or job before/with deployment, design migrations to be backward compatible, and roll out app changes after DB schema supports both versions.
  5. S10: You want to explain cloud costs to a manager.
    Approach: Break it down: “We pay for compute (Fargate tasks), storage (RDS & ECR), traffic and security (ALB & NAT), and observability (CloudWatch). We can control each of these with scaling, instance sizes, and retention policies.”
