Terraform, ECS Fargate & RDS: Modernize Your AWS Infra
Hey guys! Today, we're diving deep into a super important topic for anyone running applications on AWS: moving your infrastructure to Infrastructure as Code (IaC) using Terraform, and leveraging powerful managed services like ECS Fargate and RDS. If you're still manually clicking around in the AWS console or managing your own EC2 instances, you're in for a treat. We're going to ditch the manual hassle and embrace automation, scalability, and reliability. This isn't just about upgrading; it's about building a future-proof foundation for your applications. Let's get this party started!
Why Bother Migrating? The Pain of Manual Infrastructure
So, why are we even talking about this? Let's be real, managing infrastructure manually, especially with EC2 instances, is a pain. Think about it: you're spending ages setting up servers, configuring them, making sure they're patched, and then crossing your fingers that everything stays up. The current manual EC2 setup often leads to a bunch of headaches we're all too familiar with. First off, manual server management is a time sink. Every time you need to scale, update, or fix something, it's a hands-on process. And when it comes to automatic scaling, forget about it. Your app might get slammed with traffic, and your EC2 setup won't magically add more servers to handle the load. This often leads to a single point of failure. If that one server goes down, your whole application is toast. Plus, you're stuck with the joy of OS maintenance: patching, updates, and security fixes are all on you. High availability and backups? Yep, those are manual too, which is prone to human error and downtime. And the biggest kicker? Your infrastructure isn't versioned. You can't easily go back to a previous state, track changes, or collaborate effectively because there's no code to review.
The Awesome Benefits of IaC with ECS Fargate and RDS
Now, let's talk about the shiny new world we're moving into. By adopting Infrastructure as Code (IaC) with Terraform, combined with ECS Fargate and RDS, we unlock a treasure trove of benefits. First and foremost, you get infrastructure as code. This means your entire setup is versioned, auditable, and perfectly reproducible. Imagine deploying the exact same environment across development, staging, and production with a single command! Automatic scalability is a game-changer, especially with ECS Fargate. It handles scaling your containers up and down based on demand, so you don't have to. Say goodbye to downtime worries because high availability is built-in, often with multi-AZ deployments. Your database is now managed with RDS, meaning automatic backups, patching, and failover. The best part? With ECS Fargate, you're dealing with serverless containers. No more server management! You pay only for what you use, leading to optimized costs. And finally, deploying new versions of your application becomes a breeze with zero-downtime deployments. It's a complete paradigm shift, folks!
Our Proposed Architecture: A Bird's-Eye View
Let's visualize the slick new setup we're aiming for. At the top, we have the Internet, hitting our Route 53 for DNS resolution. Optionally, we can add CloudFront for content delivery network (CDN) magic to speed things up. Then, the traffic flows into an Application Load Balancer (ALB), which intelligently distributes requests. From the ALB, traffic is directed to our ECS Services running on Fargate. We'll have separate services, perhaps one for the API (astro/api) and another for the web frontend (astro/web), both running as serverless containers. For asynchronous tasks, we'll have a celery-worker service. These containers can leverage ElastiCache (Redis) for caching and connect to our managed PostgreSQL database running on RDS. The entire network is defined within a VPC, with a clear subnet strategy: public subnets for the ALB, private subnets for the ECS tasks, and dedicated database subnets for RDS. This tiered approach ensures security and proper traffic flow. We're talking about a well-defined VPC CIDR block (10.0.0.0/16), split across multiple Availability Zones (AZs) for high availability. This architecture is designed for scalability, resilience, and ease of management, all driven by code.
┌──────────────────────────────────────────────────────────────┐
│                           Internet                           │
└─────────────────────────────┬────────────────────────────────┘
                              │
                      ┌───────▼──────┐
                      │   Route 53   │ (DNS)
                      └───────┬──────┘
                              │
                      ┌───────▼──────┐
                      │  CloudFront  │ (CDN - optional)
                      └───────┬──────┘
                              │
                  ┌───────────▼───────────┐
                  │   Application Load    │
                  │    Balancer (ALB)     │
                  └───────────┬───────────┘
                              │
                 ┌────────────┴────────────┐
                 │                         │
            ┌────▼─────┐              ┌────▼────┐
            │   ECS    │              │   ECS   │
            │ Service  │              │ Service │
            │   API    │              │   Web   │
            │(Fargate) │              │(Fargate)│
            └────┬─────┘              └────┬────┘
                 │                         │
                 │     ┌───────────┐       │
                 ├────►│   Redis   │◄──────┘
                 │     │ElastiCache│
                 │     └───────────┘
                 │
                 │     ┌───────────┐
                 └────►│    RDS    │
                       │PostgreSQL │
                       │ Multi-AZ  │
                       └───────────┘

VPC: 10.0.0.0/16
├── Public Subnets (2 AZs)
│   ├── 10.0.1.0/24   (us-east-1a)
│   └── 10.0.2.0/24   (us-east-1b)
├── Private Subnets (2 AZs)
│   ├── 10.0.10.0/24  (us-east-1a)
│   └── 10.0.20.0/24  (us-east-1b)
└── Database Subnets (2 AZs)
    ├── 10.0.100.0/24 (us-east-1a)
    └── 10.0.200.0/24 (us-east-1b)
Terraform Project Structure: Organized for Success
To manage this complexity, a well-defined Terraform project structure is crucial. We're adopting a modular approach, separating concerns into reusable components. At the top level, we have environments/ where each subdirectory (dev, staging, production) holds the specific configuration for that environment, including main.tf, environment-specific variables (terraform.tfvars), and backend configuration (backend.tf) for storing Terraform state remotely (usually in an S3 bucket). This keeps our configurations DRY (Don't Repeat Yourself) and manageable.
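To make that concrete, here's a minimal sketch of what a dev backend.tf might contain, assuming S3 for state storage and DynamoDB for state locking (the bucket and table names below are placeholders, not values from this project):

terraform {
  backend "s3" {
    bucket         = "astro-terraform-state"   # placeholder; S3 bucket names must be globally unique
    key            = "dev/terraform.tfstate"   # one state file per environment
    region         = "us-east-1"
    dynamodb_table = "astro-terraform-locks"   # placeholder lock table (hash key "LockID")
    encrypt        = true
  }
}

With one of these in each environment directory, terraform init in dev, staging, or production automatically picks up the right remote state.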
Beneath environments/, we have the modules/ directory. This is where the magic of reusability happens. We've broken down our infrastructure into logical modules:
- networking/: Handles everything VPC-related: the VPC itself, subnets (public, private, database), Internet Gateway, NAT Gateways, route tables, and security groups.
- ecs/: Manages the ECS cluster, Fargate services, task definitions, and auto-scaling configurations. This is where our containers live.
- rds/: Provisions and configures our managed PostgreSQL database instances, including Multi-AZ deployments, backups, and storage settings.
- elasticache/: Sets up our Redis cache clusters for faster data retrieval.
- alb/: Configures the Application Load Balancer, including listeners, rules, and target groups for routing traffic.
- ecr/: Creates and manages Amazon Elastic Container Registry (ECR) repositories for storing our Docker images.
- secrets/: Integrates with AWS Secrets Manager to securely store sensitive information like database credentials and API keys.
- cloudwatch/: Sets up logging, metrics, and alarms for monitoring our infrastructure's health and performance.
Finally, we have a scripts/ directory for handy utility scripts like deploy.sh, destroy.sh, and plan.sh to streamline common Terraform operations. This organized structure not only makes our Terraform code maintainable and scalable but also promotes collaboration among team members.
terraform/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   ├── staging/
│   │   ├── main.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   └── production/
│       ├── main.tf
│       ├── terraform.tfvars
│       └── backend.tf
├── modules/
│   ├── networking/
│   │   ├── main.tf          # VPC, Subnets, IGW, NAT
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── ecs/
│   │   ├── main.tf          # ECS Cluster, Services, Tasks
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── rds/
│   │   ├── main.tf          # RDS PostgreSQL
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── elasticache/
│   │   ├── main.tf          # Redis cluster
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── alb/
│   │   ├── main.tf          # Application Load Balancer
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── ecr/
│   │   ├── main.tf          # Container Registry
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── secrets/
│   │   ├── main.tf          # Secrets Manager
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── cloudwatch/
│       ├── main.tf          # Logs, Metrics, Alarms
│       ├── variables.tf
│       └── outputs.tf
└── scripts/
    ├── deploy.sh
    ├── destroy.sh
    └── plan.sh
Key Infrastructure Components: The Building Blocks
Let's break down the essential pieces of our infrastructure, guys. Each component plays a vital role in delivering a scalable, reliable, and secure application. We're using Terraform modules to define and manage these resources, ensuring consistency and reusability across different environments.
1. Networking (VPC): The Foundation
Our entire AWS infrastructure will reside within a Virtual Private Cloud (VPC). This provides a logically isolated network space. We'll define a clear IP addressing scheme, like 10.0.0.0/16, and divide it into specific subnets. We need public subnets for resources that need direct internet access, such as our Application Load Balancer (ALB). Then, we have private subnets where our ECS Fargate tasks will run β these shouldn't be directly accessible from the internet. Finally, database subnets are designated for our RDS instance, ensuring it's isolated and secure. We'll configure an Internet Gateway for the VPC, NAT Gateways to allow instances in private subnets to access the internet for outbound traffic (like pulling dependencies), and robust Route Tables to control traffic flow. Security Groups will act as virtual firewalls, controlling inbound and outbound traffic to our resources. The networking module handles all of this, making it easy to spin up a secure and well-structured network.
module "vpc" {
source = "./modules/networking"
project_name = "astro-natal-chart"
environment = var.environment
vpc_cidr = "10.0.0.0/16"
availability_zones = ["us-east-1a", "us-east-1b"]
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnet_cidrs = ["10.0.10.0/24", "10.0.20.0/24"]
database_subnet_cidrs = ["10.0.100.0/24", "10.0.200.0/24"]
enable_nat_gateway = true
single_nat_gateway = var.environment == "dev" ? true : false
}
Resources created:
- VPC with CIDR 10.0.0.0/16
- 2 Public Subnets (for ALB)
- 2 Private Subnets (for ECS tasks)
- 2 Database Subnets (for RDS)
- Internet Gateway
- NAT Gateway(s)
- Route Tables
- Security Groups
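Those security groups deserve a closer look, since they enforce the tiered traffic flow described above. As a rough sketch of what the networking module might define internally (the resource names and variables here are assumptions about the module's internals, not confirmed code), the RDS security group would only accept PostgreSQL traffic from the ECS tasks' security group:

# Sketch: the database tier is reachable only from the ECS tasks' security group.
# aws_security_group.ecs_tasks and var.project_name are assumed module internals.
resource "aws_security_group" "rds" {
  name_prefix = "${var.project_name}-rds-"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "PostgreSQL from ECS tasks only"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.ecs_tasks.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Chaining security groups by reference like this (ALB → ECS tasks → RDS) means you never have to whitelist IP ranges between tiers.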
2. ECR (Container Registry): Your Image Hub
Before we can run our applications in containers, we need a place to store those container images. That's where Amazon Elastic Container Registry (ECR) comes in. Terraform will set up ECR repositories for each of our services (e.g., astro/api, astro/web, astro/celery-worker). We'll configure image tag mutability and enable scan on push for security vulnerability detection. We can also implement lifecycle policies to automatically clean up old, unused images, keeping our registry tidy and cost-effective. This module ensures our Docker images are securely stored and accessible by ECS.
module "ecr" {
source = "./modules/ecr"
repositories = [
"astro/api",
"astro/web",
"astro/celery-worker"
]
image_tag_mutability = "MUTABLE"
scan_on_push = true
lifecycle_policy = {
keep_last_n_images = 10
}
}
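Under the hood, that keep_last_n_images setting presumably renders into an ECR lifecycle policy. Here's a hedged sketch of what the module might generate (the resource naming is an assumption; aws_ecr_lifecycle_policy and its JSON rule schema are the standard mechanism):

# Sketch: expire everything beyond the 10 most recent images in a repository.
resource "aws_ecr_lifecycle_policy" "this" {
  repository = aws_ecr_repository.this.name   # assumed to exist inside the module

  policy = jsonencode({
    rules = [{
      rulePriority = 1
      description  = "Keep only the last 10 images"
      selection = {
        tagStatus   = "any"
        countType   = "imageCountMoreThan"
        countNumber = 10
      }
      action = { type = "expire" }
    }]
  })
}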
3. RDS PostgreSQL: Managed Database Power
Moving our database to Amazon RDS (Relational Database Service) is a massive win. We're opting for PostgreSQL, a robust and popular open-source database. Terraform will provision a managed PostgreSQL instance. For production environments, we'll enable Multi-AZ deployments for high availability and automatic failover. We'll configure automatic backups with a retention period (e.g., 7 days for production) and set maintenance windows for applying patches with minimal disruption. Encryption at rest will be enabled to protect sensitive data. We'll also configure storage auto-scaling so the database can grow as needed. The instance will be placed in our private database subnets and associated with a strict security group. Performance Insights can also be enabled for better database performance monitoring. This takes a huge burden off our shoulders compared to managing self-hosted databases on EC2.
module "rds" {
source = "./modules/rds"
identifier = "astro-postgres-${var.environment}"
engine_version = "16.1"
instance_class = var.environment == "production" ? "db.t4g.medium" : "db.t4g.micro"
allocated_storage = 20
max_allocated_storage = 100
storage_encrypted = true
database_name = "astro_${var.environment}"
master_username = "astro_admin"
multi_az = var.environment == "production" ? true : false
backup_retention_period = var.environment == "production" ? 7 : 1
backup_window = "03:00-04:00"
maintenance_window = "sun:04:00-sun:05:00"
deletion_protection = var.environment == "production" ? true : false
skip_final_snapshot = var.environment != "production"
vpc_security_group_ids = [module.vpc.rds_security_group_id]
db_subnet_group_name = module.vpc.database_subnet_group_name
}
Configuration:
- PostgreSQL 16
- Multi-AZ for production (HA)
- Automatic backups (7 days in production)
- Encryption at rest
- Storage auto-scaling
- Performance Insights enabled
4. ElastiCache Redis: Blazing Fast Caching
To improve application performance and reduce load on our database, we'll implement Amazon ElastiCache for Redis. Redis is an incredibly fast, in-memory data store often used for caching, session management, and real-time use cases. Terraform will provision a Redis cluster, configured with the appropriate node type and number of nodes. For production, we'll enable automatic failover and multi-AZ deployments to ensure high availability for our cache. The Redis cluster will be placed within our private network and secured with a dedicated security group. This module allows us to easily add a caching layer to our architecture.
module "redis" {
source = "./modules/elasticache"
cluster_id = "astro-redis-${var.environment}"
engine_version = "7.0"
node_type = var.environment == "production" ? "cache.t4g.medium" : "cache.t4g.micro"
num_cache_nodes = var.environment == "production" ? 2 : 1
parameter_group_family = "redis7"
port = 6379
subnet_group_name = module.vpc.elasticache_subnet_group_name
security_group_ids = [module.vpc.redis_security_group_id]
automatic_failover_enabled = var.environment == "production"
multi_az_enabled = var.environment == "production"
}
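For reference, here's roughly what this module might create under the hood, as a hedged sketch (the variable names mirror the module inputs above; aws_elasticache_replication_group is the standard resource for getting automatic failover with Redis on AWS):

# Sketch: a Redis replication group. Automatic failover requires >= 2 nodes,
# which matches the production setting of num_cache_nodes = 2.
resource "aws_elasticache_replication_group" "this" {
  replication_group_id       = var.cluster_id
  description                = "Redis cache for ${var.cluster_id}"
  engine                     = "redis"
  engine_version             = "7.0"
  node_type                  = var.node_type
  num_cache_clusters         = var.num_cache_nodes
  port                       = 6379
  subnet_group_name          = var.subnet_group_name
  security_group_ids         = var.security_group_ids
  automatic_failover_enabled = var.automatic_failover_enabled
  multi_az_enabled           = var.multi_az_enabled
}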
5. ECS Fargate Cluster: Serverless Containers
This is where our applications will actually run! Amazon Elastic Container Service (ECS) with Fargate is our chosen platform. Fargate is a serverless compute engine for containers, meaning we don't have to manage underlying EC2 instances. Terraform will set up the ECS Cluster and define our services (api, web, celery-worker). For each service, we'll specify the Docker image from ECR, CPU and memory requirements, desired number of tasks, and importantly, environment variables and secrets. We'll integrate with AWS Secrets Manager to securely inject sensitive data like database credentials into our containers. Autoscaling is configured here, defining minimum and maximum capacities and target utilization metrics (CPU and memory) to automatically adjust the number of running tasks. This module is the heart of our containerized deployment.
module "ecs" {
source = "./modules/ecs"
cluster_name = "astro-cluster-${var.environment}"
services = {
api = {
name = "astro-api"
image = "${module.ecr.repository_urls["astro/api"]}:latest"
cpu = var.environment == "production" ? 512 : 256
memory = var.environment == "production" ? 1024 : 512
desired_count = var.environment == "production" ? 2 : 1
container_port = 8000
health_check_path = "/health"
environment_variables = {
ENVIRONMENT = var.environment
}
secrets = {
        DATABASE_URL     = module.secrets.secret_arns["database_url"]
        SECRET_KEY       = module.secrets.secret_arns["secret_key"]
        REDIS_URL        = module.secrets.secret_arns["redis_url"]
        GOOGLE_CLIENT_ID = module.secrets.secret_arns["google_client_id"]
        # ... other secrets
}
autoscaling = {
min_capacity = var.environment == "production" ? 2 : 1
max_capacity = var.environment == "production" ? 10 : 2
target_cpu_utilization = 70
target_memory_utilization = 80
}
}
web = {
name = "astro-web"
image = "${module.ecr.repository_urls["astro/web"]}:latest"
cpu = 256
memory = 512
desired_count = var.environment == "production" ? 2 : 1
container_port = 80
health_check_path = "/"
}
celery = {
name = "astro-celery-worker"
image = "${module.ecr.repository_urls["astro/celery-worker"]}:latest"
cpu = 256
memory = 512
desired_count = 1
      # No load balancer (asynchronous worker)
}
}
vpc_id = module.vpc.vpc_id
private_subnet_ids = module.vpc.private_subnet_ids
alb_target_group_arns = module.alb.target_group_arns
}
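The autoscaling block above most likely maps onto AWS Application Auto Scaling. As a sketch of what the module would wire up for the API service in production (the resource names are illustrative; the resource types and the ECSServiceAverageCPUUtilization metric are standard):

# Sketch: register the service as a scalable target, then track average CPU at 70%.
resource "aws_appautoscaling_target" "api" {
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.this.name}/astro-api"
  scalable_dimension = "ecs:service:DesiredCount"
  min_capacity       = 2
  max_capacity       = 10
}

resource "aws_appautoscaling_policy" "api_cpu" {
  name               = "astro-api-cpu-target"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.api.service_namespace
  resource_id        = aws_appautoscaling_target.api.resource_id
  scalable_dimension = aws_appautoscaling_target.api.scalable_dimension

  target_tracking_scaling_policy_configuration {
    target_value = 70
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}

Target tracking is nice because ECS handles both scale-out and scale-in; you just declare the utilization you want.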
6. Application Load Balancer: Traffic Director
Our Application Load Balancer (ALB) acts as the single entry point for external traffic. Terraform will configure the ALB within our public subnets, associate it with our VPC, and set up listeners for HTTP and HTTPS. Crucially, we'll configure SSL/TLS termination using an ACM certificate for secure HTTPS connections. The ALB will have target groups pointing to our ECS services (API and Web). Listener rules will define how traffic is routed based on path patterns (e.g., /api/* goes to the API service, / goes to the web service). We'll also set up a rule to redirect HTTP traffic to HTTPS for enhanced security. Health checks are configured for each target group to ensure the ALB only sends traffic to healthy instances.
module "alb" {
source = "./modules/alb"
name = "astro-alb-${var.environment}"
vpc_id = module.vpc.vpc_id
public_subnet_ids = module.vpc.public_subnet_ids
  # SSL certificate for HTTPS
certificate_arn = var.acm_certificate_arn
target_groups = {
api = {
port = 8000
protocol = "HTTP"
health_check_path = "/health"
health_check_interval = 30
health_check_timeout = 5
healthy_threshold = 2
unhealthy_threshold = 3
}
web = {
port = 80
protocol = "HTTP"
health_check_path = "/"
}
}
listeners = [
{
port = 443
protocol = "HTTPS"
default_action = "forward to web"
rules = [
{
path_pattern = "/api/*"
target_group = "api"
},
{
path_pattern = "/docs*"
target_group = "api"
}
]
},
{
port = 80
protocol = "HTTP"
default_action = "redirect to HTTPS"
}
]
}
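That redirect to HTTPS listener is worth spelling out, since it's a common stumbling block. Here's a sketch of the underlying resource the module would likely create (aws_lb.this is an assumed internal name; the redirect action itself is the standard ALB construct):

# Sketch: port 80 does nothing but issue a permanent redirect to HTTPS.
resource "aws_lb_listener" "http_redirect" {
  load_balancer_arn = aws_lb.this.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}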
7. Secrets Manager: Secure Credential Handling
Storing sensitive information like database passwords, API keys, and secret keys directly in your Terraform code or environment variables is a big no-no. AWS Secrets Manager provides a secure and centralized way to manage these secrets. Terraform will create secrets within Secrets Manager and store them securely. Our ECS tasks will then reference these secrets via their ARNs, and AWS will inject them as environment variables into the running containers. This ensures that sensitive data is never exposed in code repositories or logs, significantly enhancing our security posture.
module "secrets" {
source = "./modules/secrets"
secrets = {
database_url = {
description = "PostgreSQL connection string"
secret_string = "postgresql+asyncpg://${module.rds.master_username}:${var.db_password}@${module.rds.endpoint}/${module.rds.database_name}"
}
redis_url = {
description = "Redis connection string"
secret_string = "redis://${module.redis.primary_endpoint_address}:6379/0"
}
secret_key = {
description = "JWT secret key"
secret_string = var.secret_key
}
google_client_secret = {
description = "Google OAuth client secret"
secret_string = var.google_client_secret
}
    # ... other secrets
}
}
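On the consuming side, the ECS task definition references these ARNs in its secrets block, and ECS resolves them at task startup. A hedged sketch of how the ecs module might wire this up (the variable names are illustrative; the secrets / valueFrom container-definition fields are the standard mechanism):

# Sketch: the execution role must be allowed to call secretsmanager:GetSecretValue.
resource "aws_ecs_task_definition" "api" {
  family                   = "astro-api"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = var.execution_role_arn

  container_definitions = jsonencode([{
    name  = "astro-api"
    image = var.image
    # ECS fetches each valueFrom ARN at task start and injects the value
    # as a plain environment variable inside the container.
    secrets = [
      { name = "DATABASE_URL", valueFrom = var.secret_arns["database_url"] }
    ]
  }])
}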
8. CloudWatch Monitoring: Keeping an Eye on Things
What good is a robust infrastructure if you don't know if it's healthy? AWS CloudWatch is our go-to for monitoring. Terraform will configure log groups for our ECS services, ensuring that application logs are captured and retained for a specified period (longer for production). We'll also define alarms based on key metrics like CPU utilization, memory usage, database connections, and ALB request counts (e.g., 5xx errors). These alarms can trigger actions, such as sending notifications to an SNS topic, allowing us to proactively respond to issues before they impact users.
module "cloudwatch" {
source = "./modules/cloudwatch"
log_groups = [
"/ecs/astro-api",
"/ecs/astro-web",
"/ecs/astro-celery-worker"
]
retention_in_days = var.environment == "production" ? 30 : 7
alarms = {
api_high_cpu = {
metric_name = "CPUUtilization"
comparison_operator = "GreaterThanThreshold"
threshold = 80
evaluation_periods = 2
alarm_actions = [var.sns_topic_arn]
}
rds_high_connections = {
metric_name = "DatabaseConnections"
comparison_operator = "GreaterThanThreshold"
threshold = 80
evaluation_periods = 2
}
alb_5xx_errors = {
metric_name = "HTTPCode_Target_5XX_Count"
comparison_operator = "GreaterThanThreshold"
threshold = 10
evaluation_periods = 1
}
}
}
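As a point of reference, each entry in that alarms map would boil down to something like the following standard CloudWatch alarm resource (the dimensions shown are an assumption about how the module scopes the metric to a single service):

# Sketch: alarm when the API service averages > 80% CPU for two consecutive minutes.
resource "aws_cloudwatch_metric_alarm" "api_high_cpu" {
  alarm_name          = "astro-api-high-cpu"
  namespace           = "AWS/ECS"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  period              = 60
  evaluation_periods  = 2
  threshold           = 80
  comparison_operator = "GreaterThanThreshold"
  alarm_actions       = [var.sns_topic_arn]

  dimensions = {
    ClusterName = "astro-cluster-production"
    ServiceName = "astro-api"
  }
}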
Automating Everything: GitHub Actions Workflows
To truly embrace IaC and DevOps, we need automation. GitHub Actions will be our CI/CD engine. We'll set up several workflows:
1. Terraform Plan Workflow (terraform-plan.yml)
This workflow runs automatically on pull requests targeting our Terraform code. It performs terraform init and terraform plan, generating an execution plan. The plan output is then commented directly on the pull request, allowing developers to review the proposed infrastructure changes before they are merged. This is a crucial step for preventing unintended changes and ensuring code quality.
name: Terraform Plan
on:
pull_request:
paths:
- 'terraform/**'
jobs:
  plan:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        environment: [dev, staging, production]
steps:
- uses: actions/checkout@v4
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.6.0
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
- name: Terraform Init
run: |
cd terraform/environments/${{ matrix.environment }}
terraform init
      - name: Terraform Plan
        id: plan
        run: |
          cd terraform/environments/${{ matrix.environment }}
          terraform plan -no-color -out=tfplan
      - name: Comment PR
        uses: actions/github-script@v7
        env:
          # setup-terraform's wrapper exposes the plan's stdout as a step output
          PLAN: ${{ steps.plan.outputs.stdout }}
        with:
          script: |
            const output = `#### Terraform Plan 📖
            \`\`\`
            ${process.env.PLAN}
            \`\`\`
            `;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            });
2. Deploy Infrastructure Workflow (deploy-infrastructure.yml)
This workflow handles the actual deployment of our infrastructure. It can be triggered manually (workflow_dispatch) or on a push to the main branch. It performs terraform init and then terraform apply -auto-approve for the selected environment (dev, staging, or production). This ensures that our infrastructure is consistently provisioned and managed through code.
name: Deploy Infrastructure
on:
push:
branches: [main]
paths:
- 'terraform/**'
workflow_dispatch:
inputs:
environment:
description: 'Environment to deploy'
required: true
type: choice
options:
- dev
- staging
- production
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
- name: Terraform Apply
run: |
cd terraform/environments/${{ inputs.environment || 'dev' }}
terraform init
terraform apply -auto-approve
3. Deploy Application Workflow (deploy-application.yml)
This workflow focuses on deploying our application code. It consists of two main jobs:
- build-and-push: Checks out the code, logs into ECR, builds the Docker images for our services (API and Web), tags them with both the Git commit SHA and latest, and pushes them to ECR.
- deploy: After the images are pushed, this job triggers an ECS service update using the AWS CLI. It forces a new deployment for the astro-api and astro-web services, causing ECS to pull the new images and deploy them. A wait step (aws ecs wait services-stable) ensures the deployment completes successfully before the workflow finishes.

This workflow automates the entire application deployment pipeline.
name: Deploy Application
on:
push:
branches: [main]
workflow_dispatch:
jobs:
build-and-push:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
- name: Login to Amazon ECR
id: login-ecr
uses: aws-actions/amazon-ecr-login@v2
- name: Build and push API image
env:
ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
ECR_REPOSITORY: astro/api
IMAGE_TAG: ${{ github.sha }}
run: |
docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG -t $ECR_REGISTRY/$ECR_REPOSITORY:latest -f apps/api/Dockerfile apps/api
docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
docker push $ECR_REGISTRY/$ECR_REPOSITORY:latest
- name: Build and push Web image
env:
ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
ECR_REPOSITORY: astro/web
IMAGE_TAG: ${{ github.sha }}
run: |
docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG -t $ECR_REGISTRY/$ECR_REPOSITORY:latest -f apps/web/Dockerfile apps/web
docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
docker push $ECR_REGISTRY/$ECR_REPOSITORY:latest
deploy:
needs: build-and-push
runs-on: ubuntu-latest
    steps:
      # Without credentials this job can't call the AWS CLI, so configure them first
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
- name: Deploy to ECS
run: |
aws ecs update-service \
--cluster astro-cluster-production \
--service astro-api \
--force-new-deployment
aws ecs update-service \
--cluster astro-cluster-production \
--service astro-web \
--force-new-deployment
- name: Wait for deployment
run: |
aws ecs wait services-stable \
--cluster astro-cluster-production \
--services astro-api astro-web
Estimated Costs: Budgeting for Your Infrastructure
Let's talk about the elephant in the room: cost. It's essential to have a realistic estimate of what this new infrastructure will cost. We've broken down the estimated monthly costs for both production and development/staging environments.
Production (High Availability Focus)
For the production environment, where high availability and performance are critical, the costs are higher but justifiable. This includes:
- ECS Fargate (API & Web): Running multiple tasks across different AZs for redundancy. ~$30/month for API (2 tasks @ 0.5 vCPU, 1GB) + ~$15/month for Web (2 tasks @ 0.25 vCPU, 0.5GB) = ~$45/month.
- RDS PostgreSQL (db.t4g.medium Multi-AZ): A robust, highly available database instance. ~$85/month.
- ElastiCache Redis (cache.t4g.medium): In-memory caching for performance. ~$35/month.
- ALB: The load balancer itself. ~$20/month.
- NAT Gateway (2x): Required for outbound internet access from private subnets in multiple AZs. ~$65/month.
- Data Transfer: Estimated traffic costs. ~$10/month.
- CloudWatch Logs & Metrics: For monitoring and logging. ~$5/month.
- Secrets Manager: Securely storing secrets. ~$5/month.
Total Estimated Production Cost: ~$270/month
Dev/Staging (Cost-Optimized)
For development and staging environments, we can significantly reduce costs by using smaller instance types, fewer tasks, and single-AZ deployments where appropriate.
- ECS Fargate (API & Web): Single tasks, smaller resources. ~$8/month for API (1 task @ 0.25 vCPU, 0.5GB) + ~$8/month for Web (1 task @ 0.25 vCPU, 0.5GB) = ~$16/month.
- RDS PostgreSQL (db.t4g.micro Single-AZ): A smaller, non-HA database instance. ~$15/month.
- ElastiCache Redis (cache.t4g.micro): A smaller cache instance. ~$12/month.
- ALB: Still needed for traffic management. ~$20/month.
- NAT Gateway (1x): A single NAT Gateway is usually sufficient for non-production. ~$32/month.
Total Estimated Dev/Staging Cost: ~$95/month
Note: These are estimates and actual costs may vary based on usage, specific configurations, and AWS pricing changes. Always check the latest AWS pricing.
EC2 vs. ECS Fargate: A Quick Comparison
To put things in perspective, let's compare our new approach with the old manual EC2 setup.
| Feature | EC2 (Manual) | ECS Fargate (IaC) | Winner (for most apps) |
|---|---|---|---|
| Cost/Month | $40-50 (can vary wildly) | $95 (dev) / $270 (prod) | EC2 (initially) |
| Management | Manual (OS, patching, scaling) | Fully Managed (Serverless) | ECS Fargate |
| Scalability | Manual, slow, error-prone | Automatic, rapid, reliable | ECS Fargate |
| High Availability | Manual setup required, complex | Built-in (Multi-AZ) | ECS Fargate |
| Database Backup | Manual, risky | Automatic, reliable | ECS Fargate |
| Infrastructure as Code | No (or difficult to implement) | Yes (Terraform) | ECS Fargate |
| Zero-Downtime Deploy | No (or very complex) | Yes | ECS Fargate |
| Complexity | Low (initially) | Medium-High (setup phase) | EC2 (initial setup) |
While EC2 might seem cheaper initially for very simple setups, the operational overhead, lack of scalability, and manual effort quickly outweigh the cost savings. ECS Fargate, powered by IaC, offers a superior long-term solution for most modern applications.
Implementation Tasks: Your Roadmap to Success
Migrating to this new architecture involves several steps. We've outlined a phased approach to make it manageable:
Phase 1: Initial Setup
- [ ] Create AWS Account (if needed)
- [ ] Configure AWS CLI locally
- [ ] Create S3 bucket for Terraform state
- [ ] Create DynamoDB table for state locking (see the bootstrap sketch after this list)
- [ ] Configure GitHub Secrets (AWS credentials)
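For the state bucket and lock table, here's a minimal one-time bootstrap sketch, applied from a separate throwaway configuration before any environment is initialized (the names match the placeholder backend.tf shown earlier):

# Sketch: the two resources the S3 backend needs. Losing state is catastrophic,
# so versioning and prevent_destroy are cheap insurance.
resource "aws_s3_bucket" "terraform_state" {
  bucket = "astro-terraform-state"   # placeholder; must be globally unique

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "astro-terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"   # attribute name the S3 backend expects

  attribute {
    name = "LockID"
    type = "S"
  }
}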
Phase 2: Terraform Core Modules
- [ ] Set up Terraform project structure
- [ ] Implement the networking module (VPC, subnets, etc.)
- [ ] Implement the rds module
- [ ] Implement the elasticache module
- [ ] Implement the ecr module
- [ ] Implement the ecs module
- [ ] Implement the alb module
- [ ] Implement the secrets module
- [ ] Implement the cloudwatch module
Phase 3: Environment Configurations
- [ ] Create configuration for the dev environment
- [ ] Create configuration for the staging environment
- [ ] Create configuration for the production environment
- [ ] Test provisioning in the dev environment
Phase 4: CI/CD Pipelines
- [ ] Create Terraform Plan workflow (for PRs)
- [ ] Create Terraform Apply workflow (for deployments)
- [ ] Create Build & Push ECR workflow
- [ ] Create Deploy ECS workflow
- [ ] Test the complete pipeline
Phase 5: Security & Compliance
- [ ] Configure SSL/TLS (ACM Certificate)
- [ ] Implement WAF (optional, for advanced security)
- [ ] Configure restrictive Security Groups
- [ ] Enable encryption at rest (RDS, S3)
- [ ] Configure VPC Flow Logs
- [ ] Implement AWS Config for compliance monitoring
Phase 6: Monitoring & Alerting
- [ ] Set up CloudWatch Dashboards
- [ ] Create critical alarms
- [ ] Configure SNS for notifications
- [ ] Implement X-Ray for distributed tracing (optional)
- [ ] Integrate alerts with PagerDuty/Slack
Phase 7: Documentation
- [ ] Document the architecture
- [ ] Create operational runbooks
- [ ] Document disaster recovery procedures
- [ ] Create a troubleshooting guide
Priority and Dependencies
This migration is marked as Low-Medium priority. It's an enhancement that builds on existing foundations. A key dependency: Issue #28 (likely the current EC2 setup) should either be completed first or be replaced outright by this IaC approach. We also need budget approval for the estimated costs (around $270/month for production) and a clear decision on whether to proceed with the robust ECS solution or stick with a simpler EC2 setup.
Labels
devops, infrastructure, terraform, iac, aws, enhancement
And there you have it, guys! A comprehensive plan to transform your AWS infrastructure using Terraform, ECS Fargate, and RDS. It's a significant undertaking, but the benefits in terms of automation, scalability, and reliability are immense. Happy coding and happy deploying!