Vertical Scaling on AWS: Practical Guide with EC2 & RDS Examples


What is Vertical Scaling?

Think of vertical scaling like upgrading your laptop’s RAM instead of buying a second laptop. You’re beefing up your existing server—more CPU, more memory, more storage—rather than spinning up additional servers.

I’ve used this approach countless times when an application starts struggling. Instead of refactoring everything for horizontal scaling (which can take weeks), you can often just bump up the instance size and buy yourself time. It’s not always the right solution, but when it fits, it’s a lifesaver.

In simple terms: You’re making your existing instance bigger or smaller, not adding more instances. One server, more power.

When to Use Vertical Scaling

I’ll be honest—vertical scaling isn’t always the answer. But here’s when it actually makes sense:

  • ✅ You’ve got a monolithic app that’s hard to split across servers
  • ✅ Traffic is pretty steady (no crazy spikes at 3 AM)
  • ✅ You need something done fast—like, today fast
  • ✅ Budget is tight and you’re not at massive scale yet
  • ✅ You want to avoid the complexity of load balancers and auto-scaling groups
Skip vertical scaling if: You need 99.9%+ uptime (one server = single point of failure), traffic is all over the place, or you’ve already maxed out the biggest instance type. At that point, you’re better off going horizontal.

AWS Services That Support Vertical Scaling

| Service | How to Scale | Downtime |
|---|---|---|
| EC2 | Change instance type (stop → modify → start) | Yes (2-5 min) |
| RDS | Modify DB instance class | Yes (Multi-AZ: minimal) |
| ElastiCache | Modify node type | Yes (Multi-AZ: minimal) |
| EBS | Resize volume, change type | No (online resize) |
| Lambda | Increase memory allocation | No |
| ECS | Change task CPU/memory | No (rolling update) |
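For Lambda, "vertical scaling" is just a memory bump, since CPU scales proportionally with memory. A minimal sketch; the function name is hypothetical:

```bash
# Raise a function's memory from the default 128 MB to 1024 MB
# (function name is a placeholder)
aws lambda update-function-configuration \
  --function-name my-function \
  --memory-size 1024
```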

Practical Example: Task Management API

Let’s walk through a real scenario I’ve dealt with: a Node.js API with PostgreSQL and Redis, all running on one EC2 instance. Docker makes this way easier than installing everything manually—trust me, I’ve done it both ways.

Quick Setup

```bash
# 1. Launch EC2 instance (m5.large recommended)

# 2. Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER

# 3. Install Docker Compose
sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

# 4. Deploy application
git clone https://github.com/your-username/task-management-api.git
cd task-management-api
docker-compose up -d

# 5. Verify
curl http://localhost:3000/health
```

Docker Compose Configuration

```yaml
version: '3.8'

services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DB_HOST=postgres
      - REDIS_HOST=redis
    depends_on:
      - postgres
      - redis

  postgres:
    image: postgres:17
    environment:
      - POSTGRES_DB=taskdb
      - POSTGRES_USER=taskuser
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres-data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data

volumes:
  postgres-data:
  redis-data:
```
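The examples throughout this guide curl a /health endpoint, which the compose file assumes the app exposes on port 3000. The repository itself isn’t shown here, so as a minimal sketch of what that route might look like in Express (the real task-management-api may differ):

```javascript
// Hypothetical /health route for the Node.js API
const express = require('express');
const app = express();

app.get('/health', (req, res) => {
  // Report basic liveness; extend with DB/Redis checks as needed
  res.json({ status: 'ok', uptime: process.uptime() });
});

app.listen(3000, () => console.log('API listening on port 3000'));
```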

How to Scale Vertically

Alright, let’s get into the actual process. I’ve scaled instances dozens of times, and here’s what actually works (and what doesn’t).

Manual Scaling (EC2)

  1. Create snapshot: EC2 → Snapshots → Create snapshot
  2. Stop instance: EC2 → Instances → Stop
  3. Modify type: Actions → Instance Settings → Change Instance Type
  4. Select new type: e.g., m5.large → m5.xlarge
  5. Start instance: EC2 → Instances → Start
  6. Verify: curl http://localhost:3000/health

AWS CLI Method

```bash
# Stop instance
aws ec2 stop-instances --instance-ids i-xxx

# Wait for stopped
aws ec2 wait instance-stopped --instance-ids i-xxx

# Modify instance type
aws ec2 modify-instance-attribute --instance-id i-xxx --instance-type Value=m5.xlarge

# Start instance
aws ec2 start-instances --instance-ids i-xxx
aws ec2 wait instance-running --instance-ids i-xxx
```

Automatic Scaling with Lambda

Trigger scaling automatically when CloudWatch alarms fire.

Step 1: Create Lambda Function

```javascript
// AWS SDK v2; the target instance ID is assumed to come from an env variable.
// Note: the instance must be fully stopped before the type can be modified,
// so we wait between stop and modify.
const AWS = require('aws-sdk');
const ec2 = new AWS.EC2();

exports.handler = async (event) => {
  const instanceId = process.env.INSTANCE_ID;

  // Stop → wait → Modify → Start
  await ec2.stopInstances({ InstanceIds: [instanceId] }).promise();
  await ec2.waitFor('instanceStopped', { InstanceIds: [instanceId] }).promise();

  await ec2.modifyInstanceAttribute({
    InstanceId: instanceId,
    InstanceType: { Value: 'm5.xlarge' }
  }).promise();

  await ec2.startInstances({ InstanceIds: [instanceId] }).promise();
  return { statusCode: 200, body: 'Scaling complete' };
};
```

Step 2: Update CloudWatch Alarm

```bash
# Scope the alarm to one instance with --dimensions
aws cloudwatch put-metric-alarm \
  --alarm-name "High-CPU-AutoScale" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 75 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-xxx \
  --evaluation-periods 2 \
  --alarm-actions arn:aws:lambda:REGION:ACCOUNT:function:auto-scale-instance
```

Monitoring & Alerts

You can’t manage what you don’t measure. Before you even think about scaling, get your monitoring in place. Trust me, you’ll want to know what’s happening.

Set Up CloudWatch Alarms

```bash
# Install CloudWatch Agent
wget https://s3.amazonaws.com/amazoncloudwatch-agent/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
sudo dpkg -i -E ./amazon-cloudwatch-agent.deb
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s

# Create CPU alarm (scoped to one instance)
aws cloudwatch put-metric-alarm \
  --alarm-name "High-CPU" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 75 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-xxx \
  --evaluation-periods 2 \
  --alarm-actions arn:aws:sns:REGION:ACCOUNT:topic-name
```

Key Metrics to Monitor

| Metric | Threshold | Action |
|---|---|---|
| CPU Utilization | > 75% for 10 min | Scale up |
| Memory Usage | > 80% for 10 min | Scale up |
| Disk Usage | > 85% | Resize EBS or clean up |
| Application Latency | > 500 ms | Investigate & scale |

Instance Type Selection

| Instance Type | vCPU | RAM | Use Case |
|---|---|---|---|
| t3.medium | 2 | 4 GB | Development/Testing |
| m5.large | 2 | 8 GB | Small production |
| m5.xlarge | 4 | 16 GB | Medium production |
| m5.2xlarge | 8 | 32 GB | Large production |

Scaling Path Example

Start: m5.large (2 vCPU, 8 GB) → Scale to: m5.xlarge (4 vCPU, 16 GB) → Scale to: m5.2xlarge (8 vCPU, 32 GB)

Tip: Always create a snapshot before scaling. Monitor for 24-48 hours after scaling to validate performance improvements.
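If you prefer the CLI for that pre-scale snapshot, something like this works (the volume ID is a placeholder):

```bash
# Snapshot the volume before touching the instance
SNAPSHOT_ID=$(aws ec2 create-snapshot \
  --volume-id vol-xxx \
  --description "pre-scale backup" \
  --query SnapshotId --output text)

# Block until the snapshot is complete
aws ec2 wait snapshot-completed --snapshot-ids $SNAPSHOT_ID
```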

Best Practices (Learned the Hard Way)

I’ve made my share of mistakes scaling instances. Here’s what I wish someone had told me:

  • Always create a snapshot first. Seriously, don’t skip this. I’ve had to restore from backups before, and it’s not fun.
  • Test in staging. Your production environment isn’t the place to learn that your app doesn’t handle the new instance type well.
  • Watch your costs. That m5.2xlarge costs twice as much as m5.xlarge. Make sure you actually need it.
  • Set up alerts before you scale. You want to know immediately if something goes wrong.
  • Use Multi-AZ for RDS. The downtime difference is huge—30 seconds vs 5 minutes. Worth the extra cost (see the command after this list).
  • Plan a maintenance window. Even if it’s just 10 minutes, let your users know.
  • Write down what you did. Future you will thank present you when you need to do this again in 6 months.
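Enabling Multi-AZ on an existing RDS instance is a single modify call. A sketch with a placeholder identifier; the conversion itself runs without downtime, though you may see extra I/O load while the standby is built:

```bash
# Convert a single-AZ RDS instance to Multi-AZ
aws rds modify-db-instance \
  --db-instance-identifier mydb \
  --multi-az \
  --apply-immediately
```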

Common Issues & Solutions

| Issue | Solution |
|---|---|
| Instance won't stop | Check for running processes; force stop if needed |
| IP address changed | Use Elastic IP or update DNS records |
| Performance not improved | Check if bottleneck is elsewhere (database, network) |
| Scaling taking too long | Check RDS events, verify Multi-AZ status |
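For the "instance won't stop" case in the table above, the CLI has a force flag. Use it as a last resort, since it skips a clean OS shutdown:

```bash
# Force-stop a hung instance (placeholder ID); roughly equivalent to pulling the power
aws ec2 stop-instances --instance-ids i-xxx --force
```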

EBS Volume Optimization

Here’s something a lot of people don’t realize: you can often fix performance issues by just upgrading your EBS volume instead of scaling the whole instance. I’ve saved money and time doing this more times than I can count.

EBS volumes can be resized and optimized without stopping the instance (in most cases). If you’re hitting I/O limits or running out of space, this is usually your first move before touching the instance type.

Resize EBS Volume (Online)

Step 1: Modify Volume Size

```bash
# Get volume ID
VOLUME_ID=$(aws ec2 describe-instances --instance-ids i-xxx \
  --query 'Reservations[0].Instances[0].BlockDeviceMappings[0].Ebs.VolumeId' --output text)

# Resize volume (e.g., 100GB to 200GB)
aws ec2 modify-volume --volume-id $VOLUME_ID --size 200

# Wait for modification to complete
aws ec2 describe-volumes-modifications --volume-ids $VOLUME_ID \
  --query 'VolumesModifications[0].ModificationState' --output text
```

Step 2: Extend File System (Linux)

```bash
# For ext4 file system
sudo growpart /dev/xvda 1
sudo resize2fs /dev/xvda1

# For xfs file system
sudo growpart /dev/xvda 1
sudo xfs_growfs /

# Verify new size
df -h
```

Change EBS Volume Type

Upgrade volume type for better performance without resizing:

| Volume Type | Use Case | IOPS | Throughput |
|---|---|---|---|
| gp3 | General purpose (default) | 3,000 baseline, up to 16,000 | 125-1,000 MB/s |
| gp2 | General purpose (legacy) | 3 IOPS/GB baseline, up to 16,000 | 128-250 MB/s |
| io1/io2 | High-IOPS workloads | Up to 64,000 (io2 Block Express: 256,000) | Up to 1,000 MB/s |
| st1 | Throughput-optimized HDD | Up to 500 | Up to 500 MB/s |
```bash
# Change volume type (e.g., gp2 to gp3)
aws ec2 modify-volume \
  --volume-id vol-xxx \
  --volume-type gp3 \
  --iops 3000

# Or upgrade to provisioned IOPS
aws ec2 modify-volume \
  --volume-id vol-xxx \
  --volume-type io2 \
  --iops 10000
```

EBS Optimization Best Practices

  • Monitor volume metrics: CloudWatch → EBS → VolumeReadOps, VolumeWriteOps (see the query after the tip below)
  • Use gp3 instead of gp2: Better price/performance, configurable IOPS
  • Right-size IOPS: Don’t over-provision (io1/io2 costs more)
  • Separate data volumes: Use separate EBS for database data
  • Enable encryption: Use encrypted volumes for sensitive data
  • Snapshot before changes: Always backup before modifying volumes
Tip: If you’re hitting I/O limits, try upgrading EBS volume type first before scaling the entire instance. It’s often faster and cheaper.
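To put numbers behind the first bullet above, you can pull a volume’s recent I/O from CloudWatch before deciding between a volume upgrade and an instance upgrade. A sketch (GNU date syntax, placeholder volume ID):

```bash
# Sum of read ops over the last hour, in 5-minute buckets
aws cloudwatch get-metric-statistics \
  --namespace AWS/EBS \
  --metric-name VolumeReadOps \
  --dimensions Name=VolumeId,Value=vol-xxx \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 300 \
  --statistics Sum
```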

Quick Decision Guide: Vertical vs Horizontal Scaling

I get asked this question all the time: “Should I scale up or scale out?” The answer isn’t always obvious. Use this decision tree—it’s saved me from making the wrong choice more than once.

Scaling Decision Tree

START: Need More Resources?
Q1: Is your application monolithic (single server)?
YES → ✅ Use Vertical Scaling
  • Scale up your single instance
  • Faster to implement
  • Simpler architecture
NO → Continue to Q2
Q2: Do you need 99.9%+ availability?
YES → 🔄 Use Horizontal Scaling
  • Multiple instances for redundancy
  • Load balancer for distribution
  • No single point of failure
NO → Continue to Q3
Q3: Is traffic predictable and steady?
YES → ✅ Use Vertical Scaling
  • Right-size for steady load
  • Cost-effective for predictable workloads
  • Easier to manage
NO (unpredictable spikes) → 🔄 Use Horizontal Scaling
  • Auto-scaling groups
  • Scale out during spikes
  • Scale in during low traffic
Q4: Have you reached maximum instance size?
YES → ⚠️ Must Use Horizontal Scaling
  • Cannot scale vertically further
  • Migrate to horizontal architecture
  • Consider microservices
NO → Continue to Q5
Q5: Is your application stateless?
YES → 🔄 Prefer Horizontal Scaling
  • Easy to add more instances
  • Better for auto-scaling
  • More flexible
NO (stateful) → ✅ Use Vertical Scaling
  • State stored on single server
  • Easier than migrating state
  • Simpler architecture

Quick Comparison Table

| Factor | Vertical Scaling | Horizontal Scaling |
|---|---|---|
| Implementation | Fast (minutes) | Slow (days/weeks) |
| Complexity | Simple | Complex |
| Availability | Single point of failure | High availability |
| Cost (small scale) | Lower | Higher (overhead) |
| Cost (large scale) | Higher (large instances are expensive) | Lower (commodity hardware) |
| Scalability limit | Maximum instance size | Practically unlimited |
| Downtime | Yes (2-5 min) | No (rolling updates) |
| Best for | Monolithic apps, predictable traffic | Microservices, unpredictable traffic |
Hybrid Approach: Many applications use both! Scale database vertically (RDS) and application servers horizontally (EC2 Auto Scaling Groups).

Cost Considerations

Let’s talk money. Vertical scaling isn’t free, and the costs add up faster than you might think.

  • EC2: Roughly doubles each time you go up a size. m5.large (~$70/month) → m5.xlarge (~$140/month) → m5.2xlarge (~$280/month). That’s a big jump.
  • RDS: Same story—pricing mirrors EC2 for similar instance types. Multi-AZ adds about 2x the cost but gives you high availability.
  • EBS: Storage is cheap (around $0.08/GB-month for gp3), but if you need high IOPS (io1/io2), that’s where it gets expensive. I’ve seen bills spike from IOPS alone.
  • CloudWatch: First 10 alarms are free, then $0.10 each. Not bad, but if you’re setting up alarms for everything, it adds up.

Pro tip: Use the AWS Pricing Calculator before scaling. I’ve been surprised by costs more than once.

Remember: Vertical scaling has limits. When you reach maximum instance size, consider horizontal scaling (adding more instances).
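When you do make that jump, the horizontal building blocks are a launch template plus an Auto Scaling group behind a load balancer. A minimal sketch with hypothetical names and subnet IDs, assuming a launch template called task-api-lt already exists:

```bash
# Create an Auto Scaling group spanning two subnets
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name task-api-asg \
  --launch-template LaunchTemplateName=task-api-lt,Version='$Latest' \
  --min-size 2 --max-size 6 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-xxx,subnet-yyy"
```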

Rollback Procedures

Here’s the thing: scaling can go wrong. I’ve had instances that wouldn’t start after changing types, applications that crashed on the new instance, and performance that was somehow worse. Always have a rollback plan—you’ll sleep better at night.

When to Rollback

  • ❌ Application fails to start after scaling
  • ❌ Performance is worse than before scaling
  • ❌ Application errors increase significantly
  • ❌ Health checks fail consistently
  • ❌ Critical bugs discovered after scaling
  • ❌ Cost exceeds budget unexpectedly

Manual Rollback (EC2)

Step-by-Step:

  1. Verify current state:

```bash
# Check current instance type
aws ec2 describe-instances --instance-ids i-xxx \
  --query 'Reservations[0].Instances[0].InstanceType' --output text

# Check application health
curl http://localhost:3000/health
```

  2. Stop instance:

```bash
aws ec2 stop-instances --instance-ids i-xxx
aws ec2 wait instance-stopped --instance-ids i-xxx
```

  3. Revert to previous instance type:

```bash
# Change back to original type (e.g., m5.xlarge → m5.large)
aws ec2 modify-instance-attribute \
  --instance-id i-xxx \
  --instance-type Value=m5.large
```

  4. Start instance:

```bash
aws ec2 start-instances --instance-ids i-xxx
aws ec2 wait instance-running --instance-ids i-xxx
```

  5. Verify rollback:

```bash
# Check instance type
aws ec2 describe-instances --instance-ids i-xxx \
  --query 'Reservations[0].Instances[0].InstanceType' --output text

# Check application
curl http://localhost:3000/health
curl http://localhost:3000/health/detailed
```

Rollback Script (Key Commands)

```bash
# Stop instance
aws ec2 stop-instances --instance-ids i-xxx
aws ec2 wait instance-stopped --instance-ids i-xxx

# Revert instance type
aws ec2 modify-instance-attribute --instance-id i-xxx --instance-type Value=m5.large

# Start instance
aws ec2 start-instances --instance-ids i-xxx
aws ec2 wait instance-running --instance-ids i-xxx

# Verify
curl http://localhost:3000/health
```

Automatic Rollback with Lambda

Lambda function structure for automatic rollback:

```javascript
// Structure only: getCurrentInstanceType, scaleInstance, checkApplicationHealth,
// and rollbackInstance are helpers you implement around the EC2 SDK calls.
exports.handler = async (event) => {
  let originalType; // declared outside try so the catch block can roll back

  try {
    // 1. Capture original state
    originalType = await getCurrentInstanceType(instanceId);

    // 2. Perform scaling
    await scaleInstance(instanceId, newType);

    // 3. Health check
    const healthy = await checkApplicationHealth(instanceId);
    if (!healthy) throw new Error('Health check failed');

    return { statusCode: 200, body: 'Scaling successful' };
  } catch (error) {
    // Automatic rollback to the captured type
    await rollbackInstance(instanceId, originalType);
    throw error;
  }
};
```

Rollback for RDS

RDS rollback requires restoring from snapshot or point-in-time recovery:

```bash
# Method 1: Restore from snapshot (if created before scaling)
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier mydb-restored \
  --db-snapshot-identifier snapshot-before-scale \
  --db-instance-class db.m5.large

# Method 2: Point-in-time recovery (restore to time before scaling)
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier mydb \
  --target-db-instance-identifier mydb-restored \
  --restore-time 2024-01-15T10:00:00Z \
  --db-instance-class db.m5.large

# Method 3: Simply modify back to the previous class (if the instance is still running)
aws rds modify-db-instance \
  --db-instance-identifier mydb \
  --db-instance-class db.m5.large \
  --apply-immediately
```

Rollback Best Practices

  • Always create snapshots before scaling (automated in scripts)
  • Document previous state (instance type, configuration)
  • Test rollback procedure in staging environment first
  • Set up alerts to notify team of rollbacks
  • Monitor after rollback to ensure application is stable
  • Keep rollback scripts ready and tested
  • Time limit for rollback – decide within 1-2 hours if rollback is needed
  • Post-mortem – analyze why scaling failed to prevent future issues
Pro Tip: Store rollback information in DynamoDB or S3 before scaling:
```javascript
// Store state before scaling (DynamoDB DocumentClient)
await dynamodb.put({
  TableName: 'scaling-history',
  Item: {
    instanceId: instanceId,
    timestamp: Date.now(),
    oldType: currentType,
    newType: newType,
    snapshotId: snapshotId,
    status: 'in-progress'
  }
}).promise();
```

Rollback Checklist

| Step | Action | Time |
|---|---|---|
| 1 | Identify issue requiring rollback | 5 min |
| 2 | Stop instance (if needed) | 1-2 min |
| 3 | Revert instance type | 30 sec |
| 4 | Start instance | 1-2 min |
| 5 | Verify application health | 2-5 min |
| 6 | Notify team | 1 min |
| Total | Complete rollback | 5-15 min |

Frequently Asked Questions

Q1: How long does vertical scaling take on EC2?

Answer: Usually 2-5 minutes, but I’ve seen it take longer. It really depends on how big your instance is and what’s running on it.

Breakdown:

  • Stopping instance: 30-60 seconds
  • Modifying instance type: 10-30 seconds
  • Starting instance: 60-180 seconds
  • Application startup: 30-120 seconds (depends on your app)

Example:

```bash
# Time the scaling process
START_TIME=$(date +%s)

# Stop instance
aws ec2 stop-instances --instance-ids i-xxx
aws ec2 wait instance-stopped --instance-ids i-xxx

# Modify type
aws ec2 modify-instance-attribute --instance-id i-xxx --instance-type Value=m5.xlarge

# Start instance
aws ec2 start-instances --instance-ids i-xxx
aws ec2 wait instance-running --instance-ids i-xxx

# Calculate duration
END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))
echo "Scaling took ${DURATION} seconds"
```

Tip: Use Multi-AZ for RDS to reduce downtime to ~30 seconds during scaling.

Q2: Will my IP address change when I scale vertically?

Answer: Yeah, probably. Unless you’re using an Elastic IP, AWS will give you a new public IP when you stop and start the instance. This bit me once when I forgot to set up an Elastic IP—had to update DNS records at 2 AM.

Solution 1: Use Elastic IP (Recommended)

```bash
# Allocate Elastic IP
aws ec2 allocate-address --domain vpc

# Associate with instance (before scaling)
aws ec2 associate-address --instance-id i-xxx --allocation-id eipalloc-xxx

# After scaling, re-associate (if needed)
aws ec2 associate-address --instance-id i-xxx --allocation-id eipalloc-xxx
```

Solution 2: Update DNS Records

```bash
# Get new IP after scaling
NEW_IP=$(aws ec2 describe-instances --instance-ids i-xxx \
  --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)

# Update Route53 record
aws route53 change-resource-record-sets --hosted-zone-id Z123456 --change-batch '{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "api.example.com",
      "Type": "A",
      "TTL": 300,
      "ResourceRecords": [{"Value": "'$NEW_IP'"}]
    }
  }]
}'
```

Q3: How do I prevent rapid scaling up and down (thrashing)?

Answer: This is a real problem. I’ve seen systems scale up and down every 10 minutes because the thresholds were too close together. You need cooldown periods and different thresholds for scaling up vs down.

Strategy:

  • Scale-up threshold: CPU > 75% for 10 minutes
  • Scale-down threshold: CPU < 40% for 30 minutes
  • Cooldown period: 30-60 minutes between scaling actions

Lambda Cooldown Check:

```javascript
// Check if in cooldown period
const lastScale = await dynamodb.get({
  TableName: 'scaling-cooldown',
  Key: { instanceId }
}).promise();

if (lastScale.Item && (Date.now() - lastScale.Item.timestamp) < 30 * 60 * 1000) {
  return { statusCode: 200, body: 'In cooldown' };
}

// Record scaling time after scaling
await dynamodb.put({
  TableName: 'scaling-cooldown',
  Item: { instanceId, timestamp: Date.now() }
}).promise();
```

Q4: What happens if I scale to maximum instance size and still need more?

Answer: You’ve hit the wall. At this point, you’ve got two options: optimize your application (which you should probably do anyway) or bite the bullet and go horizontal. I’ve been here—it’s not fun, but it’s doable.

Maximum Instance Sizes:

  • General Purpose (m5): m5.24xlarge (96 vCPU, 384 GB RAM)
  • Compute Optimized (c5): c5.24xlarge (96 vCPU, 192 GB RAM)
  • Memory Optimized (r5): r5.24xlarge (96 vCPU, 768 GB RAM)

Options:

  1. Optimize Application:
    • Database query optimization
    • Implement caching (Redis)
    • Code profiling and optimization
    • Connection pooling
  2. Migrate to Horizontal Scaling:
    • Split application into microservices
    • Use load balancer (ALB/NLB)
    • Deploy multiple instances
    • Use auto-scaling groups
  3. Hybrid Approach:
    • Keep database on large instance (vertical)
    • Scale application servers horizontally
    • Use read replicas for database
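For the read-replica piece of that hybrid approach, the CLI call is straightforward. A sketch with hypothetical identifiers:

```bash
# Offload read traffic from the primary (identifiers are placeholders)
aws rds create-db-instance-read-replica \
  --db-instance-identifier mydb-replica-1 \
  --source-db-instance-identifier mydb
```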

Q5: How do I test automatic scaling without actually scaling?

Answer: Use test mode in Lambda or create a separate test alarm with very low threshold.

Method 1: Test Mode in Lambda

```javascript
if (process.env.TEST_MODE === 'true') {
  console.log(`Would scale from ${currentType} to ${newType}`);
  return { statusCode: 200, body: 'Test mode - no scaling' };
}
```

Method 2: Test Alarm with Low Threshold

```bash
# Create test alarm that triggers easily (scoped to one instance)
aws cloudwatch put-metric-alarm \
  --alarm-name "Test-Scaling-Alarm" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 60 \
  --threshold 10 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-xxx \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:lambda:REGION:ACCOUNT:function:test-scale-function
```

Q6: Can I scale down automatically when usage drops?

Answer: Yes, create a separate alarm with lower threshold for scale-down.

Example Setup:

```bash
# Scale-down alarm (CPU < 40% for 30 minutes)
aws cloudwatch put-metric-alarm \
  --alarm-name "Low-CPU-ScaleDown" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 40 \
  --comparison-operator LessThanThreshold \
  --dimensions Name=InstanceId,Value=i-xxx \
  --evaluation-periods 6 \
  --alarm-actions arn:aws:lambda:REGION:ACCOUNT:function:scale-down-instance
```

Scale-Down Lambda Function:

```javascript
// m5.large is the smallest size in the m5 family, so it's the floor here
// (there is no m5.medium)
const SCALE_DOWN_PATH = {
  'm5.2xlarge': 'm5.xlarge',
  'm5.xlarge': 'm5.large'
};

// Check minimum size (and cooldown, as in Q3)
const newType = SCALE_DOWN_PATH[currentType];
if (!newType) {
  return { statusCode: 200, body: 'At minimum' };
}

// Perform scale-down (same stop → modify → start flow as scale-up)
```
Warning: Be careful with automatic scale-down. Ensure your application can handle the smaller instance size. Test thoroughly in staging first.

Q7: How do I monitor costs when using automatic scaling?

Answer: Set up AWS Budgets and CloudWatch billing alarms.

Step 1: Create Billing Alarm

```bash
# Enable billing alerts (one-time setup; billing metrics live in us-east-1
# and require the Currency dimension)
aws cloudwatch put-metric-alarm \
  --region us-east-1 \
  --alarm-name "MonthlyCostAlert" \
  --metric-name EstimatedCharges \
  --namespace AWS/Billing \
  --statistic Maximum \
  --period 86400 \
  --threshold 500 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=Currency,Value=USD \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:REGION:ACCOUNT:billing-alerts
```

Step 2: Track Instance Type Changes

```javascript
// Log scaling events to CloudWatch Logs
await cloudwatchLogs.putLogEvents({
  logGroupName: '/aws/scaling/events',
  logStreamName: instanceId,
  logEvents: [{
    timestamp: Date.now(),
    message: JSON.stringify({ instanceId, oldType, newType, cost })
  }]
}).promise();
```

Step 3: Use AWS Cost Explorer

  • Go to AWS Console → Cost Explorer
  • Filter by service: EC2
  • Group by: Instance Type
  • Set date range to track costs over time
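If you’d rather script it than click through the console, the Cost Explorer API exposes the same breakdown. A sketch grouping one month of EC2 spend by instance type (dates are placeholders):

```bash
aws ce get-cost-and-usage \
  --time-period Start=2024-01-01,End=2024-02-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --filter '{"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Elastic Compute Cloud - Compute"]}}' \
  --group-by Type=DIMENSION,Key=INSTANCE_TYPE
```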

Q8: What’s the difference between scaling EC2 vs RDS vertically?

Answer: Similar process but different considerations for downtime and data.

| Aspect | EC2 | RDS |
|---|---|---|
| Downtime | 2-5 minutes (must stop) | 30 seconds - 5 min (depends on Multi-AZ) |
| Multi-AZ | Not applicable | Minimal downtime (~30 sec) |
| Data safety | Create snapshot manually | Automated backups |
| Scaling time | Faster (2-5 min) | Slower (5-30 min for large DBs) |
| Rollback | Easy (change type back) | More complex (restore from backup) |

RDS Scaling Example:

```bash
# Scale RDS instance (Multi-AZ for minimal downtime)
aws rds modify-db-instance \
  --db-instance-identifier mydb \
  --db-instance-class db.m5.xlarge \
  --apply-immediately \
  --multi-az

# Monitor progress
aws rds describe-db-instances \
  --db-instance-identifier mydb \
  --query 'DBInstances[0].DBInstanceStatus'
```

Q9: How do I handle application state during scaling?

Answer: Design your application to be stateless or use external state storage.

Problem: In-memory sessions, caches, and state are lost when instance stops.

Solutions:

  1. Use External Session Storage:

```javascript
// Store sessions in Redis (external)
app.use(session({
  store: new RedisStore({ host: process.env.REDIS_HOST }),
  secret: 'your-secret'
}));
```

  2. Use Database for State:

```javascript
// Store state in database
await db.query(
  'INSERT INTO app_state (key, value) VALUES ($1, $2)',
  ['current_state', JSON.stringify(state)]
);
```

  3. Use EBS for Persistent Data:

```bash
# Mount EBS volume for persistent storage
# (format it first if the volume is new: sudo mkfs -t xfs /dev/xvdf)
sudo mkdir /data
sudo mount /dev/xvdf /data

# Configure application to use /data
export DATA_DIR=/data
```

Q10: What happens if scaling fails mid-process?

Answer: Implement rollback mechanism and health checks.

Failure Scenarios:

  • Instance fails to stop
  • Instance fails to start after modification
  • Application fails to start on new instance
  • Network connectivity issues

Rollback Strategy:

```javascript
let originalType; // kept outside try so the catch block can reference it

try {
  // 1. Store original state
  originalType = await getCurrentInstanceType(instanceId);

  // 2. Perform scaling
  await scaleInstance(instanceId, newType);

  // 3. Health check
  const healthCheck = await checkApplicationHealth(instanceId);
  if (!healthCheck) throw new Error('Health check failed');
} catch (error) {
  // Automatic rollback
  await ec2.modifyInstanceAttribute({
    InstanceId: instanceId,
    InstanceType: { Value: originalType }
  }).promise();

  await sns.publish({
    TopicArn: snsTopic,
    Subject: 'Rollback Complete',
    Message: `Rolled back to ${originalType}`
  }).promise();
}
```
