Lesson 4 of 15 | 4 min | Advanced Track

Designing a Production-Grade AWS VPC from Scratch

Build a multi-availability-zone Virtual Private Cloud (VPC) with segregated public, private, and database subnets using Terraform.

Key Takeaways

  • A production VPC requires strict traffic segmentation using Public, Private, and Database subnet groups.
  • Distribute resources across at least two Availability Zones (AZs), ideally three, for fault tolerance and high availability.
  • Database subnets must never contain public IP allocations or have direct route pathways to an Internet Gateway.
Recommended Prerequisites
terraform-aws-02-remote-state-bootstrap

A Virtual Private Cloud (VPC) is the logical network boundary for all your cloud resources. AWS provides a "Default VPC" in each Region of an account. However, every default subnet in that VPC is public: it routes to an Internet Gateway, and instances launched into it receive a public IPv4 address by default. That leaves no network-level isolation between the internet and your workloads, which is a serious security risk in production.

A production-grade architecture must implement Defense in Depth by segmenting the network into logical, isolated layers across multiple Availability Zones (AZs).

+-----------------------------------------------------------------------------------+
| AWS Cloud Region (us-east-1)                                                      |
| VPC: 10.0.0.0/16                                                                  |
|                                                                                   |
|    +--------------------------+          +--------------------------+             |
|    | Availability Zone A      |          | Availability Zone B      |             |
|    |                          |          |                          |             |
|    |  Public Subnet (10.0.1)  |          |  Public Subnet (10.0.2)  |  <- ALB     |
|    |                          |          |                          |             |
|    |  Private Subnet (10.0.11)|          |  Private Subnet (10.0.12)|  <- App     |
|    |                          |          |                          |             |
|    |  DB Subnet (10.0.21)     |          |  DB Subnet (10.0.22)     |  <- RDS     |
|    +--------------------------+          +--------------------------+             |
|                                                                                   |
+-----------------------------------------------------------------------------------+
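
The address plan in the diagram can be derived from the VPC block rather than typed by hand. The following is a minimal sketch using Terraform's built-in cidrsubnet function. The local names (tier_offsets, subnet_cidrs) are illustrative assumptions and are not part of the module built in this lesson.

```hcl
# Sketch: derive the diagram's /24 subnets from the /16 VPC block.
# All local names here are illustrative assumptions.
locals {
  vpc_cidr = "10.0.0.0/16"
  azs      = ["us-east-1a", "us-east-1b"]

  # First subnet number used by each tier, matching the diagram.
  tier_offsets = {
    public   = 1
    private  = 11
    database = 21
  }

  # cidrsubnet(prefix, newbits, netnum): adding 8 bits to a /16 yields /24 blocks.
  subnet_cidrs = {
    for tier, offset in local.tier_offsets :
    tier => [for i in range(length(local.azs)) : cidrsubnet(local.vpc_cidr, 8, offset + i)]
  }
}

output "subnet_cidrs" {
  description = "Map of tier name to the list of subnet CIDRs, one per AZ"
  value       = local.subnet_cidrs
}
```

With these inputs, the public tier evaluates to 10.0.1.0/24 and 10.0.2.0/24, the private tier to 10.0.11.0/24 and 10.0.12.0/24, and the database tier to 10.0.21.0/24 and 10.0.22.0/24.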

Understanding Subnet Segregation

  1. Public Subnet Layer
    • Hosts public-facing Application Load Balancers (ALBs) and NAT Gateways.
    • Has a route to an Internet Gateway (IGW).
    • Assigns public IP addresses on launch.
  2. Private Subnet Layer
    • Hosts stateless application container workloads (ECS Fargate/EKS) and background workers.
    • Routes outbound traffic to the public internet through a NAT Gateway in the public subnet.
    • Never assigns public IPs.
  3. Database / Isolated Subnet Layer
    • Hosts transactional databases (RDS) and caching clusters (ElastiCache).
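    • Has no route to an Internet Gateway or NAT Gateway, so there is no inbound or outbound internet access.
    • Accepts connections only from the private application layer. Security Groups enforce this restriction, because the VPC's local route still lets every subnet reach the others.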
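
Subnet placement alone does not restrict the database layer to application traffic, so the rule is usually expressed with Security Groups. The sketch below shows one way to do that, assuming it sits in the same module as the aws_vpc.main resource and the environment variable defined later in this lesson. The resource names (app, db) and the PostgreSQL port are assumptions made for illustration.

```hcl
# Sketch only: resource names and the port are assumptions,
# not part of the lesson's module.
resource "aws_security_group" "app" {
  name_prefix = "${var.environment}-app-"
  description = "Application tier running in the private subnets"
  vpc_id      = aws_vpc.main.id

  # Outbound is left open here for brevity; tighten it for real workloads.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "db" {
  name_prefix = "${var.environment}-db-"
  description = "Database tier, reachable only from the application tier"
  vpc_id      = aws_vpc.main.id

  # Only members of the app security group may connect.
  ingress {
    description     = "PostgreSQL from the app tier (port is an assumption)"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  # No egress block: Terraform removes the default allow-all egress rule,
  # so this group cannot initiate outbound connections.
}
```

Referencing the application security group instead of a CIDR range means the rule keeps working when application subnets are added or resized.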

Hands-on: Provisioning the VPC Topology

Let's write a reusable, custom Terraform module to provision this network. Create a folder modules/vpc/ and add the following files:

# modules/vpc/variables.tf

variable "vpc_cidr" {
  description = "Base CIDR block for the VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
}

variable "availability_zones" {
  description = "List of Availability Zones to provision subnets in"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b"]
}
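
Input validation can catch a malformed CIDR or a single-AZ deployment at plan time. The following is a sketch of how two of the declarations above might look with validation blocks added. It is a variant of those declarations, not an addition: a module cannot declare the same variable twice.

```hcl
# modules/vpc/variables.tf (variant with validation; a sketch)

variable "vpc_cidr" {
  description = "Base CIDR block for the VPC"
  type        = string
  default     = "10.0.0.0/16"

  validation {
    # cidrhost() errors on a malformed CIDR; can() turns that error into false.
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "The vpc_cidr value must be a valid CIDR block, such as 10.0.0.0/16."
  }
}

variable "availability_zones" {
  description = "List of Availability Zones to provision subnets in"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b"]

  validation {
    # A single AZ would defeat the fault-tolerance goal of this design.
    condition     = length(var.availability_zones) >= 2
    error_message = "Provide at least two Availability Zones for fault tolerance."
  }
}
```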

Now, implement the main networking resources:

# modules/vpc/main.tf

# 1. The Core VPC
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name        = "${var.environment}-vpc"
    Environment = var.environment
  }
}

# 2. Internet Gateway for Public Routing
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name        = "${var.environment}-igw"
    Environment = var.environment
  }
}

# 3. Public Subnets
resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index + 1) # 10.0.1.0/24, 10.0.2.0/24 for the default /16
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true # Instances receive public IP automatically

  tags = {
    Name        = "${var.environment}-public-subnet-${var.availability_zones[count.index]}"
    Environment = var.environment
  }
}

# 4. Private Subnets
resource "aws_subnet" "private" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index + 11) # 10.0.11.0/24, 10.0.12.0/24 for the default /16
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = false

  tags = {
    Name        = "${var.environment}-private-subnet-${var.availability_zones[count.index]}"
    Environment = var.environment
  }
}

# 5. Database Subnets (Isolated)
resource "aws_subnet" "database" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index + 21) # 10.0.21.0/24, 10.0.22.0/24 for the default /16
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = false

  tags = {
    Name        = "${var.environment}-db-subnet-${var.availability_zones[count.index]}"
    Environment = var.environment
  }
}

# RDS database subnet group
resource "aws_db_subnet_group" "rds" {
  name        = "${var.environment}-rds-subnet-group"
  description = "Subnet group for RDS cluster"
  subnet_ids  = aws_subnet.database[*].id

  tags = {
    Name        = "${var.environment}-rds-subnet-group"
    Environment = var.environment
  }
}

Now, expose the necessary identifiers using outputs:

# modules/vpc/outputs.tf

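output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

output "public_subnet_ids" {
  description = "IDs of the public subnets, one per Availability Zone"
  value       = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets, one per Availability Zone"
  value       = aws_subnet.private[*].id
}

output "db_subnet_ids" {
  description = "IDs of the database subnets, one per Availability Zone"
  value       = aws_subnet.database[*].id
}

output "db_subnet_group_name" {
  description = "Name of the RDS subnet group spanning the database subnets"
  value       = aws_db_subnet_group.rds.name
}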
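
To show how these outputs are consumed, here is a minimal sketch of a root configuration that calls the module. The file path, the relative source path, and the region are assumptions made for illustration. Adjust them to your repository layout.

```hcl
# environments/dev/main.tf (hypothetical path)

provider "aws" {
  region = "us-east-1" # assumed to match the module's default AZs
}

module "vpc" {
  source = "../../modules/vpc" # relative path is an assumption

  environment        = "dev"
  vpc_cidr           = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b"]
}

# Module outputs are addressed as module.<name>.<output>.
output "private_subnet_ids" {
  description = "Private subnet IDs re-exported from the VPC module"
  value       = module.vpc.private_subnet_ids
}
```

Other configuration in the same root can reference module.vpc.db_subnet_group_name when creating an RDS instance, so the database lands in the isolated subnets.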

Configuring Public Route Tables

For the public subnets to actually be public, their route table must send traffic destined for 0.0.0.0/0 (any address outside the VPC) to the Internet Gateway:

# modules/vpc/main.tf (continued)

# Public Route Table
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }

  tags = {
    Name        = "${var.environment}-public-rt"
    Environment = var.environment
  }
}

# Associate Public Route Table to Public Subnets
resource "aws_route_table_association" "public" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}
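
The same pattern can make the database layer's isolation explicit. Subnets without an explicit association fall back to the VPC's main route table, so a default route added there later would silently apply to them. The sketch below gives the database subnets a dedicated table with no internet route. The resource name database is an assumption and does not appear in the original module.

```hcl
# modules/vpc/main.tf (optional addition; a sketch)

# Dedicated route table for the database subnets.
# No route blocks are declared, so only the implicit local route applies.
resource "aws_route_table" "database" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name        = "${var.environment}-db-rt"
    Environment = var.environment
  }
}

# Explicitly associate each database subnet so it never inherits
# routes from the VPC's main route table.
resource "aws_route_table_association" "database" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.database[count.index].id
  route_table_id = aws_route_table.database.id
}
```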

Next Steps

Our VPC now has public, private, and database subnets spread across multiple Availability Zones. However, resources deployed in the private subnets cannot yet fetch code packages, call third-party APIs, or download updates, because those subnets have no route out of the VPC.

In the next lesson, we will provision NAT Gateways to enable secure egress routing for our private subnets and secure port boundaries using stateful Security Groups.
