Securing Subnet Access with NAT Gateways and Security Groups
In our previous lesson, we designed a multi-AZ VPC with segregated public, private, and database subnets. However, the stateless backend applications in the private subnets are currently cut off from the outside world: they cannot download npm or Maven packages, fetch API keys from external providers, or call external payment gateways such as Stripe.
To solve this securely, we will add NAT Gateways for outbound (egress) traffic and design restrictive Security Groups to control inbound traffic.
Understanding NAT Gateways (Network Address Translation)
A NAT Gateway is a managed AWS service that enables instances in a private subnet to connect to the internet or other AWS services, while preventing the internet from initiating a connection directly to those instances.
[ Private Instance: 10.0.11.5 ]
│ (Inbound blocked, outbound allowed)
▼
[ NAT Gateway: 10.0.1.25 (Public Subnet) ]
│ (Translates 10.0.11.5 -> Elastic IP: 54.120.32.4)
▼
[ Internet Gateway ] ──> [ External Internet (e.g. Stripe API) ]
Cost Optimization Architectural Choice:
- Production: Provision one NAT Gateway per Availability Zone for multi-AZ resilience. If one AZ goes down, the private subnets in the other AZs keep their own working gateway.
- Development/Staging: Provision a single NAT Gateway shared by all private subnets. NAT Gateways are billed per hour plus a per-GB data processing fee, so each gateway you leave out saves roughly $35/month in hourly charges. The trade-off is that the shared gateway's AZ becomes a single point of failure for outbound traffic.
Step 1: Provisioning the NAT Gateway & Elastic IP
Let's expand our VPC module to include the NAT Gateways:
# modules/vpc/main.tf (continued)

# Allocate Elastic IP (EIP) for NAT Gateways
resource "aws_eip" "nat" {
  count  = var.environment == "prod" ? length(var.availability_zones) : 1
  domain = "vpc"

  tags = {
    Name        = "${var.environment}-nat-eip-${count.index}"
    Environment = var.environment
  }
}

# Create the NAT Gateways in Public Subnets
resource "aws_nat_gateway" "nat" {
  count         = var.environment == "prod" ? length(var.availability_zones) : 1
  allocation_id = aws_eip.nat[count.index].id

  # NAT Gateway must be placed in a PUBLIC subnet
  subnet_id = aws_subnet.public[count.index].id

  tags = {
    Name        = "${var.environment}-nat-gw-${count.index}"
    Environment = var.environment
  }

  depends_on = [aws_internet_gateway.igw]
}
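Because all outbound traffic from the private subnets leaves through these Elastic IPs, it can be handy to expose them from the module, for example when a third party restricts access by source IP. A minimal sketch, assuming the module keeps its outputs in a separate file; the file name and the output name `nat_public_ips` are illustrative and not part of the original module:

```hcl
# modules/vpc/outputs.tf (hypothetical file name)

# Public addresses that external services will see as the source of
# traffic from the private subnets: one entry per NAT Gateway.
output "nat_public_ips" {
  description = "Elastic IPs attached to the NAT Gateways"
  value       = aws_eip.nat[*].public_ip
}
```

In prod this list has one address per Availability Zone; in dev/staging it has a single entry.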
Step 2: Configuring Route Tables for Private Subnets
Now we must route outbound private traffic (0.0.0.0/0) through our NAT Gateways:
# Create Private Route Tables
resource "aws_route_table" "private" {
  count  = length(var.availability_zones)
  vpc_id = aws_vpc.main.id

  # In Dev/Staging, route all AZs to the single NAT Gateway [0]
  # In Prod, route each AZ to its corresponding NAT Gateway [count.index]
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = var.environment == "prod" ? aws_nat_gateway.nat[count.index].id : aws_nat_gateway.nat[0].id
  }

  tags = {
    Name        = "${var.environment}-private-rt-${var.availability_zones[count.index]}"
    Environment = var.environment
  }
}

# Associate Private Route Table to Private Subnets
resource "aws_route_table_association" "private" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
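As an alternative to the inline route block, the default route can be written as a standalone `aws_route` resource, which keeps the environment check out of the route table itself. This is only a sketch: the resource name `private_default` is hypothetical, and it would replace the inline `route { ... }` block above rather than sit alongside it, since the provider documentation warns against defining the same table's routes both inline and as separate resources.

```hcl
# Sketch only: use INSTEAD of the inline route block in aws_route_table.private.
resource "aws_route" "private_default" {
  count                  = length(var.availability_zones)
  route_table_id         = aws_route_table.private[count.index].id
  destination_cidr_block = "0.0.0.0/0"

  # Clamp the index to the last existing NAT Gateway:
  # - prod: one gateway per AZ, so the index equals count.index
  # - dev/staging: a single gateway, so the index is always 0
  nat_gateway_id = aws_nat_gateway.nat[min(count.index, length(aws_nat_gateway.nat) - 1)].id
}
```

The `min(...)` expression gives the same result as the conditional in the original code, but it depends only on how many gateways exist, not on the environment name.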
Step 3: Security Groups vs. NACLs
AWS provides two layers of firewall protection:
- Network ACLs (NACLs): Stateless, applied at the subnet boundary. Rules are evaluated in ascending rule-number order and the first match wins. Because NACLs do not track connections, return traffic must be allowed explicitly, so both inbound and outbound rules (including ephemeral port ranges) have to be configured by hand.
- Security Groups: Stateful, applied at the resource's network interface (ENI). If an inbound request is allowed on port 80, the response is automatically allowed out, regardless of the outbound rules.
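To make the stateless behaviour concrete, here is a rough sketch of what a NACL for the private subnets has to spell out just to permit outbound HTTPS. It assumes the `aws_vpc.main` and `aws_subnet.private` resources from the VPC module; the resource name `private_example` is hypothetical. It is deliberately incomplete (it does not cover ALB-to-app or app-to-database traffic), so treat it as an illustration rather than something to apply as written.

```hcl
# Illustration only: a NACL needs a rule for EACH direction of a connection.
resource "aws_network_acl" "private_example" {
  vpc_id     = aws_vpc.main.id
  subnet_ids = aws_subnet.private[*].id

  # Outbound: instances open HTTPS connections to the internet.
  egress {
    rule_no    = 100
    action     = "allow"
    protocol   = "tcp"
    cidr_block = "0.0.0.0/0"
    from_port  = 443
    to_port    = 443
  }

  # Inbound: the replies come back to ephemeral ports and are dropped
  # unless this range is explicitly allowed.
  ingress {
    rule_no    = 100
    action     = "allow"
    protocol   = "tcp"
    cidr_block = "0.0.0.0/0"
    from_port  = 1024
    to_port    = 65535
  }
}
```

With a Security Group, only the outbound rule would be needed, because the reply is matched to the tracked connection automatically.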
Best Practice: Layered Security Groups
We will define strict, stateful Security Groups that chain access between the Application Load Balancer, the Application containers, and the RDS database.
graph LR
User([User Request]) -->|Port 443| SG_ALB[ALB Security Group]
SG_ALB -->|Port 8080| SG_App[Application Container SG]
SG_App -->|Port 5432| SG_DB[RDS Database SG]
Let's express this layout in Terraform:
# main.tf

# 1. Security Group for Public ALB
resource "aws_security_group" "alb" {
  name        = "${var.environment}-alb-sg"
  description = "Allows public inbound traffic to Load Balancer"
  vpc_id      = var.vpc_id

  # Allow HTTP
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow HTTPS
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# 2. Security Group for stateless microservice containers
resource "aws_security_group" "app" {
  name        = "${var.environment}-app-sg"
  description = "Allows traffic from ALB Security Group only"
  vpc_id      = var.vpc_id

  ingress {
    from_port       = 8080 # App listening port
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id] # Reference ALB Security Group id
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# 3. Security Group for database
resource "aws_security_group" "database" {
  name        = "${var.environment}-db-sg"
  description = "Allows traffic from App Security Group only"
  vpc_id      = var.vpc_id

  ingress {
    from_port       = 5432 # Postgres port
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id] # Only App can talk to DB
  }
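
  # Block all egress from database: an explicit empty list tells Terraform
  # to manage zero outbound rules (an "allow all" egress block here would do
  # the opposite). Replies to the app still flow because Security Groups are
  # stateful.
  egress = []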
}
By referencing other groups through security_groups instead of CIDR blocks, we ensure that even if an attacker gains control of a server in the public subnet, they cannot connect to the database. The database accepts connections on port 5432 only from network interfaces that belong to the aws_security_group.app group, no matter which IP addresses those instances happen to have.
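The same chaining idea can be applied in the outbound direction, for instance to limit the ALB to talking only to the application containers. Doing this with inline blocks would create a dependency cycle (the ALB group would reference the app group, which already references the ALB group), so the sketch below uses a standalone rule resource instead. It assumes a recent AWS provider version that includes `aws_vpc_security_group_egress_rule`; the resource name `alb_to_app` is hypothetical.

```hcl
# Sketch only: replaces the inline "allow all" egress block on
# aws_security_group.alb. The provider documentation advises against mixing
# inline rules and standalone rule resources on the same group, so the ALB's
# ingress rules would need to move to aws_vpc_security_group_ingress_rule
# resources as well.
resource "aws_vpc_security_group_egress_rule" "alb_to_app" {
  security_group_id            = aws_security_group.alb.id
  referenced_security_group_id = aws_security_group.app.id
  description                  = "ALB may only reach the app containers"
  ip_protocol                  = "tcp"
  from_port                    = 8080
  to_port                      = 8080
}
```

Because the rule lives outside both group definitions, Terraform can create the two groups first and attach the rule afterwards, which avoids the cycle.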
Next Steps
Our network architecture is robustly designed and secured. Now we must turn our attention to identity management. Before we can spin up applications that write to S3 buckets, publish events to SQS queues, or fetch secrets from Secrets Manager, we need to understand how AWS handles authentication and authorization.
In the next lesson, we will cover IAM Least Privilege, AssumeRole mechanics, and setting up secure OpenID Connect (OIDC) identities for our GitOps pipelines.