Introduction

Infrastructure as Code (IaC) has changed how teams manage cloud infrastructure. Terraform, combined with AWS, lets you create, modify, and version infrastructure in a declarative and reproducible way.

Why Terraform + AWS?

Terraform Advantages

  • Multi-cloud - Support for multiple providers
  • Declarative - You describe the desired state
  • Planning - Preview changes before applying them
  • State Management - Centralized state tracking
  • Modularity - Code reuse through modules

Benefits on AWS

  • 🚀 Scalability - Infrastructure that grows with demand
  • 🔒 Security - Built-in security controls
  • 💰 Cost efficiency - Pay-as-you-go pricing and resources sized to demand
  • 🔄 Automation - Automated deployment and management

Terraform Project Structure

Directory Layout

terraform-aws-infrastructure/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   └── production/
├── modules/
│   ├── vpc/
│   ├── ec2/
│   ├── rds/
│   ├── s3/
│   └── security-groups/
├── shared/
│   ├── backend.tf
│   └── providers.tf
└── scripts/
    ├── deploy.sh
    └── destroy.sh

Base Configuration

Provider Configuration

# providers.tf
terraform {
  required_version = ">= 1.0"
  
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.1"
    }
  }
  
  backend "s3" {
    bucket         = "terraform-state-bucket"
    key            = "infrastructure/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

provider "aws" {
  region = var.aws_region
  
  default_tags {
    tags = {
      Environment   = var.environment
      Project       = var.project_name
      ManagedBy     = "Terraform"
      Owner         = var.owner
      CostCenter    = var.cost_center
    }
  }
}

Variables Configuration

# variables.tf
variable "aws_region" {
  description = "AWS region for resources"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Environment name"
  type        = string
  validation {
    condition = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be dev, staging, or production."
  }
}

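variable "project_name" {
  description = "Name of the project"
  type        = string
}

# Referenced by default_tags in providers.tf
variable "owner" {
  description = "Team or person responsible for the resources"
  type        = string
}

variable "cost_center" {
  description = "Cost center used for billing allocation"
  type        = string
}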

variable "vpc_cidr" {
  description = "CIDR block for VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "List of availability zones"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b", "us-east-1c"]
}
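
Variables without a default have to be supplied for each environment. A minimal sketch of what environments/dev/terraform.tfvars might contain is shown below. All values are placeholders, and `owner` and `cost_center` are assumed to be declared as string variables because the provider's `default_tags` block references them.

```hcl
# environments/dev/terraform.tfvars (illustrative values only)
aws_region   = "us-east-1"
environment  = "dev"           # must satisfy the validation rule in variables.tf
project_name = "example-app"   # placeholder
owner        = "platform-team" # placeholder; assumes variable "owner" is declared
cost_center  = "cc-0000"       # placeholder; assumes variable "cost_center" is declared

vpc_cidr           = "10.0.0.0/16"
availability_zones = ["us-east-1a", "us-east-1b"]
```

Terraform loads a file named terraform.tfvars from the working directory automatically, so no `-var-file` flag is strictly needed for it.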

Reusable Terraform Modules

VPC Module

# modules/vpc/main.tf
resource "aws_vpc" "main" {
  cidr_block           = var.cidr_block
  enable_dns_hostnames = true
  enable_dns_support   = true
  
  tags = {
    Name = "${var.name}-vpc"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
  
  tags = {
    Name = "${var.name}-igw"
  }
}

resource "aws_subnet" "public" {
  count = length(var.public_subnets)
  
  vpc_id                  = aws_vpc.main.id
  cidr_block              = var.public_subnets[count.index]
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true
  
  tags = {
    Name = "${var.name}-public-${count.index + 1}"
    Type = "Public"
  }
}

resource "aws_subnet" "private" {
  count = length(var.private_subnets)
  
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.private_subnets[count.index]
  availability_zone = var.availability_zones[count.index]
  
  tags = {
    Name = "${var.name}-private-${count.index + 1}"
    Type = "Private"
  }
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
  
  tags = {
    Name = "${var.name}-public-rt"
  }
}

resource "aws_route_table_association" "public" {
  count = length(aws_subnet.public)
  
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# NAT Gateways for the private subnets (one per public subnet)
resource "aws_eip" "nat" {
  count = var.enable_nat_gateway ? length(var.public_subnets) : 0
  
  domain = "vpc"
  
  tags = {
    Name = "${var.name}-nat-eip-${count.index + 1}"
  }
  
  depends_on = [aws_internet_gateway.main]
}

resource "aws_nat_gateway" "main" {
  count = var.enable_nat_gateway ? length(var.public_subnets) : 0
  
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
  
  tags = {
    Name = "${var.name}-nat-${count.index + 1}"
  }
}

resource "aws_route_table" "private" {
  count = var.enable_nat_gateway ? length(var.private_subnets) : 0
  
  vpc_id = aws_vpc.main.id
  
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }
  
  tags = {
    Name = "${var.name}-private-rt-${count.index + 1}"
  }
}

resource "aws_route_table_association" "private" {
  count = var.enable_nat_gateway ? length(aws_subnet.private) : 0
  
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
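
The module above reads several input variables, and the environment configurations later in the article consume `module.vpc.vpc_id`, `module.vpc.public_subnet_ids`, and `module.vpc.private_subnet_ids`. The article does not show those declarations, so the following is one possible sketch of them; the descriptions and default values are assumptions.

```hcl
# modules/vpc/variables.tf (sketch)
variable "name" {
  description = "Prefix used in the Name tag of every resource"
  type        = string
}

variable "cidr_block" {
  description = "CIDR block for the VPC"
  type        = string
}

variable "availability_zones" {
  description = "Availability zones, indexed in step with the subnet lists"
  type        = list(string)
}

variable "public_subnets" {
  description = "CIDR blocks for the public subnets"
  type        = list(string)
  default     = []
}

variable "private_subnets" {
  description = "CIDR blocks for the private subnets"
  type        = list(string)
  default     = []
}

variable "enable_nat_gateway" {
  description = "Create one NAT gateway per public subnet"
  type        = bool
  default     = false
}

# modules/vpc/outputs.tf (sketch)
output "vpc_id" {
  value = aws_vpc.main.id
}

output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}
```

Because the private route tables index into the NAT gateways by position, this layout works as written only when the public and private subnet lists have the same length.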

Security Groups Module

# modules/security-groups/main.tf
resource "aws_security_group" "web" {
  name_prefix = "${var.name}-web-"
  vpc_id      = var.vpc_id
  description = "Security group for web servers"
  
  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  ingress {
    description = "HTTPS"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  tags = {
    Name = "${var.name}-web-sg"
  }
  
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_security_group" "database" {
  name_prefix = "${var.name}-db-"
  vpc_id      = var.vpc_id
  description = "Security group for database servers"
  
  ingress {
    description     = "MySQL/Aurora"
    from_port       = 3306
    to_port         = 3306
    protocol        = "tcp"
    security_groups = [aws_security_group.web.id]
  }
  
  ingress {
    description     = "PostgreSQL"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.web.id]
  }
  
  tags = {
    Name = "${var.name}-db-sg"
  }
  
  lifecycle {
    create_before_destroy = true
  }
}
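
The environments reference `module.security_groups.web_sg_id` and `module.security_groups.database_sg_id`, so the module needs matching outputs. A minimal sketch of the inputs and outputs, assuming they sit next to main.tf in the same module:

```hcl
# modules/security-groups/variables.tf (sketch)
variable "name" {
  description = "Prefix for security group names and tags"
  type        = string
}

variable "vpc_id" {
  description = "ID of the VPC in which the groups are created"
  type        = string
}

# modules/security-groups/outputs.tf (sketch)
output "web_sg_id" {
  value = aws_security_group.web.id
}

output "database_sg_id" {
  value = aws_security_group.database.id
}
```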

EC2 Module with Auto Scaling

# modules/ec2/main.tf
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]
  
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

resource "aws_launch_template" "web" {
  name_prefix   = "${var.name}-web-"
  image_id      = data.aws_ami.amazon_linux.id
  instance_type = var.instance_type
  
  vpc_security_group_ids = var.security_group_ids
  
  user_data = base64encode(templatefile("${path.module}/user_data.sh", {
    environment = var.environment
  }))
  
  iam_instance_profile {
    name = aws_iam_instance_profile.web.name
  }
  
  block_device_mappings {
    device_name = "/dev/xvda"
    ebs {
      volume_size = var.root_volume_size
      volume_type = "gp3"
      encrypted   = true
    }
  }
  
  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "${var.name}-web-instance"
    }
  }
  
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "web" {
  name                      = "${var.name}-web-asg"
  vpc_zone_identifier       = var.subnet_ids
  target_group_arns         = var.target_group_arns
  health_check_type         = "ELB"
  health_check_grace_period = 300
  
  min_size         = var.min_size
  max_size         = var.max_size
  desired_capacity = var.desired_capacity
  
  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
  
  tag {
    key                 = "Name"
    value               = "${var.name}-web-asg"
    propagate_at_launch = false
  }
  
  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
  }
}

# IAM role for the EC2 instances
resource "aws_iam_role" "web" {
  name = "${var.name}-web-role"
  
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_instance_profile" "web" {
  name = "${var.name}-web-profile"
  role = aws_iam_role.web.name
}

resource "aws_iam_role_policy_attachment" "web_ssm" {
  role       = aws_iam_role.web.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
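
The environment configurations call this module without passing `target_group_arns` or `root_volume_size`, so both need defaults for those calls to be valid. One possible variables.tf is sketched below; the default values and the `asg_name` output are assumptions, not part of the original module.

```hcl
# modules/ec2/variables.tf (sketch)
variable "name" {
  type = string
}

variable "environment" {
  type = string
}

variable "instance_type" {
  type    = string
  default = "t3.micro"
}

variable "subnet_ids" {
  description = "Subnets in which the Auto Scaling group launches instances"
  type        = list(string)
}

variable "security_group_ids" {
  type = list(string)
}

variable "target_group_arns" {
  description = "Load balancer target groups to register with (optional)"
  type        = list(string)
  default     = []
}

variable "root_volume_size" {
  description = "Root EBS volume size in GiB (assumed default)"
  type        = number
  default     = 20
}

variable "min_size" {
  type = number
}

variable "max_size" {
  type = number
}

variable "desired_capacity" {
  type = number
}

# modules/ec2/outputs.tf (hypothetical output, useful for a monitoring module)
output "asg_name" {
  value = aws_autoscaling_group.web.name
}
```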

Environment Implementation

Development Environment

# environments/dev/main.tf
module "vpc" {
  source = "../../modules/vpc"
  
  name               = "${var.project_name}-${var.environment}"
  cidr_block         = var.vpc_cidr
  availability_zones = var.availability_zones
  public_subnets     = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnets    = ["10.0.10.0/24", "10.0.20.0/24"]
  enable_nat_gateway = false  # Cost savings in dev
}

module "security_groups" {
  source = "../../modules/security-groups"
  
  name   = "${var.project_name}-${var.environment}"
  vpc_id = module.vpc.vpc_id
}

module "web_servers" {
  source = "../../modules/ec2"
  
  name               = "${var.project_name}-${var.environment}"
  environment        = var.environment
  instance_type      = "t3.micro"
  min_size           = 1
  max_size           = 2
  desired_capacity   = 1
  subnet_ids         = module.vpc.public_subnet_ids
  security_group_ids = [module.security_groups.web_sg_id]
}

Production Environment

# environments/production/main.tf
module "vpc" {
  source = "../../modules/vpc"
  
  name               = "${var.project_name}-${var.environment}"
  cidr_block         = var.vpc_cidr
  availability_zones = var.availability_zones
  public_subnets     = ["10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]
  private_subnets    = ["10.1.10.0/24", "10.1.20.0/24", "10.1.30.0/24"]
  enable_nat_gateway = true
}

module "security_groups" {
  source = "../../modules/security-groups"
  
  name   = "${var.project_name}-${var.environment}"
  vpc_id = module.vpc.vpc_id
}

module "web_servers" {
  source = "../../modules/ec2"
  
  name               = "${var.project_name}-${var.environment}"
  environment        = var.environment
  instance_type      = "t3.medium"
  min_size           = 2
  max_size           = 10
  desired_capacity   = 3
  subnet_ids         = module.vpc.private_subnet_ids
  security_group_ids = [module.security_groups.web_sg_id]
}

module "database" {
  source = "../../modules/rds"
  
  name                = "${var.project_name}-${var.environment}"
  engine              = "mysql"
  engine_version      = "8.0"
  instance_class      = "db.t3.medium"
  allocated_storage   = 100
  subnet_ids          = module.vpc.private_subnet_ids
  security_group_ids  = [module.security_groups.database_sg_id]
  backup_retention    = 7
  multi_az            = true
}
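
The `modules/rds` module called above is not shown in the article. The sketch below is one way it could be written so that it accepts the inputs used in the production call. The `master_username` variable, the resource names, and the snapshot settings are assumptions made for illustration.

```hcl
# modules/rds/main.tf (sketch; not part of the original article)
variable "name" { type = string }
variable "engine" { type = string }
variable "engine_version" { type = string }
variable "instance_class" { type = string }
variable "allocated_storage" { type = number }
variable "subnet_ids" { type = list(string) }
variable "security_group_ids" { type = list(string) }

variable "backup_retention" {
  description = "Days to keep automated backups"
  type        = number
  default     = 7
}

variable "multi_az" {
  type    = bool
  default = false
}

# Hypothetical input: the module call in the article passes no username.
variable "master_username" {
  type    = string
  default = "dbadmin"
}

resource "aws_db_subnet_group" "main" {
  name       = "${var.name}-db-subnets"
  subnet_ids = var.subnet_ids
}

resource "aws_db_instance" "main" {
  identifier     = "${var.name}-db"
  engine         = var.engine
  engine_version = var.engine_version
  instance_class = var.instance_class

  allocated_storage = var.allocated_storage
  storage_encrypted = true

  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = var.security_group_ids
  multi_az               = var.multi_az

  backup_retention_period = var.backup_retention

  # Let RDS generate the master password and keep it in Secrets Manager,
  # so it never has to be written into a tfvars file.
  username                    = var.master_username
  manage_master_user_password = true

  skip_final_snapshot       = false
  final_snapshot_identifier = "${var.name}-db-final"
}
```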

Automation and CI/CD

GitLab CI Pipeline

# .gitlab-ci.yml
stages:
  - validate
  - plan
  - apply
  - destroy

variables:
  TF_ROOT: ${CI_PROJECT_DIR}
  # CI_ENVIRONMENT_NAME is only defined in jobs that declare `environment:`,
  # so the target environment is set explicitly. TF_ENVIRONMENT is a
  # project-defined variable; override it per pipeline as needed.
  TF_ENVIRONMENT: dev
  TF_ADDRESS: ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${TF_ENVIRONMENT}

cache:
  key: "${TF_ROOT}"
  paths:
    - ${TF_ROOT}/environments/${TF_ENVIRONMENT}/.terraform

# The -backend-config values below target GitLab-managed state, which requires
# the root module to declare the "http" backend rather than "s3".
before_script:
  - cd ${TF_ROOT}/environments/${TF_ENVIRONMENT}
  - terraform --version
  - >-
    terraform init
    -backend-config="address=${TF_ADDRESS}"
    -backend-config="lock_address=${TF_ADDRESS}/lock"
    -backend-config="unlock_address=${TF_ADDRESS}/lock"
    -backend-config="username=gitlab-ci-token"
    -backend-config="password=${CI_JOB_TOKEN}"
    -backend-config="lock_method=POST"
    -backend-config="unlock_method=DELETE"
    -backend-config="retry_wait_min=5"

validate:
  stage: validate
  script:
    - terraform validate
    - terraform fmt -check
  only:
    - merge_requests
    - main

plan:
  stage: plan
  script:
    - terraform plan -out="planfile"
  artifacts:
    name: plan
    paths:
      - ${TF_ROOT}/environments/${TF_ENVIRONMENT}/planfile
  only:
    - merge_requests
    - main

apply:
  stage: apply
  script:
    - terraform apply -input=false "planfile"
  dependencies:
    - plan
  when: manual
  only:
    - main
  environment:
    name: ${TF_ENVIRONMENT}

destroy:
  stage: destroy
  script:
    - terraform destroy -auto-approve
  when: manual
  only:
    - main
  environment:
    name: ${TF_ENVIRONMENT}
    action: stop
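
The `-backend-config` arguments passed to `terraform init` in this pipeline (`address`, `lock_address`, `unlock_address`, and so on) are settings of Terraform's `http` backend, which is what GitLab-managed state uses. A minimal sketch of the matching declaration, assuming the environment's root module stores its state in GitLab instead of the S3 backend shown earlier:

```hcl
# environments/<env>/backend.tf (sketch)
terraform {
  # Left empty on purpose: address, credentials, and lock settings are
  # injected by the pipeline through -backend-config at init time.
  backend "http" {}
}
```

A configuration can declare only one backend, so a project would pick either this block or the `backend "s3"` block, not both.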

Automation Scripts

#!/bin/bash
# scripts/deploy.sh

set -euo pipefail

ENVIRONMENT=${1:-dev}
ACTION=${2:-plan}

echo "🚀 Deploying to $ENVIRONMENT environment"

cd "environments/$ENVIRONMENT"

# Initialize Terraform
terraform init

# Validate the configuration
terraform validate

# Format the code
terraform fmt -recursive

case "$ACTION" in
  "plan")
    echo "📋 Planning infrastructure changes..."
    terraform plan -var-file="terraform.tfvars"
    ;;
  "apply")
    echo "🔨 Applying infrastructure changes..."
    terraform plan -var-file="terraform.tfvars" -out=tfplan
    terraform apply tfplan
    rm tfplan
    ;;
  "destroy")
    echo "💥 Destroying infrastructure..."
    terraform plan -destroy -var-file="terraform.tfvars" -out=tfplan
    terraform apply tfplan
    rm tfplan
    ;;
  *)
    echo "❌ Invalid action. Use: plan, apply, or destroy"
    exit 1
    ;;
esac

echo "✅ Operation completed successfully!"

Monitoring and Observability

CloudWatch Integration

# modules/monitoring/main.tf
resource "aws_cloudwatch_dashboard" "main" {
  dashboard_name = "${var.name}-infrastructure"
  
  dashboard_body = jsonencode({
    widgets = [
      {
        type   = "metric"
        x      = 0
        y      = 0
        width  = 12
        height = 6
        
        properties = {
          metrics = [
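            ["AWS/EC2", "CPUUtilization", "AutoScalingGroupName", var.asg_name],
            # The LoadBalancer dimension expects the ALB's ARN suffix
            # (the arn_suffix attribute of aws_lb), not its display name
            ["AWS/ApplicationELB", "TargetResponseTime", "LoadBalancer", var.alb_name]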
          ]
          period = 300
          stat   = "Average"
          region = var.aws_region
          title  = "Infrastructure Metrics"
        }
      }
    ]
  })
}

resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name          = "${var.name}-high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 300
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "This metric monitors ec2 cpu utilization"
  
  dimensions = {
    AutoScalingGroupName = var.asg_name
  }
  
  alarm_actions = [aws_sns_topic.alerts.arn]
}

resource "aws_sns_topic" "alerts" {
  name = "${var.name}-infrastructure-alerts"
}
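
The alarm above publishes to the SNS topic, but nothing is subscribed to that topic yet, so no one would be notified. A minimal sketch of an email subscription follows; `alert_email` is a hypothetical input variable introduced for illustration.

```hcl
# Hypothetical input: address that should receive alarm notifications
variable "alert_email" {
  description = "Email address subscribed to the alerts topic"
  type        = string
}

resource "aws_sns_topic_subscription" "alerts_email" {
  topic_arn = aws_sns_topic.alerts.arn
  protocol  = "email"
  endpoint  = var.alert_email
}
```

Email subscriptions stay pending until the recipient confirms them through the message that SNS sends.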

Security and Compliance

Terraform Security Scanning

# .github/workflows/security-scan.yml
name: Security Scan

on:
  pull_request:
    branches: [main]

jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Run Checkov
        uses: bridgecrewio/checkov-action@master
        with:
          directory: .
          framework: terraform
          output_format: sarif
          output_file_path: reports/results.sarif
          
      - name: Run TFSec
        # Version tag assumed (garbled in the source); pin to a release you have verified
        uses: aquasecurity/tfsec-action@v1.0.0
        with:
          soft_fail: true
          
      - name: Run Terrascan
        uses: accurics/terrascan-action@main
        with:
          iac_type: terraform
          iac_version: v14
          policy_type: aws

State File Security

# Backend configuration with encryption
terraform {
  backend "s3" {
    bucket         = "terraform-state-secure-bucket"
    key            = "infrastructure/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:account:key/key-id"
    dynamodb_table = "terraform-locks"

    # Bucket versioning and default server-side encryption are not backend
    # arguments; configure them on the S3 bucket itself (for example with
    # aws_s3_bucket_versioning and
    # aws_s3_bucket_server_side_encryption_configuration).
  }
}
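
Versioning, default encryption, and public access blocking are properties of the state bucket rather than of the backend block. The sketch below shows how the bucket and lock table might be provisioned in a separate bootstrap configuration, since they must exist before the backend can be initialized. The bucket and table names are taken from the backend block; `state_kms_key_arn` is a hypothetical variable.

```hcl
# bootstrap/main.tf (sketch; applied once, with local state, before the backend exists)
variable "state_kms_key_arn" {
  description = "ARN of the KMS key that encrypts the state (hypothetical input)"
  type        = string
}

resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-state-secure-bucket"

  lifecycle {
    prevent_destroy = true # guard against deleting the state by accident
  }
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = var.state_kms_key_arn
    }
  }
}

resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# The S3 backend's DynamoDB locking expects a string hash key named LockID.
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```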

Best Practices

1. Code Organization

  • ✅ Use reusable modules
  • ✅ Separate environments into directories
  • ✅ Keep files small and focused
  • ✅ Use consistent naming conventions
  • ✅ Document modules and variables

2. State Management

  • ✅ Use a remote backend (S3 + DynamoDB)
  • ✅ Enable versioning on the state bucket
  • ✅ Configure locking to prevent concurrent writes
  • ✅ Encrypt state files
  • ✅ Back up the state regularly

3. Security

  • ✅ Follow the principle of least privilege
  • ✅ Encrypt data in transit and at rest
  • ✅ Implement resource tagging
  • ✅ Use secrets management
  • ✅ Run security scans regularly
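
As an illustration of least privilege, the sketch below grants the web role from the EC2 module read access to the objects of a single bucket and nothing else. It assumes the policy is defined inside that module, where `aws_iam_role.web` and `var.name` exist; `assets_bucket_arn` is a hypothetical variable.

```hcl
# Hypothetical input: ARN of the only bucket the web servers need to read
variable "assets_bucket_arn" {
  type = string
}

data "aws_iam_policy_document" "web_read_assets" {
  statement {
    sid       = "ReadAssetsOnly"
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    resources = ["${var.assets_bucket_arn}/*"]
  }
}

resource "aws_iam_role_policy" "web_read_assets" {
  name   = "${var.name}-web-read-assets"
  role   = aws_iam_role.web.id
  policy = data.aws_iam_policy_document.web_read_assets.json
}
```

Scoping the statement to one action and one resource pattern keeps the role's permissions easy to audit and limits the impact of a compromised instance.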

4. Performance and Costs

  • ✅ Use data sources for existing resources
  • ✅ Implement lifecycle rules
  • ✅ Monitor costs with tags
  • ✅ Use spot instances where appropriate
  • ✅ Optimize storage classes
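
Lifecycle rules and storage class optimization can be combined in one resource. The following sketch moves objects to cheaper storage classes as they age and eventually expires them. The `logs_bucket_id` variable and the 30/90/365-day thresholds are assumptions chosen for illustration.

```hcl
# Hypothetical input: name of an existing bucket that holds log files
variable "logs_bucket_id" {
  type = string
}

resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = var.logs_bucket_id

  rule {
    id     = "archive-old-logs"
    status = "Enabled"

    filter {} # empty filter: the rule applies to every object in the bucket

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    expiration {
      days = 365
    }
  }
}
```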

Common Troubleshooting

1. State Lock Issues

# Force an unlock (use with care)
terraform force-unlock LOCK_ID

# Inspect the current state
terraform show

# Import existing resources
terraform import aws_instance.example i-1234567890abcdef0
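
On Terraform 1.5 or later, an import can also be declared in configuration, so it shows up in `terraform plan` and goes through code review like any other change. A minimal sketch that reuses the resource address and instance ID from the command above; the attribute values are placeholders and must be edited to match the real instance before applying.

```hcl
# Declarative equivalent of `terraform import` (Terraform 1.5+)
import {
  to = aws_instance.example
  id = "i-1234567890abcdef0"
}

# The target resource block must exist. The values below are placeholders;
# adjust them until `terraform plan` shows no unexpected changes.
resource "aws_instance" "example" {
  ami           = "ami-00000000000000000" # placeholder
  instance_type = "t3.micro"              # placeholder
}
```

When no resource block exists yet, `terraform plan -generate-config-out=generated.tf` can draft one from the imported object.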

2. Dependency Issues

# Explicit dependencies
resource "aws_instance" "web" {
  # ... configuration ...
  
  depends_on = [
    aws_security_group.web,
    aws_subnet.public
  ]
}
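
In many cases `depends_on` is unnecessary, because Terraform infers ordering from attribute references. The sketch below expresses the same relationships implicitly, assuming the instance lives in a module that also defines `aws_subnet.public`, `aws_security_group.web`, and the `data.aws_ami.amazon_linux` data source shown earlier.

```hcl
# Implicit dependencies: each reference tells Terraform what must exist first
resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"

  subnet_id              = aws_subnet.public[0].id     # waits for the subnet
  vpc_security_group_ids = [aws_security_group.web.id] # waits for the security group
}
```

Explicit `depends_on` is best reserved for relationships that are not visible in any attribute, such as an instance that needs an IAM policy attachment to be in place before it boots.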

3. Provider Version Conflicts

# Lock provider versions
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 5.0.1"  # Exact version
    }
  }
}

Conclusion

Terraform provides a robust platform for implementing Infrastructure as Code on AWS. By following the best practices presented here, you can:

  1. Fully automate your infrastructure
  2. Standardize deployments across environments
  3. Version infrastructure changes
  4. Reduce manual errors
  5. Shorten time-to-market

Next Steps

  1. Implement the basic modules
  2. Set up the CI/CD pipeline
  3. Add monitoring
  4. Implement security scanning
  5. Train the team on IaC

Additional Resources: