
Terraform at Scale: IaC Best Practices for Multi-Cloud Teams

Cesar A. Nogueira
April 22, 2025 · 8 min read

Terraform is the de facto standard for cloud infrastructure, but teams that don't adopt disciplined practices early pay a steep tax later: state drift, module sprawl, untestable configurations, and the ever-present fear of terraform apply going wrong in production. This guide distils the patterns we apply across multi-cloud engagements at UP2CLOUD.

Module Structure: Think Libraries, Not Scripts

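The most common anti-pattern we see is a flat repository: one directory, hundreds of resources, everything tangled together. Treat your Terraform code like a software library with three distinct layers: resource modules that wrap a single kind of resource, composition modules that assemble resource modules into a service, and per-environment root configurations that call the composition modules: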

# Repository layout
infrastructure/
  modules/
    gcs-bucket/          # Resource module
    rds-instance/        # Resource module
    web-service/         # Composition module
  environments/
    prod/
      main.tf            # Calls web-service module
      variables.tf
      terraform.tfvars
    staging/
      main.tf
      variables.tf
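
To make the layering concrete, here is a minimal sketch of what the web-service composition module might contain. The input names (identifier, instance_class) and the endpoint output are assumptions about the rds-instance module's interface, not something the layout above defines.

```hcl
# modules/web-service/main.tf (hypothetical contents)

variable "environment" {
  type        = string
  description = "Deployment environment, e.g. \"prod\" or \"staging\"."
}

variable "db_instance_class" {
  type        = string
  description = "Instance class passed through to the database module."
  default     = "db.t3.medium"
}

# The composition module wires resource modules together and declares
# no raw resources of its own.
module "database" {
  source = "../rds-instance"

  # Assumed inputs of the rds-instance resource module.
  identifier     = "web-${var.environment}"
  instance_class = var.db_instance_class
}

output "database_endpoint" {
  description = "Connection endpoint, re-exported for the environment root."
  value       = module.database.endpoint # assumes rds-instance exports "endpoint"
}
```

Keeping raw resources out of the composition layer means a change to how databases are built happens in one resource module and reaches every service through a version bump.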

Remote State: Never Use Local State in Teams

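Local state files and version control do not mix: state can hold secrets in plain text, and two engineers applying from separate copies will silently overwrite each other's changes. Use GCS or S3 for remote state with state locking enabled via DynamoDB (AWS) or native GCS locking.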

# GCS backend configuration
terraform {
  backend "gcs" {
    bucket  = "mycompany-tf-state-prod"
    prefix  = "web-service/prod"
  }
}

# S3 + DynamoDB backend
terraform {
  backend "s3" {
    bucket         = "mycompany-tf-state"
    key            = "web-service/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-locks"
    encrypt        = true
  }
}

Critical rule: one state file per environment per service. Monolithic state files that hold hundreds of resources become blast-radius nightmares: every plan has to refresh everything, and a failed or interrupted apply can leave the state locked and block changes to your entire infrastructure.
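
Splitting state raises the question of how one service reads values owned by another. One option is the terraform_remote_state data source, sketched below. The network state key, the vpc_id output, and the vpc_id module input are hypothetical names used only for illustration.

```hcl
# environments/prod/main.tf (sketch)

# Read-only view of another service's state. Only root-level outputs
# of that configuration are exposed.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "mycompany-tf-state"
    key    = "network/prod/terraform.tfstate" # hypothetical key
    region = "us-east-1"
  }
}

module "web_service" {
  source = "../../modules/web-service"

  # Assumes the network configuration declares an output named "vpc_id"
  # and that web-service accepts a "vpc_id" input.
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

The consumer needs read access to the other state object, so this suits teams that already share a state bucket. Where that access is too broad, publishing the values to a parameter store is a common alternative.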

Workspace Strategy: Environments via Variables, Not Workspaces

Terraform workspaces are often misused for environment separation. The problem: workspaces share backend configuration and are easy to confuse. We recommend separate root configuration directories per environment with a shared modules library. Use workspaces only for ephemeral feature environments that mirror a single base configuration.
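
As a rough sketch of that split, an environment root can hard-code its environment name and reserve terraform.workspace for naming ephemeral copies. The name_suffix input is an assumption about the web-service module, not part of the layout shown earlier.

```hcl
# environments/staging/main.tf (sketch)

locals {
  # The default workspace is the long-lived staging environment.
  # Any other workspace (e.g. "feature-login") gets a name suffix so
  # its resources do not collide with the base environment.
  name_suffix = terraform.workspace == "default" ? "" : "-${terraform.workspace}"
}

module "web_service" {
  source = "../../modules/web-service"

  environment = "staging"         # fixed per directory, never derived from the workspace
  name_suffix = local.name_suffix # hypothetical module input
}
```

Production never depends on which workspace happens to be selected, because the prod directory has its own backend block and its own hard-coded environment value.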

Policy as Code: Sentinel and OPA

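Governance without enforcement is just documentation. Two mature policy-as-code solutions integrate with Terraform: HashiCorp Sentinel, which runs inside Terraform Cloud and Terraform Enterprise, and Open Policy Agent (OPA), which evaluates Rego policies against the JSON form of a plan: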

# OPA policy: deny newly created resources without an 'env' tag
# With conftest, pass --namespace terraform (its default namespace is main)
package terraform

import rego.v1

deny contains msg if {
  some resource in input.resource_changes
  "create" in resource.change.actions
  not resource.change.after.tags.env
  msg := sprintf("Resource %v missing required 'env' tag", [resource.address])
}

CI/CD Integration: GitHub Actions + Terraform Cloud

Every terraform plan should run automatically on pull requests; every terraform apply should be gated on merge to main and require approval. Here's a minimal GitHub Actions workflow for the plan half, meant as a starting point rather than a finished pipeline:

name: Terraform Plan
on:
  pull_request:
    paths: ['infrastructure/**']

permissions:
  contents: read
  pull-requests: write   # needed to post the plan comment

jobs:
  plan:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infrastructure/environments/prod
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.8.0"
          cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }}
          terraform_wrapper: false   # keep stdout clean for the JSON export below
      # Cloud provider credentials must be configured before the plan step
      - name: Terraform Init
        run: terraform init -input=false
      - name: Terraform Validate
        run: terraform validate
      - name: Terraform Plan
        run: terraform plan -input=false -out=tfplan.binary
      - name: OPA Policy Check
        # Assumes conftest is installed on the runner and policies live in ./policy
        run: |
          terraform show -json tfplan.binary > tfplan.json
          conftest test --namespace terraform tfplan.json
      - name: Post Plan to PR
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({...})

Avoiding State Drift

State drift occurs when real infrastructure diverges from Terraform state, usually because someone made a console change. Three defences:

Practical Tips Across Multi-Cloud Setups

When managing GCP and AWS from the same repository, use provider aliasing and clearly namespaced modules. Never mix provider-specific resource modules in the same directory. Keep a versions.tf that pins both the Terraform version and all provider versions: provider upgrades are the most common source of unexpected plan diffs in multi-cloud codebases.
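
A minimal sketch of such a versions.tf, together with an aliased provider, might look like the following. The version constraints, the GCP project ID, and the modules/aws/s3-bucket path are illustrative assumptions, so pin to whatever versions you have actually tested.

```hcl
# environments/prod/versions.tf (sketch)

terraform {
  required_version = "~> 1.8.0" # allows 1.8.x patch releases only

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # illustrative constraint
    }
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0" # illustrative constraint
    }
  }
}

# Default AWS provider, plus an aliased one for a second region.
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "eu"
  region = "eu-west-1"
}

provider "google" {
  project = "mycompany-prod" # hypothetical project ID
  region  = "us-central1"
}

# Pass the aliased provider explicitly so the module never picks up
# the default region by accident.
module "eu_assets" {
  source = "../../modules/aws/s3-bucket" # hypothetical namespaced module path

  providers = {
    aws = aws.eu
  }
}
```

Committing the generated .terraform.lock.hcl alongside this file records the exact provider builds each plan ran against, which makes an unexpected diff after an upgrade much easier to trace.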

Terraform at scale rewards investment in module quality and governance tooling. Teams that treat their IaC like a production software codebase (with reviews, tests, policies, and automated checks) spend far less time firefighting drift and far more time shipping value.
