
Getting one environment running in Terraform is a manageable amount of work. Getting three environments running identically, with isolated state, different instance sizes, and separate secrets, turns into a project of its own.
Terraform doesn't have a built-in opinion on how to handle multiple environments. The community has settled on three main approaches over the years, each with tradeoffs that compound as teams grow. This guide covers all three, plus alternatives like Terragrunt, AWS CDK, and infrastructure-from-code platforms like Encore where environments are created automatically without per-environment configuration.
Terraform workspaces were the first official answer. You create a workspace per environment, and Terraform maintains separate state files for each:
```shell
terraform workspace new staging
terraform workspace new production
```
Inside your config, you reference terraform.workspace to vary behavior:
```hcl
resource "aws_db_instance" "main" {
  instance_class = terraform.workspace == "production" ? "db.r6g.large" : "db.t3.micro"
  # ...
}
```
The appeal is obvious. One set of .tf files, multiple environments. In practice, though, workspaces share the same backend configuration. Your staging and production state files sit in the same S3 bucket, controlled by the same IAM role. A terraform destroy in the wrong workspace takes down production, and the only thing protecting you is remembering to run terraform workspace select before every command.
State isolation is the deeper issue. Because all workspaces share the backend, you can't give your staging environment a different set of AWS credentials, a different state-locking table, or a different access policy. For small projects this is fine. For anything with compliance requirements or multiple team members, it breaks down quickly.
The Terraform documentation itself notes that workspaces are not a suitable mechanism for full environment isolation. They were designed for testing changes against parallel infrastructure, not for modeling the dev-staging-production lifecycle most teams need.
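Within those limits, one way to keep workspace conditionals manageable is to centralize the per-workspace values in a locals map instead of scattering ternaries across resources. A sketch, reusing the database example above (the map name and sizes are assumptions, not a fixed convention):

```hcl
locals {
  # Per-workspace instance sizes; "default" covers the default workspace.
  db_instance_class = {
    default    = "db.t3.micro"
    staging    = "db.t3.micro"
    production = "db.r6g.large"
  }
}

resource "aws_db_instance" "main" {
  # Look up the size for the currently selected workspace.
  instance_class = local.db_instance_class[terraform.workspace]
  # ...
}
```

This doesn't fix the shared-backend problem, but it does keep every environment difference in one place instead of buried in conditionals throughout the config.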
The most common workaround is to create a directory per environment:
```
infra/
  modules/
    vpc/
    database/
    ecs/
  environments/
    dev/
      main.tf
      variables.tf
      terraform.tfvars
    staging/
      main.tf
      variables.tf
      terraform.tfvars
    production/
      main.tf
      variables.tf
      terraform.tfvars
```
Each directory has its own backend config, its own state, and its own variable values. This gives you the isolation that workspaces lack.
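Concretely, each directory pins its own backend, so staging and production state can live in different buckets under different IAM policies. A minimal sketch (the bucket, lock table, and region names here are placeholders):

```hcl
# environments/staging/main.tf
terraform {
  backend "s3" {
    bucket         = "acme-staging-tfstate"    # placeholder: staging-only bucket and access policy
    key            = "staging/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "acme-staging-tf-locks"   # placeholder: per-environment lock table
  }
}
```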
The cost is duplication. When you add a new module or change a resource configuration, you have to apply that change across every directory. Teams start out disciplined, keeping the environments in sync through careful copy-paste. Over months, the directories diverge. A developer updates the staging VPC configuration to test something and forgets to propagate the change. Six months later, someone discovers that production's security group rules haven't matched staging since Q2.
The sync problem gets worse with scale. Five environments across two regions means ten directories sharing the same logical infrastructure with slight per-environment variations buried in tfvars files. Reviewing a pull request becomes an exercise in diffing directories to make sure the change was applied consistently.
A lighter variation uses a single configuration directory with per-environment .tfvars files:
```shell
terraform plan -var-file=environments/staging.tfvars
terraform apply -var-file=environments/production.tfvars
```
This avoids directory duplication, but the state isolation problem from workspaces returns. You need wrapper scripts or CI pipeline logic to make sure the right backend, the right var file, and the right workspace all align for each run. One misconfigured CI job and you've applied staging variables to production state.
Tfvars files also sprawl. What starts as instance_size and db_instance_class grows to include feature flags, domain names, certificate ARNs, secret references, and conditional resource counts. You end up with 80-line tfvars files that function as a parallel configuration language layered on top of HCL.
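An illustrative excerpt of where such a file tends to end up (every name and value here is invented for the example):

```hcl
# environments/staging.tfvars
environment         = "staging"
instance_size       = "t3.micro"
db_instance_class   = "db.t3.micro"
db_replica_count    = 1
domain_name         = "staging.example.com"
certificate_arn     = "arn:aws:acm:us-east-1:111111111111:certificate/staging-cert"
enable_new_checkout = false
alarm_email         = "oncall-staging@example.com"
```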
Terragrunt addresses the duplication problem directly. It wraps Terraform with a terragrunt.hcl file that generates backend configs, passes variables, and manages dependencies between modules:
```hcl
# environments/staging/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

inputs = {
  environment    = "staging"
  instance_class = "db.t3.micro"
}
```
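The root file that the include points at typically generates the backend configuration for every child, deriving the state key from the directory path so no environment repeats it. A sketch (the bucket and lock table names are assumptions):

```hcl
# terragrunt.hcl (repo root)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
  config = {
    bucket         = "acme-tfstate"   # placeholder shared state bucket
    # Each environment's state lands under its own key, e.g. environments/staging/
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tf-locks"       # placeholder lock table
  }
}
```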
Terragrunt is well-designed and solves the problems it targets. The tradeoff is that your team now needs to understand both Terraform and Terragrunt, debug issues across both layers, and maintain compatibility when either tool updates. For platform teams with dedicated infrastructure engineers, this works. For product teams where developers own their infrastructure, it adds another tool to the stack that nobody on the team fully understands after the original author moves on.
AWS CDK and SST take a different approach by letting you define infrastructure in TypeScript or Python instead of HCL. Environments become function parameters or class properties, and you can use the language's own features for abstraction:
```typescript
const stage = app.stage; // "dev", "staging", "production"

const db = new rds.DatabaseInstance(this, "Database", {
  instanceType: stage === "production"
    ? ec2.InstanceType.of(ec2.InstanceClass.R6G, ec2.InstanceSize.LARGE)
    : ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.MICRO),
});
```
This eliminates the HCL-plus-tfvars layering problem. Environments are just code, with all the refactoring and type-checking that implies. CDK still synthesizes to CloudFormation under the hood, so you're working with CloudFormation's deployment model, including its own state management and rollback behavior. SST (v3+) moved away from CDK to use Pulumi and Terraform as its deployment engine, with its own development workflow including live Lambda reloading and a console.
Both tools still require you to define the per-environment infrastructure variations somewhere. The conditionals are in TypeScript instead of HCL ternaries, but the conceptual work of "what should differ between staging and production" remains your responsibility.
All of these approaches share an assumption: that you need to describe the per-environment differences in infrastructure configuration. Different instance sizes, different replica counts, different domain names, different backend configs.
Encore starts from a different premise. Your application code declares the infrastructure it needs, and environments are created in the platform without any per-environment configuration files. You define a database in your code:
```typescript
import { SQLDatabase } from "encore.dev/storage/sqldb";

const db = new SQLDatabase("users", {
  migrations: "./migrations",
});
```
That declaration applies to every environment. When you create a new environment in the Encore Cloud dashboard, it provisions a complete, isolated copy of whatever your application needs: databases, pub/sub topics, caches, cron jobs. Staging gets its own RDS instance, its own VPC, its own ECS cluster. Production gets the same, sized according to the environment configuration you set in the dashboard.
Preview environments spin up automatically for each pull request, giving you a full copy of your application's infrastructure for testing. When the PR merges, the preview environment is torn down. There's no workspace switching, directory duplication, or tfvars file to maintain.
Promoting between environments is a deployment operation, not a configuration operation. You push to a branch mapped to staging, test it, then deploy to production. The infrastructure in both environments was derived from the same application code, so there's no configuration drift between them.
The choice depends on what your team looks like and what you're building. If you have a platform engineering team that specializes in Terraform, Terragrunt gives you the DRY environment management you need without leaving the ecosystem. If your developers work in TypeScript and own their own infrastructure, CDK or SST remove the HCL learning curve.
If the goal is to stop treating environments as an infrastructure configuration problem entirely, an infrastructure-from-code approach removes the per-environment plumbing. Your application code is the single source of truth, and environments are an operational concern handled by the platform.
The Terraform landscape itself is shifting, with the move to the Business Source License and the OpenTofu fork adding new variables to the decision. Whether you stay in the Terraform ecosystem or move beyond it, the multi-environment problem is worth solving intentionally rather than letting it grow organically through directory duplication and naming conventions.