Reducing Terraform code duplication

Brendan Thompson • 2 September 2020 • 4 min read

One thing I find happens a lot with Infrastructure as Code, and especially with Terraform, is that much of the code is repeated, whether through duplicated calls to modules or naked resources. What I mean by duplication in this instance is as follows: say we have three environments, each requiring an identical set of cloud resources. One common practice is to keep three instances of the Terraform code (each in its own repository or directory) and supply input variables to each instance. This means we have the same, or nearly the same, code written (or worse, copy-pasted) three times.

Terraform as a product offers some solutions to this: either a separate tfvars file for each set of variables, passed into a single set of code, or workspaces in TFOSS or TFE/TFC. A method I have started employing is a YAML file. This allows for a schema that is identical for all environments within the organisation, and it also allows for overrides.

An example of such a config:

global:
  zone: "us-west1-b"

environments:
  dev:
    zone: "us-west2-a"
    iaas:
      image: "debian-cloud/debian-10"
      machine_type: "e2-micro"
    network: # parent key assumed
      name: "default"
  prod:
    iaas:
      image: "debian-cloud/debian-10"
      machine_type: "e2-standard-8"

As can be seen above, we have two primary sections, global and environments. The former holds information that applies across everything the Terraform code may be doing, whereas the latter contains configuration for a specific environment. Note that zone is defined in our global section but also in our dev environment; this is essentially how we enable an override of a default value. The code below shows the corresponding Terraform that consumes our config.yaml:

locals {
  raw_config = yamldecode(file(format("%s/config.yaml", path.module)))
  config = {
    global      = lookup(local.raw_config, "global", {})
    environment = lookup(local.raw_config.environments, var.environment, "err: invalid environment")
  }
}

variable "environment" {
  type        = string
  description = "(Optional) Name of the environment to deploy into"
  default     = "dev"
}

resource "google_compute_instance" "this" {
  name         = format("iaas-%s", var.environment)
  machine_type = local.config.environment.iaas.machine_type
  # Use the environment's zone when it is set, otherwise fall back to the global zone.
  zone = lookup(local.config.environment, "zone", null) != null ? local.config.environment.zone : local.config.global.zone

  boot_disk {
    initialize_params {
      image = local.config.environment.iaas.image
    }
  }
  network_interface {
    network = "default" # assumed value; matches name: "default" in config.yaml
  }
  service_account {
    scopes = ["userinfo-email", "compute-ro", "storage-ro"]
  }
}

We can clearly see here that a GCP IaaS machine is being provisioned. An input variable, environment, is defined; it is used to orientate Terraform within the config file. The locals {} declaration ingests the config.yaml file and then does some lookups, the second of which is on that environment input variable. In the resource's machine_type argument we use the machine_type property from our YAML file. If this same Terraform code were executed with the environment input variable set to prod, the user would not need to change the code in any way to get different settings on the production instance. This is very powerful in our fight to reduce code duplication. There are times when we want a default value defined at a global, or higher, level that an environment can override:

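zone = lookup(local.config.environment, "zone", null) != null ? local.config.environment.zone : local.config.global.zone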

This is essentially saying: if the environment does not set zone (its value is null), use the value from the global section. This simple technique is extremely useful both in the YAML config solution and in general Terraform.
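
As a sketch of the same fallback idea in general Terraform, the built-in try() and merge() functions can express the override without repeating the expression. This is an alternative under stated assumptions, not the configuration used in this post: the inline YAML stands in for config.yaml, and the local names raw_yaml, zone and effective are hypothetical.

```hcl
variable "environment" {
  type    = string
  default = "prod"
}

locals {
  # Inline stand-in for config.yaml so the sketch is self-contained.
  raw_yaml = <<-YAML
    global:
      zone: "us-west1-b"
    environments:
      dev:
        zone: "us-west2-a"
      prod:
        iaas:
          machine_type: "e2-standard-8"
  YAML

  raw_config  = yamldecode(local.raw_yaml)
  global      = local.raw_config.global
  environment = local.raw_config.environments[var.environment]

  # try() returns the first argument that evaluates without an error,
  # so a missing zone key falls through to the global default.
  zone = try(local.environment.zone, local.global.zone)

  # merge() is shallow: top-level keys from the environment replace
  # keys of the same name from global.
  effective = merge(local.global, local.environment)
}

output "zone" {
  value = local.zone
}
```

With environment set to prod, which defines no zone of its own, the output is the global value us-west1-b; with dev it is the override us-west2-a. Because merge() is shallow, it suits flat keys such as zone better than nested sections such as iaas.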

Thus, this code can be invoked by any flavour of Terraform (TFC, TFE or TFOSS); the only thing that needs to be passed in is something to orientate the Terraform code within the YAML file. For example:

terraform apply -var="environment=prod"

This one line lets Terraform know where to look within the YAML file. The same can be done within TFC/TFE by setting the input variable on the given workspace.
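
Since everything hinges on that one variable, it may be worth failing early when it does not match an environment in the file. The following is a minimal sketch, assuming Terraform 1.2 or later (for precondition blocks in outputs); the inline YAML stands in for config.yaml, and the names raw_yaml and valid_environments are hypothetical.

```hcl
variable "environment" {
  type    = string
  default = "dev"
}

locals {
  # Inline stand-in for config.yaml so the sketch is self-contained.
  raw_yaml = <<-YAML
    environments:
      dev:
        zone: "us-west2-a"
      prod:
        iaas:
          machine_type: "e2-standard-8"
  YAML

  raw_config = yamldecode(local.raw_yaml)

  # The environment names are taken from the YAML itself,
  # so there is no second list to keep in sync.
  valid_environments = keys(local.raw_config.environments)
}

output "environment" {
  value = var.environment

  # Stop the run if the variable does not name a known environment.
  precondition {
    condition     = contains(local.valid_environments, var.environment)
    error_message = "Unknown environment; it must match a key under environments in config.yaml."
  }
}
```

Running this with -var="environment=staging" should stop with the error message above rather than failing later on a missing attribute.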

Hopefully this provides some insight into a different way of dealing with configuration for Terraform whilst also reducing the amount of code duplication that is prevalent in a lot of Terraform codebases.

Brendan Thompson

Principal Cloud Engineer
