Updating Terraform Enterprise state from the API

Brendan Thompson • 4 March 2021 • 10 min read

Caveat

State should only ever be manipulated if there is no other recourse.

There are some instances when working with Terraform Enterprise (TFE) where a workspace's state has to be modified. This might be required when upgrading from a given version of Terraform itself, or potentially when upgrading some resources. As called out above, however, modifying a workspace's state outside of a Terraform run is not recommended, as it can have unintended side effects. In the rare instances where modification of state is required, this post should give you the basic components to write a Go app/package that updates a field in a state file.

The Components

This section is broken down into the different components required to create the package.

Prerequisites

The prerequisites for creating the package start with the import statement below:

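import (
    "context"
    "crypto/md5"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "net/http"

    // go-tfe is the only dependency from outside the standard library.
    // Only packages the code in this post uses are listed; Go refuses to compile unused imports.
    "github.com/hashicorp/go-tfe"
)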

The only package that doesn't come with the standard library is the go-tfe package provided by HashiCorp. This post also assumes that you have some constants defined in your code and that you have access to a TFE workspace. The const declaration looks like:

const (
    workspaceName     = ""
    workspaceID       = ""
    terraformEndpoint = ""
    terraformOrg      = ""
    terraformPAT      = ""
)
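If you would rather not hard-code a token, the same values could be read from the environment instead. The following is a minimal sketch of that idea, not part of the original package: the settings type, the loadSettings helper, and the TFE_* variable names are all assumptions made for illustration.

```go
package main

import (
    "errors"
    "fmt"
    "os"
)

// settings mirrors the constants above, but is filled in at runtime.
// The type and its field names are hypothetical.
type settings struct {
    WorkspaceID string
    Endpoint    string
    Org         string
    Token       string
}

// loadSettings reads the (assumed) TFE_* variables through getenv and
// fails if any of them is empty. Passing the lookup function in makes
// the helper easy to exercise without touching the real environment.
func loadSettings(getenv func(string) string) (settings, error) {
    s := settings{
        WorkspaceID: getenv("TFE_WORKSPACE_ID"),
        Endpoint:    getenv("TFE_ADDRESS"),
        Org:         getenv("TFE_ORG"),
        Token:       getenv("TFE_TOKEN"),
    }
    if s.WorkspaceID == "" || s.Endpoint == "" || s.Org == "" || s.Token == "" {
        return settings{}, errors.New("TFE_WORKSPACE_ID, TFE_ADDRESS, TFE_ORG and TFE_TOKEN must all be set")
    }
    return s, nil
}

func main() {
    s, err := loadSettings(os.Getenv)
    if err != nil {
        // A real tool would exit non-zero here.
        fmt.Fprintln(os.Stderr, "error:", err)
        return
    }
    fmt.Println("using workspace", s.WorkspaceID, "in org", s.Org)
}
```

The rest of this post keeps using the constants, so treat this purely as an optional variation.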

With all of those pieces in place, we can kick off.

The Structs

There are several structs that we need to create for our package to work in an effective and repeatable manner. They also allow for expansion of the package.

The first struct to look at is StateClient. It lets us connect to the TFE instance and refer to different iterations of state.

type StateClient struct {
    Context   context.Context
    Terraform *tfe.Client
    Workspace *tfe.Workspace
    State     struct {
        Current State
        Future  State
    }
}

Now that we have a client to connect to TFE we need a way to describe the state file that is going to be retrieved from a given workspace. Unfortunately there is no real documentation of the contents of a state file, and this is something that has been known to change from time to time. During my initial investigation of the contents of a state file this is what I produced.

type State struct {
    Version          int         `json:"version,omitempty"`
    TerraformVersion string      `json:"terraform_version,omitempty"`
    Serial           int64       `json:"serial,omitempty"`
    Lineage          string      `json:"lineage,omitempty"`
    Outputs          interface{} `json:"outputs,omitempty"`
    Resources        Resources   `json:"resources,omitempty"`
}

The most interesting piece here is the Resources field of the State struct, which we have to describe further, and will do below. This is not the only way it could be done: an alternative would be to describe it as a map[string]interface{}. If we did that, though, we could not easily reference the fields on the resources.
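To make that trade-off concrete, here is a small sketch that reads the same value both ways. The structs are trimmed copies of the ones in this post, the JSON is a hand-written fragment rather than a real state file, and the two helper functions are hypothetical names used only for this comparison.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// Trimmed copies of the Resource and State structs, for illustration only.
type Resource struct {
    Type string `json:"type,omitempty"`
    Name string `json:"name,omitempty"`
}

type State struct {
    Serial    int64      `json:"serial,omitempty"`
    Resources []Resource `json:"resources,omitempty"`
}

// sample is a hand-written fragment, not a real state file.
const sample = `{"serial": 3, "resources": [{"type": "random_string", "name": "this"}]}`

// typeViaStruct reads the first resource's type through typed fields.
func typeViaStruct(raw []byte) (string, error) {
    var s State
    if err := json.Unmarshal(raw, &s); err != nil {
        return "", err
    }
    if len(s.Resources) == 0 {
        return "", fmt.Errorf("no resources in state")
    }
    return s.Resources[0].Type, nil
}

// typeViaMap reads the same value from an untyped map, which needs a
// type assertion at every level of nesting.
func typeViaMap(raw []byte) (string, error) {
    var m map[string]interface{}
    if err := json.Unmarshal(raw, &m); err != nil {
        return "", err
    }
    list, ok := m["resources"].([]interface{})
    if !ok || len(list) == 0 {
        return "", fmt.Errorf("resources is missing or not a list")
    }
    first, ok := list[0].(map[string]interface{})
    if !ok {
        return "", fmt.Errorf("resource is not an object")
    }
    t, ok := first["type"].(string)
    if !ok {
        return "", fmt.Errorf("type is missing or not a string")
    }
    return t, nil
}

func main() {
    a, errA := typeViaStruct([]byte(sample))
    b, errB := typeViaMap([]byte(sample))
    fmt.Println(a, errA)
    fmt.Println(b, errB)
}
```

Both approaches get to the same value. The struct version is simply less ceremony once you know which fields you care about.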

Below is what the Resource struct and its Resources slice type look like:

type Resource struct {
    Module    string    `json:"module,omitempty"`
    Mode      string    `json:"mode,omitempty"`
    Type      string    `json:"type,omitempty"`
    Name      string    `json:"name,omitempty"`
    Provider  string    `json:"provider,omitempty"`
    Instances Instances `json:"instances,omitempty"`
}

type Resources []Resource

As you can see, there is another field that needs further describing: Instances, which holds one entry for each instance of a resource. For example, the Terraform code below describes a GCP Storage Bucket, and because of the count meta-argument it will create two instances of that bucket.

Note

An alternative to the count meta-argument is for_each.

main.tf

resource "google_storage_bucket" "this" {
  count = 2

  name     = format("gcs-%s", count.index)
  project  = "my-project"
  location = "AUSTRALIA-SOUTHEAST1"
}

The last line of the Resource declaration above, type Resources []Resource, describes a new type that is a slice of our Resource struct. This is mostly for ease of use; one could just as well use []Resource in the State struct declaration. Either is acceptable.

Like Resource, the Instance struct has a convenience type that describes a slice of Instance. Below is what the Instance struct looks like:

type Instance struct {
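    // index_key is a number when count is used and a string when for_each is used,
    // so a plain string field would fail to decode count-based resources.
    IndexKey      interface{}            `json:"index_key,omitempty"`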
    SchemaVersion int                    `json:"schema_version,omitempty"`
    Attributes    map[string]interface{} `json:"attributes,omitempty"`
    Private       string                 `json:"private,omitempty"`
    Dependencies  []string               `json:"dependencies,omitempty"`
}

type Instances []Instance

It is worth pointing out that the Attributes field is in fact a map[string]interface{}. The reason is that there is no easy way for us to cater for every resource's fields in a struct. We do, however, know (in theory) which fields we are interested in by the time we get to this stage.
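As a rough illustration of how the two buckets from the count example might appear once decoded, the sketch below parses a hand-written instances fragment. The JSON is invented for this example, and the trimmed Instance struct declares IndexKey as interface{}, which is an assumption made here so that both numeric keys (from count) and string keys (from for_each) can be decoded.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// Instance is a trimmed copy of the struct above. IndexKey is declared as
// interface{} in this sketch so numeric and string keys both decode.
type Instance struct {
    IndexKey   interface{}            `json:"index_key,omitempty"`
    Attributes map[string]interface{} `json:"attributes,omitempty"`
}

// sample is hand-written to resemble two count-based instances;
// it is not copied from a real state file.
const sample = `[
  {"index_key": 0, "attributes": {"name": "gcs-0"}},
  {"index_key": 1, "attributes": {"name": "gcs-1"}}
]`

// decodeInstances turns the raw JSON array into a slice of Instance.
func decodeInstances(raw []byte) ([]Instance, error) {
    var out []Instance
    err := json.Unmarshal(raw, &out)
    return out, err
}

func main() {
    instances, err := decodeInstances([]byte(sample))
    if err != nil {
        panic(err)
    }
    for _, i := range instances {
        // encoding/json decodes JSON numbers held in an interface{} as float64.
        fmt.Printf("key=%v name=%v\n", i.IndexKey, i.Attributes["name"])
    }
}
```

Because Attributes is a map, fields such as name are looked up by key at runtime rather than through a typed struct field.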

There is one final struct to look at: StateUpdateOptions. I am sure there is a better way to implement this, but it is what made sense to me when I was building the package. Essentially, it tells the later code which resource type and field we want to update.

type StateUpdateOptions struct {
    Type         string
    Field        string
    CurrentValue string
    FutureValue  string
}
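One way the CurrentValue field could be put to work is as a guard, so that only instances currently holding a given value are rewritten. The sketch below shows that idea. The shouldUpdate method is a hypothetical helper, and treating an empty CurrentValue as "match anything" is an assumption of this sketch rather than behaviour of the code in this post.

```go
package main

import "fmt"

// StateUpdateOptions is copied from the struct above.
type StateUpdateOptions struct {
    Type         string
    Field        string
    CurrentValue string
    FutureValue  string
}

// shouldUpdate reports whether an instance of resourceType with the given
// attributes should be rewritten. An empty CurrentValue matches any value.
func (o StateUpdateOptions) shouldUpdate(resourceType string, attrs map[string]interface{}) bool {
    if resourceType != o.Type {
        return false
    }
    if o.CurrentValue == "" {
        return true
    }
    // Only string attributes can match, since CurrentValue is a string.
    current, ok := attrs[o.Field].(string)
    return ok && current == o.CurrentValue
}

func main() {
    o := StateUpdateOptions{
        Type:         "random_string",
        Field:        "result",
        CurrentValue: "abc",
        FutureValue:  "meow",
    }
    fmt.Println(o.shouldUpdate("random_string", map[string]interface{}{"result": "abc"}))
    fmt.Println(o.shouldUpdate("random_string", map[string]interface{}{"result": "xyz"}))
}
```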

The Methods

Now that we have all the components together, we need some methods on the structs to do the actual work of updating fields within a state file. There are essentially two methods to write: one for getting the current state file, and one for updating the state file with what we need.

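func (c *StateClient) getCurrentState() error {
    var s State

    // Ask TFE for the workspace's current state version, which carries the download URL.
    stateVersion, err := c.Terraform.StateVersions.Current(c.Context, c.Workspace.ID)
    if err != nil {
        return err
    }

    resp, err := http.Get(stateVersion.DownloadURL)
    if err != nil {
        return err
    }
    // Close the body so the underlying connection is released.
    defer resp.Body.Close()

    // A non-200 response would otherwise surface as a confusing JSON decode error.
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("downloading state: unexpected status %s", resp.Status)
    }

    err = json.NewDecoder(resp.Body).Decode(&s)
    if err != nil {
        return err
    }

    c.State.Current = s

    return nil
}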

This quite simply makes a call out to the TFE instance, downloads the state, and decodes the JSON into the structs we created earlier. Errors are only checked and returned to the caller here; nothing attempts to recover from them.

Next, we need to update state with our given fields which can be done with this updateState method shown below:

func (c *StateClient) updateState(o *StateUpdateOptions) error {
    var s State
    var resources Resources

    for _, r := range c.State.Current.Resources {
        if r.Type == o.Type {
            var instances Instances

            for _, i := range r.Instances {
                i.Attributes[o.Field] = o.FutureValue
                instances = append(instances, i)
            }

            r.Instances = instances
        }
        resources = append(resources, r)
    }

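    s.Lineage = c.State.Current.Lineage
    s.Serial = (c.State.Current.Serial + 1)
    s.TerraformVersion = c.State.Current.TerraformVersion
    s.Version = c.State.Current.Version
    // Carry outputs over unchanged; otherwise they are omitted from the new state version.
    s.Outputs = c.State.Current.Outputs
    s.Resources = resources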

    c.State.Future = s

    jsonState, err := json.Marshal(c.State.Future)
    if err != nil {
        return err
    }

    base64EncodedState := base64.StdEncoding.EncodeToString(jsonState)

    opts := tfe.StateVersionCreateOptions{
        MD5:     tfe.String(fmt.Sprintf("%x", md5.Sum(jsonState))),
        Serial:  tfe.Int64(c.State.Future.Serial),
        State:   tfe.String(base64EncodedState),
        Lineage: tfe.String(c.State.Future.Lineage),
    }

    lockOpts := tfe.WorkspaceLockOptions{
        Reason: tfe.String("Locking workspace for update."),
    }

    _, err = c.Terraform.Workspaces.Lock(c.Context, c.Workspace.ID, lockOpts)
    if err != nil {
        return err
    }

    _, err = c.Terraform.StateVersions.Create(c.Context, c.Workspace.ID, opts)
    if err != nil {
        _, _ = c.Terraform.Workspaces.Unlock(c.Context, c.Workspace.ID)
        return fmt.Errorf("err: attempting to create state: %v", err)
    }

    _, err = c.Terraform.Workspaces.Unlock(c.Context, c.Workspace.ID)
    if err != nil {
        return err
    }

    return nil
}

This method is a little more complex, so let's quickly run through what it is doing.

First we iterate through all the resources within the state file, and filter for our resource type
When a resource matches the type we are looking for, we update the field on each of its instances with the FutureValue
The updated resources are assigned to c.State.Future, which is what will be uploaded into TFE
The c.State.Future is then marshalled from a State struct into JSON
The resultant JSON is then encoded into base64 for consumption
Once loaded into the StateVersionCreateOptions struct we can proceed to lock the workspace
Now that the workspace is locked we can create a new version of state with the supplied values
After creation is complete we unlock the workspace
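The marshal, checksum, and encode steps above can be isolated into a small helper, sketched below using only the standard library. The statePayload type and encodePayload function are hypothetical names. The point they illustrate is that the MD5 digest and the base64 string must both be derived from exactly the same bytes.

```go
package main

import (
    "crypto/md5"
    "encoding/base64"
    "encoding/hex"
    "fmt"
)

// statePayload holds the two derived values that accompany the serial
// and lineage when a new state version is created.
type statePayload struct {
    MD5   string // hex-encoded MD5 digest of the raw JSON
    State string // base64 encoding of the same raw JSON
}

// encodePayload derives both values from one byte slice so that the
// checksum always describes the content that is uploaded.
func encodePayload(raw []byte) statePayload {
    sum := md5.Sum(raw)
    return statePayload{
        MD5:   hex.EncodeToString(sum[:]),
        State: base64.StdEncoding.EncodeToString(raw),
    }
}

func main() {
    // A stand-in for the marshalled state; real state is far larger.
    p := encodePayload([]byte(`{"version":4}`))
    fmt.Println(p.MD5)
    fmt.Println(p.State)
}
```

hex.EncodeToString over the digest gives the same lowercase hex string as the fmt.Sprintf("%x", ...) call in updateState.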

The serial number seems to be exactly what it says on the tin: a number that increments each time state is updated.

The Main Event

Now that we have our Prerequisites, Structs, and Methods, we need to tie it all together. As this is simply an example, or proof of concept if you will, we will just use a main function with all the bits strapped together. That main function looks like the below.

func main() {
    ctx := context.Background()

    config := &tfe.Config{
        Address: terraformEndpoint,
        Token:   terraformPAT,
    }

    t, err := tfe.NewClient(config)
    if err != nil {
        panic(err)
    }

    w, err := t.Workspaces.ReadByID(ctx, workspaceID)
    if err != nil {
        panic(err)
    }

    c := &StateClient{
        Context:   ctx,
        Terraform: t,
        Workspace: w,
    }

    u := &StateUpdateOptions{
        Type:        "random_string",
        Field:       "result",
        FutureValue: "meow",
    }

    err = c.getCurrentState()
    if err != nil {
        panic(err)
    }

    err = c.updateState(u)
    if err != nil {
        panic(err)
    }
}

As can be seen, we are going through the following steps:

Set up a TFE config
Create a TFE client
Get our Workspace, the one we want to act upon
Create our StateClient
Create our StateUpdateOptions which we will apply to our workspace
Get the current state and assign it to our client
Finally, we do the actual update to our state

This is obviously not overly complex, and could easily be extended to have variables passed in if it became a CLI, for instance. For reference, below is the Terraform code for the workspace where the state update occurs in this example. It is exceptionally simple, but useful to prove that this is possible.

resource "random_string" "this" {
  length  = 24
  special = false
}

output "random" {
  value = random_string.this.result
}
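On the idea of turning this into a CLI, the hard-coded StateUpdateOptions could be filled from command-line flags with the standard flag package. This is only a sketch: the flag names (-type, -field, -current, -future), the tfe-state-update program name, and the parseOptions helper are assumptions, not anything the package above defines.

```go
package main

import (
    "errors"
    "flag"
    "fmt"
    "os"
)

// StateUpdateOptions is copied from the struct earlier in the post.
type StateUpdateOptions struct {
    Type         string
    Field        string
    CurrentValue string
    FutureValue  string
}

// parseOptions builds a StateUpdateOptions from command-line arguments.
// It uses its own FlagSet so it can be called with any argument slice.
func parseOptions(args []string) (StateUpdateOptions, error) {
    var o StateUpdateOptions
    fs := flag.NewFlagSet("tfe-state-update", flag.ContinueOnError)
    fs.StringVar(&o.Type, "type", "", "resource type to match, e.g. random_string")
    fs.StringVar(&o.Field, "field", "", "attribute to overwrite")
    fs.StringVar(&o.CurrentValue, "current", "", "optional value the attribute must currently hold")
    fs.StringVar(&o.FutureValue, "future", "", "value to write")
    if err := fs.Parse(args); err != nil {
        return o, err
    }
    if o.Type == "" || o.Field == "" {
        return o, errors.New("-type and -field are required")
    }
    return o, nil
}

func main() {
    o, err := parseOptions(os.Args[1:])
    if err != nil {
        // A real tool would exit non-zero here.
        fmt.Fprintln(os.Stderr, "error:", err)
        return
    }
    fmt.Printf("would set %s.%s to %q\n", o.Type, o.Field, o.FutureValue)
}
```

The resulting options value could then be passed to updateState in place of the literal used in main.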

Closing Thoughts

Warning

This blog post provides code that merely proves out a concept; it should not be used in production. It could, however, be used as an example from which to create a working script/product.

There are many gaps in the example, such as:

How do we deal with resources that actually contain multiple instances?
How do we deal with a scenario in which there are multiple resources of the same type?
And many other things...
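As one possible direction for the first two gaps, the update could be narrowed by resource name and instance index key as well as by type. The sketch below is an untested idea rather than something I have run against TFE. The target type and setAttribute function are hypothetical, the structs are trimmed copies, and IndexKey is assumed to be an interface{} holding whatever the JSON decoder produced.

```go
package main

import "fmt"

// Trimmed copies of the Instance and Resource structs from this post.
type Instance struct {
    IndexKey   interface{}
    Attributes map[string]interface{}
}

type Resource struct {
    Type      string
    Name      string
    Instances []Instance
}

// target identifies what to change. A nil IndexKey matches every instance
// of the named resource. Numeric keys decoded from JSON are float64, so a
// count index of 1 has to be given as float64(1).
type target struct {
    Type     string
    Name     string
    IndexKey interface{}
}

// setAttribute writes value into field on every matching instance, in
// place, and returns how many instances were changed.
func setAttribute(resources []Resource, t target, field, value string) int {
    changed := 0
    for ri := range resources {
        r := &resources[ri]
        if r.Type != t.Type || r.Name != t.Name {
            continue
        }
        for ii := range r.Instances {
            inst := &r.Instances[ii]
            if t.IndexKey != nil && inst.IndexKey != t.IndexKey {
                continue
            }
            if inst.Attributes == nil {
                inst.Attributes = map[string]interface{}{}
            }
            inst.Attributes[field] = value
            changed++
        }
    }
    return changed
}

func main() {
    resources := []Resource{{
        Type: "google_storage_bucket",
        Name: "this",
        Instances: []Instance{
            {IndexKey: float64(0), Attributes: map[string]interface{}{"location": "A"}},
            {IndexKey: float64(1), Attributes: map[string]interface{}{"location": "A"}},
        },
    }}
    n := setAttribute(resources, target{Type: "google_storage_bucket", Name: "this", IndexKey: float64(1)}, "location", "B")
    fmt.Println(n, resources[0].Instances[0].Attributes, resources[0].Instances[1].Attributes)
}
```

Returning the number of changed instances gives the caller a cheap sanity check before anything is uploaded: zero probably means the target was mistyped.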

For this to be useful in day-to-day work, some further enhancement would be required. However, I have used this a little for updating and deleting items (I did not cover state deletion), as it was a clean way to deal with a particular situation.

Most importantly, this does prove out that dealing with TFE state can be done programmatically in lieu of an official API from HashiCorp, and it was pretty enjoyable to muck around with.

Hopefully this is of some use to someone out there having to deal with state, or who is just interested in a little bit of how to code against the TFE APIs.