# How Terraform (sort of) works
This document is written for provider developers on TPG. It's both incomplete and inaccurate. However, it's a useful model for explaining interactions between Terraform Core and the Terraform Plugin SDK. In particular, the Resource Instance Change Lifecycle document is a great summary of how the Terraform binary ("Core") understands interactions (most of the state terms here are drawn from there), but it doesn't map well to the SDK's provider framework that we use.
This model will freely cross between Core and the SDK: the goal is to map what users see to what developers write, not to explain the protocol, Core, or SDK accurately.
Terraform users will generally use five commands to interact with Terraform: `terraform apply`, `terraform plan`, `terraform import`, `terraform refresh`, and `terraform destroy`. In addition, they'll have a statefile in their current directory (generally `terraform.tfstate`) and one or more config files like `main.tf`. Prior to any command, Terraform performs a validation step where it calls `ValidateFunc`s directly on the raw values written in a user's config. If values are "unknown", such as values drawn from an interpolation, they're not validated.
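A `ValidateFunc` in the SDK has the shape `func(v interface{}, k string) ([]string, []error)` and sees only the single raw config value it's attached to. A minimal sketch of one, written standalone without the SDK (the `port` field name and its bounds are invented for illustration):

```go
package main

import "fmt"

// validatePort mirrors the SDK's ValidateFunc signature:
// it receives one raw config value and its key, and returns
// warnings and errors. It cannot see any other field.
func validatePort(v interface{}, k string) (warnings []string, errs []error) {
	port, ok := v.(int)
	if !ok {
		errs = append(errs, fmt.Errorf("%q must be an integer", k))
		return
	}
	if port < 1 || port > 65535 {
		errs = append(errs, fmt.Errorf("%q must be between 1 and 65535, got %d", k, port))
	}
	return
}

func main() {
	_, errs := validatePort(80, "port")
	fmt.Println("errors for 80:", len(errs))
	_, errs = validatePort(70000, "port")
	fmt.Println("errors for 70000:", len(errs))
}
```

Because validation runs on raw config values, an interpolated ("unknown") value never reaches a function like this.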
Each of those five user-facing commands is roughly made up of some combination of two actions: "refresh" and "apply". A refresh is when Terraform reads the current state of a resource from the API and keeps track of it as the "prior state". It knows what resource and region to use based on the statefile. Most providers only need the `id` field, which contains a unique identifier; in TPG, however, we tend to draw directly from fields like `project` and `name`. An apply is when Terraform performs the appropriate actions to bring the resource from its prior state to a new state, the "planned new state". The real state at the end of an apply is called the "new state".
We can model those five commands in terms of refresh and apply as follows:
- `terraform refresh` performs a refresh, writing the prior state into the statefile, replacing the old contents
- `terraform import` writes the supplied id into the statefile and then performs a refresh, writing the prior state into the statefile
- `terraform plan` performs a refresh, and compares the prior state to the "proposed new state" to create the planned new state. It displays the difference between the prior state and planned new state to the user
- `terraform apply` implicitly performs `terraform plan`. If the user approves the change, it performs an apply.
- `terraform destroy` is a convenient way to call `terraform apply` where the "proposed new state" is empty (indicating that the resource should be deleted)
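The refresh/apply decomposition above can be sketched as a toy model, with states as string maps and a single in-memory "API" resource (all names and values here are invented for illustration):

```go
package main

import "fmt"

// state is a toy resource state: attribute name -> value.
type state map[string]string

// remote stands in for the real resource behind the API.
var remote = state{"id": "res-1", "port": "80"}

// refresh reads the real resource: the result is the "prior state".
func refresh() state {
	prior := state{}
	for k, v := range remote {
		prior[k] = v
	}
	return prior
}

// apply drives the API toward the planned new state; the real
// result at the end is the "new state".
func apply(planned state) state {
	remote = planned
	return remote
}

// plan performs a refresh and builds the planned new state from config.
// (Real Terraform builds a "proposed new state" first; simplified here.)
func plan(config state) (prior, planned state) {
	prior = refresh()
	planned = config
	return
}

func main() {
	prior, planned := plan(state{"id": "res-1", "port": "8080"})
	fmt.Println("diff: port", prior["port"], "->", planned["port"])
	apply(planned)
	fmt.Println("remote port is now", remote["port"])
}
```

In these terms, `terraform refresh` is just `refresh()`, `terraform plan` is `plan()`, and `terraform apply` is `plan()` followed by `apply()`.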
It's clear how Terraform uses the statefile: it's roughly a serialized state. However, the user's config isn't consumed directly. Instead, Terraform uses it to build the proposed new state. To do so, Terraform copies config values directly into the proposed state and copies `Computed` values from the prior state (when not present there, they get the special value "unknown"). Any other values are assumed to have the corresponding zero value for their type (`""`, `0`, `false`, etc.). `Optional`+`Computed` values are treated like a `Computed` value when unset, and a normal value when set.
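Those copy rules can be sketched as a small function. This is a toy with string-valued attributes only; a single `computed` flag stands in for both `Computed` and unset `Optional`+`Computed` fields (which behave the same way here), and the field names are invented:

```go
package main

import "fmt"

// unknown stands in for Terraform's special "unknown" value.
const unknown = "(unknown)"

type attr struct {
	computed bool
}

// buildProposed assembles a proposed new state: config values are copied
// directly; Computed attributes take the prior value, or "unknown" if the
// prior has none; everything else gets its zero value.
func buildProposed(schema map[string]attr, config, prior map[string]string) map[string]string {
	proposed := map[string]string{}
	for name, a := range schema {
		if v, ok := config[name]; ok {
			proposed[name] = v // set in config: copied directly
			continue
		}
		if a.computed {
			if v, ok := prior[name]; ok {
				proposed[name] = v // Computed: carried over from the prior state
			} else {
				proposed[name] = unknown // no prior value yet (e.g. before create)
			}
			continue
		}
		proposed[name] = "" // unset non-Computed: zero value for its type
	}
	return proposed
}

func main() {
	schema := map[string]attr{
		"name":        {},
		"description": {},
		"fingerprint": {computed: true},
		"self_link":   {computed: true},
	}
	config := map[string]string{"name": "vm-1"}
	prior := map[string]string{"name": "vm-1", "fingerprint": "abc123"}
	p := buildProposed(schema, config, prior)
	fmt.Println("fingerprint:", p["fingerprint"]) // carried from prior
	fmt.Println("self_link:", p["self_link"])     // not in prior: unknown
	fmt.Println("description:", p["description"]) // zero value
}
```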
It's also worth noting that any time Terraform creates a state, it will run `StateFunc` functions on each field that has one, allowing the state to be modified. They let us modify values in the state, but they only have access to that single field's value. In practice we don't use them much.
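A `StateFunc` in the SDK has the shape `func(interface{}) string`: one field's value in, the form to store in state out. A minimal sketch (canonicalising a region name to lowercase is a hypothetical example, not something TPG necessarily does):

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalizeRegion mirrors the SDK's StateFunc signature. It sees only
// this one field's value and returns what should be written to state.
func canonicalizeRegion(v interface{}) string {
	return strings.ToLower(v.(string))
}

func main() {
	fmt.Println(canonicalizeRegion("US-CENTRAL1")) // stored as "us-central1"
}
```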
During Apply, Terraform assumes that the planned new state and new state are identical by default. The `ResourceData` `d` has a few different meanings depending on the CRUD method:
- In Create, `d.Get` draws from the planned new state and `d.Set` sets the new state
- In Delete, `d.Get` draws from the prior state
- In Update, `d.Get` draws from the planned new state, `d.GetChange` from (prior state, planned new state), and `d.Set` sets the new state
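The Update case can be sketched with a toy `ResourceData` that makes explicit which state each accessor touches (a simplification of the SDK's real type, with string values only):

```go
package main

import "fmt"

// resourceData is a toy stand-in for the SDK's ResourceData during Update:
// Get reads the planned new state, GetChange returns the (prior, planned)
// pair, and Set records the new state observed after the API call.
type resourceData struct {
	prior, planned, newState map[string]string
}

func (d *resourceData) Get(k string) string { return d.planned[k] }

func (d *resourceData) GetChange(k string) (string, string) {
	return d.prior[k], d.planned[k]
}

func (d *resourceData) Set(k, v string) { d.newState[k] = v }

func main() {
	d := &resourceData{
		prior:    map[string]string{"port": "80"},
		planned:  map[string]string{"port": "8080"},
		newState: map[string]string{},
	}
	old, new := d.GetChange("port")
	fmt.Printf("update port %s -> %s\n", old, new)
	d.Set("port", d.Get("port")) // record what the API now reports
}
```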
Otherwise, Apply isn't very exciting: Terraform calls the appropriate provider methods.

During `terraform plan`, the planned new state is created and then diffed against the prior state to show a diff to the user. During an apply, the planned new state is the desired state for the resource to reach.
During `terraform plan`, the planned new state is created by:

- Filling in unset values in the proposed new state that have a `Default` with that value
- Running `CustomizeDiff`
- Running `DiffSuppressFunc`s (DSFs)
DSFs were added to the provider SDK before `CustomizeDiff`. They're very constrained, and can only see the value of the current field (or subfields if that field is a block). They can return `true` to indicate that both values for the field are identical. If they do, Terraform discards the proposed new state's value and replaces it with the prior state's value.
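A DSF in the SDK has the shape `func(k, old, new string, d *schema.ResourceData) bool`. The sketch below drops the `ResourceData` argument to stay self-contained; treating values as equal case-insensitively is a hypothetical example of a suppressible difference:

```go
package main

import (
	"fmt"
	"strings"
)

// caseDiffSuppress mirrors a DiffSuppressFunc: old is the prior state's
// value, new the proposed one. Returning true tells Terraform the values
// are equivalent, so the prior state's value is kept and no diff is shown.
func caseDiffSuppress(k, old, new string) bool {
	return strings.EqualFold(old, new)
}

func main() {
	fmt.Println(caseDiffSuppress("machine_type", "N1-STANDARD-1", "n1-standard-1"))
	fmt.Println(caseDiffSuppress("machine_type", "n1-standard-1", "n1-standard-2"))
}
```

Note the function sees only this one field's `old` and `new` values, which is exactly the constraint described above.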
`CustomizeDiff` is much more flexible. The `ResourceDiff` `diff` is available, which is roughly a superset of `d`. In addition to `d`'s ability to read a whole resource state, `diff` can modify the planned new state, return errors, and clear the diffs on fields like a DSF. In theory, `ValidateFunc`, `DiffSuppressFunc`, and `Default` could all be implemented in terms of `CustomizeDiff`.
As highlighted above, you've got a number of tools to modify a user's config and make it useful. To summarise them again, they're:
- `ValidateFunc`s allow you to reject configs based on a single field being invalid
- `Default` values fill in unset values during `terraform plan` diffs and `terraform apply`
- `StateFunc`s let you canonicalise values when states are created (but TPG doesn't use them much)
- `DiffSuppressFunc`s let you tell Terraform to keep the value from the prior state if it's identical to the one in config
- `Optional`+`Computed` fields tell Terraform to handle them as if they're output-only when unset, and configurable when set
Finally, `CustomizeDiff` is incredibly powerful, effectively allowing the provider to perform arbitrary transformations. It's somewhat dangerous to use, as it's very easy to make a transformation that Terraform Core will reject. Because of that, other more focused tools should be preferred. These are some cases where `CustomizeDiff` can solve otherwise unsolvable problems:
- Conditionally setting fields as `ForceNew` to indicate the resource should be recreated. For example, allowing disks to size up but not down.
- Adding custom error messages. For example, App Engine applications can't be moved once created. Instead of an erroneous `ForceNew`, TPG returns an error if a user attempts to move one.
- Complex validations. For example, asserting that one value must not be greater than another, or that if one field is set, another must be.
- Adding conditional defaults based on the value of another field.
- Rewriting the planned new state for a value. For example, reordering a list to match the prior state when possible.
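The disk-sizing case can be sketched without the SDK. In the real provider, `CustomizeDiff` receives a `*schema.ResourceDiff` and can call methods like `ForceNew` and `SetNew` or return an error; here a toy diff struct stands in for that, and the field values are invented:

```go
package main

import (
	"errors"
	"fmt"
)

// toyDiff stands in for the SDK's ResourceDiff for a single size field.
type toyDiff struct {
	old, new int  // prior and planned values of the size field
	forceNew bool // set to require recreating the resource
}

// customizeDiskDiff allows a disk to size up in place but forces
// recreation when it would shrink, and returns a custom error for
// an invalid planned value.
func customizeDiskDiff(d *toyDiff) error {
	if d.new <= 0 {
		return errors.New("size must be positive")
	}
	if d.new < d.old {
		d.forceNew = true // shrinking requires recreating the disk
	}
	return nil
}

func main() {
	d := &toyDiff{old: 100, new: 50}
	if err := customizeDiskDiff(d); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("force new:", d.forceNew)
}
```

The same shape covers the custom-error case above: instead of setting `forceNew`, the function simply returns an error describing why the change is impossible.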
The split between Core and the SDK is still somewhat new, and we're in the middle of growing pains. Terraform 0.12 included a major overhaul of Core according to a holistic view of how providers and the SDK should work, even when that view went against SDK convention or was impossible to fulfill. For example, the Resource Instance Change Lifecycle page lists many assertions that values stay the same between states which providers today do not fulfill. Today they're all warnings, but Core's assertions are intended to become errors in the future.
One particularly amusing example is `Default` values. As implemented by the SDK, they cause the following error:

- `.port: planned value cty.NumberIntVal(80) does not match config value cty.NullVal(cty.Number)`
Another is `Optional`+`Computed`. It was never intended to work quite the way it does, especially when used with `TypeList` and `TypeSet`. However, there's no viable alternative.
For TPG, the issue "Fit our state-setting model to the protocol" is the largest divide between the providers / SDK (and the model presented here) and Core.