skip_child_token and other boolean ProviderConfig fields ignored in v3.0.x (no-fork mode incompatibility with terraform-provider-vault v5.x) #122

@Ata-NovacekJan

What happened?

After upgrading provider-vault from v2.2.2 to v3.0.4, the skip_child_token: true setting in ProviderConfig is silently ignored. All Vault managed resources (e.g., Mount) fail to reconcile with:

observe failed: failed to observe the resource: [{0 failed to create limited child token:
Error making API request. URL: POST http://<vault-addr>/v1/auth/token/create
Code: 400. Errors: * batch tokens cannot create more tokens []}]

Expected: The provider should respect skip_child_token: true and use the provided token directly, without attempting to create a child token. This worked correctly in v2.2.2.

Actual: The provider ignores skip_child_token: true and always attempts to create a child token. When the auth token is a batch token (which cannot create child tokens), all operations fail.

How can we reproduce it?

  1. Deploy provider-vault v3.0.4
  2. Configure a Vault auth method that issues batch tokens (e.g., Kubernetes auth with token_type=batch, or a static batch token in a Secret)
  3. Create a ProviderConfig with skip_child_token: true:
    apiVersion: vault.upbound.io/v1beta1
    kind: ProviderConfig
    metadata:
      name: example
    spec:
      address: "http://vault:8200"
      skip_child_token: true
      credentials:
        source: Secret
        secretRef:
          key: providerAuth
          name: vault-crossplane-provider-auth
          namespace: default
  4. Create any managed resource referencing this ProviderConfig (e.g., a Mount)
  5. Observe the reconciliation error: batch tokens cannot create more tokens
  6. Downgrade to provider-vault v2.2.2 with the same ProviderConfig — it works correctly
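
Before step 4 it can help to confirm that the token stored in the Secret really is a batch token. The following is a minimal sketch, assuming the token can read Vault's `auth/token/lookup-self` endpoint and that the response carries the token type under `data.type`. The `VAULT_ADDR` and `VAULT_TOKEN` environment variables and the `parseTokenType` helper are choices made for this sketch only; they are not part of provider-vault.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// parseTokenType extracts data.type from a lookup-self response body.
// Hypothetical helper for this sketch, not part of any Vault client library.
func parseTokenType(body []byte) (string, error) {
	var resp struct {
		Data struct {
			Type string `json:"type"`
		} `json:"data"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if resp.Data.Type == "" {
		return "", fmt.Errorf("response has no data.type field")
	}
	return resp.Data.Type, nil
}

// lookupSelf asks Vault to describe the token that is making the request.
func lookupSelf(addr, token string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, addr+"/v1/auth/token/lookup-self", nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("X-Vault-Token", token)
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer res.Body.Close()
	body, err := io.ReadAll(res.Body)
	if err != nil {
		return "", err
	}
	return parseTokenType(body)
}

func main() {
	addr, token := os.Getenv("VAULT_ADDR"), os.Getenv("VAULT_TOKEN")
	if addr == "" || token == "" {
		fmt.Println("set VAULT_ADDR and VAULT_TOKEN to query Vault")
		return
	}
	typ, err := lookupSelf(addr, token)
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup failed:", err)
		os.Exit(1)
	}
	fmt.Println("token type:", typ)
}
```

If this reports `batch`, any attempt by the provider to create a child token should fail with the error shown above.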

What environment did it happen in?

  • provider-vault: v3.0.4 (probably also affects v3.0.0 – v3.0.3)
  • Underlying terraform-provider-vault: v5.2.1 (via upbound fork v5.2.1-upjet.1)
  • Previously working: provider-vault v2.2.2 (terraform-provider-vault at commit 0318b6b4523e)
  • Vault server: 1.18.3
  • Crossplane: v2.1.4
  • Kubernetes: managed (AKS/EKS)

Possible root cause

Warning: I used an AI search assistant to crawl through these unfamiliar repositories faster than I could by hand, so the analysis below may provide some leads but could be wrong.

provider-vault v3.0.x upgraded the underlying terraform-provider-vault from a v4.x commit to v5.2.1. This version changed how boolean provider config fields are read in internal/provider/meta.go.

In v2.2.2 (terraform-provider-vault at commit 0318b6b4523e), skip_child_token was read via a simple d.Get():

// meta.go:331
if !d.Get(consts.FieldSkipChildToken).(bool) {
    token, err = createChildToken(d, client, tokenNamespace)
}

In v3.0.4 (terraform-provider-vault v5.2.1), it is read via GetResourceDataBool:

// meta.go:337
skipChildToken := GetResourceDataBool(d, consts.FieldSkipChildToken, consts.EnvVarSkipChildToken, false)

GetResourceDataBool uses d.GetRawConfig() to distinguish "unset" from "explicitly false":

func GetResourceDataBool(d *schema.ResourceData, field, env string, dv bool) bool {
    rawConfig := d.GetRawConfig()
    if rawConfig.IsNull() {
        return dv  // always returns false in Crossplane no-fork mode!
    }
    // env var check and d.Get() are never reached
}

The code itself comments: "RawConfig will only be available during a terraform plan/apply execution". In Crossplane's upjet no-fork mode, the provider is configured via schema.Provider.Configure() with a tfsdk.ResourceConfig, not through a Terraform plan/apply cycle. GetRawConfig() always returns a null cty.Value, so the function returns the default value — regardless of what the ProviderConfig specifies.

The environment variable fallback (TERRAFORM_VAULT_SKIP_CHILD_TOKEN) also does not work because it is only checked when rawConfig is non-null but the specific field is null.

Note that GetResourceDataStr and GetResourceDataInt still use d.Get() directly and are not affected. All boolean fields read via GetResourceDataBool are affected:

| Field | Default | Impact |
| --- | --- | --- |
| skip_child_token | false | Always creates child token (breaks batch token auth) |
| skip_tls_verify | false | Cannot enable insecure TLS |
| skip_get_vault_version | false | Always queries /sys/seal-status |
| set_namespace_from_token | true | Accidentally works (default is true) |

Suggested fix: In the upbound fork of terraform-provider-vault (v5.2.1-upjet.1), patch GetResourceDataBool to fall back to d.Get() when rawConfig is null:

func GetResourceDataBool(d *schema.ResourceData, field, env string, dv bool) bool {
    rawConfig := d.GetRawConfig()
    if rawConfig.IsNull() {
-       return dv
+       if v, ok := d.Get(field).(bool); ok {
+           return v
+       }
+       return dv
    }
    // ... rest unchanged
}
