Support for nested compositions and plugin cache locking issues #120

@repl-chris

I'm working with nested compositions using provider-opentofu. Specifically, I have an outer composition that creates a claim resource. The outer composition waits for this claim to become ready, then reads certain outputs from it for use in other dependent resources. The inner claim itself also uses an opentofu-based composition.

When running a single instance of the controller, this setup times out - the outer composition waits for the inner claim to be ready, but the reconciliation thread can't process the inner workspace since it's still busy executing the outer workspace (which is, in turn, waiting for the inner one).

I've looked at the --max-reconcile-rate setting, which appears to be designed for cases like this, but increasing it doesn't resolve the issue: the outer composition still times out waiting on the inner claim. My suspicion is that this relates to locking around the shared plugin cache. While the outer workspace is running tofu apply it holds a reader lock, which blocks the inner workspace from acquiring the writer lock it needs for its own tofu init and thus stalls progress.

Disabling the shared cache seems like a decent option, but the documentation (Configuration.md) strongly warns about significant increases in memory consumption. My assumption is that with the shared cache disabled, each tofu process loads (and releases) its own plugins. Is the warning solely about temporary increases in memory when many tofu processes run in parallel, each loading its own copy of the libraries (unable to use the OS-level shared library mapping/cache)? Or is there another source of bloat or shared-memory accumulation I should be aware of when running this way?

Alternatively, we could take a completely different direction: run multiple controller replicas, each reconciling workspaces independently and relying on Terraform's state file locks. This doesn't seem advisable. I haven't found any references to anyone running that way, and frequent (expected) lock contention would likely lead to resources being marked unhealthy by the provider.

I’d appreciate any insight, especially around:

  • whether disabling the plugin cache is safe and practical for this "nested composition" use-case,
  • whether there are other recommended approaches for handling this kind of parent/child composition relationship,
  • and any further insight on the nature of plugin/provider memory management in this controller.

Thanks!
