Observation: Deterministic structured runtime behavior inside ChatGPT (FRR demo + reproducible pipeline) #1397

@yuer-dsl

Description

@yuer-dsl

Hi Guidance team,

I’ve been running a series of experiments to test whether a general-purpose LLM can behave like a deterministic structured runtime when placed under strong constraints — without any external tools, APIs, plugins, or execution frameworks.

Surprisingly, the results were stable and fully reproducible, so I'm sharing them here in case they are relevant to your work on structured prompting and constrained generation.

🚀 Summary of the experiment

I built a miniature Flight Readiness Review (FRR) Runtime that forces ChatGPT into an 8-step deterministic pipeline:

1. Input parsing
2. Normalization
3. Factor engine (F1–F12)
4. Global RiskMode
5. Subsystem evaluation
6. KernelBus arbitration
7. Counterfactual reasoning
8. A strict FRR_Result block (no free-form output allowed)

Key property:

Same input → same output
(zero drift, zero narrative expansion)

This emergent deterministic behavior is what caught my attention.
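The "same input → same output" property above can be checked mechanically. A minimal sketch of such a harness, assuming `run_model` is any callable that maps a telemetry string to the model's text output (the real runtime in the linked repo is prompt-only, so this stand-in is purely illustrative):

```python
import hashlib

def check_determinism(run_model, telemetry_input, n_runs=5):
    """Run the same input several times and compare output hashes.
    Deterministic behavior means exactly one unique hash across runs."""
    hashes = set()
    for _ in range(n_runs):
        output = run_model(telemetry_input)
        hashes.add(hashlib.sha256(output.encode("utf-8")).hexdigest())
    return len(hashes) == 1

# Stand-in deterministic "model" for demonstration only:
fake_model = lambda s: f"FRR_Result: GO (input={s})"
print(check_determinism(fake_model, "cold O-ring telemetry"))  # → True
```

Comparing hashes rather than raw strings makes the check insensitive to output length and easy to log across many scenarios.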

📡 Reproducible Test Scenarios

I ran the runtime against several historical-style telemetry snapshots (e.g., cold O-rings, COPV instability, wind-shear cases).

Even though these are not aerospace simulations, the behavior was consistently deterministic:

Stable factor vectors

Stable subsystem arbitration

Stable final decision

No deviation across runs

This reminded me of the constraints and patterns that Guidance tries to formalize.
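Enforcing "no free-form output allowed" also lends itself to automated validation. Below is a hypothetical validator for a strict FRR_Result block; the field names, decision values, and layout here are illustrative assumptions, not the actual spec from the repo:

```python
import re

# Hypothetical FRR_Result layout (illustrative only; the real spec
# lives in the linked qtx-frr-runtime repo):
FRR_RESULT_PATTERN = re.compile(
    r"^FRR_Result:\n"
    r"  decision: (GO|NO-GO|HOLD)\n"
    r"  risk_mode: (LOW|ELEVATED|CRITICAL)\n"
    r"  factors: \[(F\d+(, F\d+)*)?\]$"
)

def is_strict_frr_block(text: str) -> bool:
    """Accept only a bare FRR_Result block; any surrounding
    free-form prose or trailing narrative fails the match."""
    return FRR_RESULT_PATTERN.match(text.strip()) is not None

sample = ("FRR_Result:\n"
          "  decision: NO-GO\n"
          "  risk_mode: CRITICAL\n"
          "  factors: [F3, F7]")
print(is_strict_frr_block(sample))  # → True
```

Anchoring the pattern at both ends (`^...$` on the whole stripped string) is what catches narrative expansion: a single extra sentence before or after the block makes the check fail.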

🎥 Demo Video (3 minutes)

A short screen recording of the deterministic FRR runtime running inside the ChatGPT client:

https://youtu.be/9R6wc-LVzSc

📦 GitHub Repo (safe, prompt-only)

Spec + soft-system prompt + sample telemetry inputs:

https://github.com/yuer-dsl/qtx-frr-runtime

🔍 Why post this here

This is not a feature request — only an observation:

Strong structural constraints appear to induce deterministic, pipeline-like execution behavior inside an LLM, even without tools.

Given Guidance’s focus on:

structured output

constrained LLM execution

reproducible reasoning

multi-step control flows

I thought this phenomenon might be of interest for future discussions or evaluation benchmarks.

Happy to provide simplified test cases or a reduced prompt if helpful.

Thanks!
