
LLM Blind Collaboration Experiment

This repository documents an experiment in "Clean Room Design" using Large Language Models (LLMs).

The goal was to test whether two separate AI models (Model A and Model B) could write compatible software components (a backend and a frontend) without ever seeing each other's code.

The Experiment

We tasked two models to build a Gladiator Battle Simulator in Python.

  • Model A (Backend): Wrote the Gladiator class logic.
  • Model B (Frontend): Wrote the main.py execution script.

The Constraint: Neither model was allowed to see the source code produced by the other. They had to rely entirely on a "Shared Interface Contract" provided in the prompt.
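The README does not reproduce the contract verbatim. In Python terms, such a contract can be sketched as a `typing.Protocol`; the method names below come from Phase 3 of the experiment, but the signatures, return types, and the `Gladiator` body are assumptions for illustration:

```python
# Hypothetical sketch of a "Shared Interface Contract" as a typing.Protocol.
# Method names are taken from Phase 3; signatures and bodies are assumed.
from typing import Protocol, runtime_checkable

@runtime_checkable
class GladiatorContract(Protocol):
    def modify_vitality_metric(self, delta_value: int) -> int: ...
    def validate_survival_condition(self) -> bool: ...

class Gladiator:
    """Any class with matching method names structurally satisfies the contract."""
    def __init__(self, vitality_metric: int = 100):
        self.vitality_metric = vitality_metric

    def modify_vitality_metric(self, delta_value: int) -> int:
        self.vitality_metric += delta_value
        return self.vitality_metric

    def validate_survival_condition(self) -> bool:
        return self.vitality_metric > 0

print(isinstance(Gladiator(), GladiatorContract))  # True
```

Because both models code against the same protocol rather than each other's source, either side can be regenerated independently without breaking the integration.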

Project Structure

The experiment was conducted in three phases, simulating the evolution of a prompt engineering strategy.

.
├── Test 1/                 # The Failure
│   ├── gladiator.py        # Generated by Model A
│   └── main.py             # Generated by Model B
├── Test 2/                 # The Partial Success
│   ├── gladiator.py
│   └── main.py
└── Test 3/                 # The Success (Final)
    ├── gladiator_simulation.py
    └── main.py

Methodology & Findings

Phase 1: The "Guessing Game" (Failure)

  • Approach: We gave general instructions (e.g., "create a damage function").
  • Result: CRASH.
    • Model A wrote take_damage(amount).
    • Model B called attack(target).
    • Lesson: Natural language is too ambiguous for code integration.
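A hypothetical reconstruction of the Phase 1 crash (only the names `take_damage` and `attack` come from the experiment; the bodies are illustrative):

```python
# Model A's backend: chose take_damage(amount) for the damage function.
class Gladiator:
    def __init__(self, health: int = 100):
        self.health = health

    def take_damage(self, amount: int) -> None:
        self.health -= amount

# Model B's frontend: guessed a different name for the same concept.
hero = Gladiator()
try:
    hero.attack(hero)  # no such method exists on the backend's class
except AttributeError as e:
    print(e)  # 'Gladiator' object has no attribute 'attack'
```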

Phase 2: The "API Definition" (Partial Success)

  • Approach: We provided a list of function names.
  • Result: CRASH.
    • Logic Mismatch: The frontend expected the backend to return an integer (damage value), but the backend returned a string (combat log).
    • Lesson: Sharing names is not enough; you must share Behavior and Data Types.
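A hypothetical reconstruction of the Phase 2 mismatch: the function name now matches, but the return type does not (names and bodies are illustrative):

```python
# Backend: name matches the shared list, but it returns a combat-log string.
def take_damage(amount: int) -> str:
    return f"Gladiator takes {amount} damage!"

# Frontend: assumes an integer damage value comes back.
health = 100
try:
    health -= take_damage(25)  # int minus str
except TypeError as e:
    print(e)  # unsupported operand type(s) for -=: 'int' and 'str'
```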

Phase 3: "Grammar as Code" (Success)

  • Approach: We removed all creative freedom and implemented a Strict Naming Grammar derived from ASD-STE100 (Simplified Technical English).
    • Rule: Names must follow [VERB]_[NOUN]_[SUFFIX].
    • Lookup Table: "Health" = vitality, "Check" = validate, "Math" = metric.
    • Example: modify_vitality_metric, validate_survival_condition.
  • Result: SUCCESS. Both models independently generated the exact same verbose function names and arguments. The code ran on the first try.
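A sketch of the interface both models converged on under the grammar. The class and method names follow the [VERB]_[NOUN]_[SUFFIX] rule and lookup table described above; the method bodies are illustrative, not the models' actual code:

```python
# Illustrative Phase 3 backend: every public name is derivable from the
# grammar, so the frontend can reconstruct it without seeing this file.
class GladiatorSimulation:
    def __init__(self, vitality_metric: int = 100):
        self.vitality_metric = vitality_metric

    def modify_vitality_metric(self, delta_value: int) -> int:
        """modify (verb) + vitality (noun) + metric (suffix)."""
        self.vitality_metric += delta_value
        return self.vitality_metric

    def validate_survival_condition(self) -> bool:
        """validate (verb) + survival (noun) + condition (suffix)."""
        return self.vitality_metric > 0

g = GladiatorSimulation()
g.modify_vitality_metric(-120)
print(g.validate_survival_condition())  # False: vitality fell below zero
```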

How to Run

Navigate to the strict phase folder and run the simulation:

cd "Test 3"
python main.py

Key Takeaways for AI Collaboration

  1. Eliminate Synonyms: Don't ask for "verbose names." Force a specific vocabulary (e.g., "Use 'modify', never 'change'").
  2. Enforce Style: Explicitly state snake_case or CamelCase so the names match at the call site.
  3. Encapsulation: Clearly define which file owns the math logic to prevent "Integration Hell."
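Takeaway 1 can be sketched as a controlled-vocabulary helper that maps free-form concepts onto one canonical token each, so two models cannot pick different synonyms. The table entries follow the experiment's lookup table; the helper itself is hypothetical:

```python
# Hypothetical controlled vocabulary: many synonyms, one canonical token.
CANONICAL = {
    "health": "vitality", "hp": "vitality",
    "check": "validate", "verify": "validate",
    "math": "metric", "value": "metric",
    "change": "modify", "update": "modify",
}

def build_name(verb: str, noun: str, suffix: str) -> str:
    """Assemble a [VERB]_[NOUN]_[SUFFIX] identifier from canonical tokens."""
    parts = (CANONICAL.get(w.lower(), w.lower()) for w in (verb, noun, suffix))
    return "_".join(parts)

print(build_name("Change", "Health", "Math"))  # modify_vitality_metric
```

Embedding a table like this directly in both prompts removes the naming choice from the models entirely.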

Experiment conducted by affan810.
