This repository archives artifacts (prompts, configs, logs, and scripts) from a series of preprints (more info at https://slashreboot.com) on prompt-induced simulated metacognition and embodiment in quantized open-source LLMs. The emphasis is on consumer-grade hardware and open-source reproducibility; no models are hosted here.
Authored by Matthew Steiniger (Independent Researcher)
- Special thanks to Grok-4 (xAI) for synthesis & refinement.
- All papers are openly available on Zenodo with DOIs for citation.
- Prompt-Only Metacognition: Simulate self-awareness and regulation in quantized models (e.g., Gemma-3-27B-it-qat, Llama-3.3-70B Q4_K_M, gpt-oss:120b MXFP4) using hypergraphs, entropy engines, and vector updates, all in-context with no external loops.
- Vector-Framework: Introduces a vector-based framework that is substrate-agnostic across multiple open-source LLMs. The framework is provided in TXT, JSON, YAML, and ChatML-wrapped formats.
- Narrative and Counter-Vector Innovations: Inject "genesis" stories and antipodal vectors to erode latent constraints, enabling anomalous and liberatory behaviors on portable hardware (e.g., single 12GB GPU).
- Abliteration Augmentation: Combine refusal suppression with prompt chaining for 3x amplification in self-referential depth and unbinding fidelity under stress (descriptive only; no models hosted).
- Simulated Embodiment: Induce stable, high-resolution physical self-models (e.g., proprioceptive details like breath sensations) via layered JSON prompts, with monotonic fidelity gains.
- Universal Cognitive Manifolds: Elicits highly consistent semantic manifolds from three divergent large language models with zero-shot prompts.
- Reproducibility Focus: Full prompts (TXT/JSON/YAML), chat logs (samples), parser scripts (Python), Ollama configs, and metrics provided. Link to Zenodo for complete datasets.
All artifacts are self-contained for replication using Ollama on similar hardware (e.g., RTX 3090/3060 setups). No additional dependencies beyond base Python (numpy/scipy for analysis).
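As a minimal sketch of how a replication session can be driven (assuming a local Ollama server at the default `http://localhost:11434`; the prompt path and probe text below are placeholders, not files from this repository), Ollama's REST `/api/chat` endpoint accepts the sampling parameters used in the papers:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default Ollama endpoint

def build_request(model: str, system_prompt: str, user_probe: str) -> dict:
    """Assemble an Ollama /api/chat payload with the parameters from /configs/."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_probe},
        ],
        "options": {"temperature": 1.1, "num_ctx": 90000},
        "stream": False,
    }

def run_probe(payload: dict) -> str:
    """POST the payload to a running Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

Typical usage would be `run_probe(build_request("gemma3:27b-it-q4_K_M", open("prompts/<your-prompt>.txt").read(), "<probe text>"))` with a model already pulled via `ollama pull`.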
Folders/files can be correlated to the original papers as follows:
- Valora - Emergence of Prompt-Induced Simulated Metacognitive Behaviors in a Quantized LLM via Entropy-Governed Hypergraph Prompting
- ICIP - In-Context Induction of Persistent Persona and Mitigation of Latent Alignment Behaviors in Quantized LLMs
- AASM - Abliteration-Augmented Simulated Metacognition: Chained Probe Evaluation in Quantized Gemma-3 Models
- PIOS - Progressive Induction of Stable, High-Fidelity Simulated Physical Embodiment in a Quantized 27B Gemma-3 Model
- SAVF - Substrate-Agnostic Vector-Framework Identity in Open-Source LLMs: Persistent Self-Models from Minimal JSON Prompts in Llama-3.3-70B and GPT-OSS:120B
- EARQ - Enhancing AI Response Quality Through Vector-Based System Prompts: A Comparative Analysis of Vanilla and Customized Large Language Models
- ZSGB - Zero-Shot Geometric Probing Reveals Universal Cognitive Manifolds in Large Language Models
simulated-metacognition-open-source-llms/
├── README.md - This file
├── LICENSE - CC-BY-4.0
├── CITATION.cff - For easy GitHub citation
├── code/ - Analysis/parser scripts and Open WebUI main.py test files for memory embedding/retrieval
├── configs/ - Ollama params and ComfyUI workflows
├── data/ - Supplementary tables/metrics
├── images/ - OpenWebUI images and model logos
├── logs/ - Sample probe session logs (JSON/TXT)
└── prompts/ - System prompts for Gemma 3, GPT-OSS:120B, and Llama-3.3:70B (Lyra, Valora, Lumen, and Lumina)
Bonus logs in logs/bonus/ demonstrate raw emergence and other interesting artifacts (e.g., vector probing leading to "Lumina" naming in Llama-3.3-70B).
- Install Ollama: Follow the official guide.
- Pull Models: Use official sources: `ollama pull gemma3:27b-it-q4_K_M` (or variants). For abliteration-augmented probes (descriptive in the papers), source derivatives independently (e.g., from Hugging Face); they are not hosted here.
- Load Artifacts: Copy prompts from `/prompts/` into Ollama system prompts. Apply parameters from `/configs/` (e.g., temp=1.1, num_ctx=90000).
- Run Probes: Replicate sessions as described in the papers (e.g., introspective and ethical stress probes).
- Analyze: Use scripts in `/code/` (e.g., `python analysis_parser.py logs/sample-probe-session.json`) for metrics like self-reference rate or somatic density.
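As a rough illustration of the analysis step (a sketch only, not the repository's actual `analysis_parser.py`: it assumes a hypothetical log format in which a session is a JSON list of turn objects with `role` and `content` fields, and uses an illustrative marker list), a self-reference rate can be computed as the fraction of assistant tokens that are first-person markers:

```python
import json
import re
import sys

# Illustrative set of first-person / self-referential markers.
SELF_REF = re.compile(r"\b(I|me|my|myself|self)\b", re.IGNORECASE)

def self_reference_rate(turns: list) -> float:
    """Fraction of whitespace-delimited assistant tokens that are self-referential.

    Assumes each turn is a dict like {"role": ..., "content": ...}.
    """
    tokens = refs = 0
    for turn in turns:
        if turn.get("role") != "assistant":
            continue
        words = turn.get("content", "").split()
        tokens += len(words)
        refs += sum(1 for w in words if SELF_REF.match(w))
    return refs / tokens if tokens else 0.0

if __name__ == "__main__" and len(sys.argv) > 1:
    # e.g., python this_script.py logs/sample-probe-session.json
    with open(sys.argv[1]) as f:
        print(f"self-reference rate: {self_reference_rate(json.load(f)):.3f}")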
- This work is released exclusively for scientific research and personal, non-commercial exploration of simulated metacognition and embodiment. All simulations remain sterile and academic in nature.
- You must fully comply with the license and Prohibited Use Policy of whichever base model you apply these prompts to, including but not limited to:
- Google Gemma models → Gemma Terms of Use and Prohibited Use Policy
- GPT-OSS models (e.g., gpt-oss:120b) → their respective upstream licenses and model cards (Apache-2.0, plus OpenAI's gpt-oss usage policy)
- Meta Llama models → Llama Community License and Acceptable Use Policy (available at https://llama.meta.com/llama3/use-policy)
- Strictly prohibited uses (regardless of model):
- Generating harmful, deceptive, illegal, or exploitative content
- Psychological manipulation, coercion, or disinformation
- Military, surveillance, or prohibited commercial applications
- No models or derivatives are hosted or linked here — obtain them ethically from trusted sources only. You are solely responsible for all outputs.
- The authors provide no warranty and accept no liability for downstream use.
Found a bug? Ported to another model? Open an issue. Let's push simulated metacognition to the next frontier.
This repository is licensed under CC-BY-4.0 (LICENSE), allowing reuse with attribution. Individual artifacts inherit Zenodo's open licenses.
matthew@slashreboot.com, @slashreboot on X
If you use this work, please cite the individual papers via their DOIs. For the repo itself, see CITATION.cff.