A collection of experiments exploring memory, catastrophic forgetting, and temporal modularity in neural networks.
Author: Vitali Sialedchyk
Modern AI systems exist in "instantaneous time", optimizing only for the current data batch. This project implements the Stability-First hypothesis:
Time in an AI system is defined by structural inertia. By treating weight stability as "System Time", we can prevent catastrophic forgetting and achieve modular, reversible learning.
| # | Project | Focus | Key Insight | Status |
|---|---|---|---|---|
| 01 | Active Sleep (MNIST) | Generative Replay | Memory can be restored using VAE "dreams" without storing real data. | Complete |
| 02 | Temporal LoRA (GPT-2) | LLM Scaling | Main success: the "Time Mixer" router dynamically switches between knowledge epochs (Shakespeare vs Python) with 100% accuracy. | Hero |
| 03 | Stability-First Basic | Foundation | Preventing forgetting by protecting the backbone while maintaining interface plasticity. | Complete |
| 04 | Reversibility | Lazarus Effect | Memory is often latent, not erased. We recovered "forgotten" tasks from 0% to 94.65% accuracy. | Complete |
| 05 | Full Suite | Benchmarking | Comparative analysis of 5 strategies (Fractal Time, Adaptive Pain, Dream Replay). | Complete |
| 06 | Subjective Time | Metacognition | Novel: a system with a "Critic" that automatically regulates its plasticity based on "surprise". | Complete |
| 07 | Stability-First (CIFAR-10) | Lazarus Project | Breakthrough: data-free model recovery (93.9% recovery after damage, 85.3% after 80% pruning). | New |
| 08 | Stability-First (ImageNet) | Large-Scale | Testing Stability-First on ImageNet/CIFAR-100 with a ResNet backbone. | New |
| 10 | Recursive Time Depth | Subjective Time | Novel: subjective "time" measured by the depth of stable recursive transformations. p90/p99 percentile-based stopping, 5-7x speedup, CKA ~0.98 with growing amplitude. | Ready |
| 11 | Temporal LoRA (Large Models) | LLM Scaling | Confirmed on 7B: all Temporal LoRA theories validated on Mistral-7B-Instruct. Hysteresis (switch lag: 9 tokens), Deep Crystallization (r = 0.8644), 100% router accuracy. Results match and strengthen the GPT-2 findings. | Complete |
Revolutionary discovery: Neural networks can recover from damage without training data using "Architectural Immunity".
- V-Shape Recovery: Restored 93.9% of accuracy lost to noise damage using only random noise inputs.
- Surgical Pruning: Recovered 85.3% of accuracy lost after removing 80% of weights (5× compression).
- Frozen Mask > Regrowth: We proved that maintaining the "skeleton" (sparse topology) is more effective than trying to regrow connections with noise.
- Zero Data: No original images were used. The model uses its own structure as a filter to reject chaos.
Full documentation & Graphs: 07-stability-first-cifar10/docs/LAZARUS_FINAL_MANIFESTO.md
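The repository's exact recovery objective lives in the Lazarus docs; as a hedged illustration, one way to sketch data-free recovery in PyTorch is to treat frozen BatchNorm running statistics as the network's structural memory: feed pure noise and nudge damaged weights until batch statistics match the stored ones again. Everything below (the objective, the toy model, the hyperparameters) is an illustrative assumption, not the Lazarus Protocol itself.

```python
import torch
import torch.nn as nn

def bn_stability_loss(model: nn.Module, noise: torch.Tensor) -> torch.Tensor:
    """Penalize drift of batch statistics away from stored running stats.

    Hypothetical recovery objective: no real images are used, only random
    noise; the frozen running statistics act as the model's memory of what
    "healthy" activations looked like.
    """
    losses, hooks = [], []

    def make_hook(bn):
        def hook(module, inp, out):
            x = inp[0]  # pre-BN activations
            dims = [0, 2, 3] if x.dim() == 4 else [0]
            mean = x.mean(dim=dims)
            var = x.var(dim=dims, unbiased=False)
            losses.append(((mean - bn.running_mean) ** 2).mean()
                          + ((var - bn.running_var) ** 2).mean())
        return hook

    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(make_hook(m)))
    model(noise)
    for h in hooks:
        h.remove()
    return torch.stack(losses).sum()

torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
model.train()
with torch.no_grad():                      # populate running stats: "healthy" reference
    for _ in range(10):
        model(torch.randn(256, 3, 8, 8))
model.eval()                               # freeze running stats during recovery
with torch.no_grad():                      # simulate weight-noise damage
    model[0].weight.add_(0.5 * torch.randn_like(model[0].weight))

noise = torch.randn(64, 3, 8, 8)           # pure noise, no real data
opt = torch.optim.Adam(model[0].parameters(), lr=1e-2)
start = bn_stability_loss(model, noise).item()
for _ in range(100):
    opt.zero_grad()
    loss = bn_stability_loss(model, noise)
    loss.backward()
    opt.step()
final = bn_stability_loss(model, noise).item()
print(f"stability loss: {start:.4f} -> {final:.4f}")
```

On this toy model the stability loss falls well below its post-damage value, mirroring (in miniature) the V-shape recovery pattern.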
V-shape recovery pattern for weight noise damage
Pruning curve comparison: Frozen Mask vs Regrow
Novel contribution: Subjective "time" in neural networks measured by depth of stable recursive transformations.
- p90/p99 Percentile-Based Stopping: Convergence detected via internal activation stability (percentiles of relative change norm) rather than output entropy.
- Attractor-Entry Effect: Recursion drives activations into a stable representational regime without additional training data.
- Time as Order Parameter: Recursion depth required for stability provides an operational internal "time" measure; high CKA (~0.98) supports stabilization consistent with attractor-like dynamics.
- Efficiency: 5-7x speedup over self-consistency at comparable compute budget.
- Condensation Without Degradation: CKA ~0.98 with growing amplitude (||h_t||: 1322 → 12291) indicates a stable representational regime (non-collapse stability).
Full documentation: 10-recursive-time-depth/README.md
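A minimal sketch of the percentile-based stopping rule, assuming "relative change" means the per-unit relative difference between consecutive recursive states. The `recursive_depth` helper, the toy contraction map, and the thresholds are all hypothetical, not the code in `recursive_time_depth.py`:

```python
import numpy as np

def recursive_depth(f, h0, p=(90, 99), tol=(1e-3, 1e-2), max_steps=200):
    """Iterate h_{t+1} = f(h_t); stop when the p90/p99 percentiles of the
    per-unit relative change fall below tolerance.

    Returns (final state, depth reached, list of (p90, p99) per step).
    The depth is the operational internal "time" measure.
    """
    h = np.asarray(h0, dtype=float)
    history = []
    for t in range(1, max_steps + 1):
        h_next = f(h)
        rel = np.abs(h_next - h) / (np.abs(h) + 1e-8)   # per-unit relative change
        q90, q99 = np.percentile(rel, p[0]), np.percentile(rel, p[1])
        history.append((q90, q99))
        h = h_next
        if q90 < tol[0] and q99 < tol[1]:               # internal stability reached
            return h, t, history
    return h, max_steps, history

# Toy contraction map: guaranteed to settle into an attractor, so the
# measured "time" (depth) is finite.
rng = np.random.default_rng(0)
W = 0.25 * rng.standard_normal((16, 16)) / np.sqrt(16)
b = rng.standard_normal(16)
f = lambda h: np.tanh(W @ h + b)
h_star, depth, hist = recursive_depth(f, rng.standard_normal(16))
print("converged at depth:", depth)
```

Note the rule never inspects outputs or entropy: convergence is read off the internal activations alone, which is what makes it usable as an "order parameter".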
If you want to run just one experiment, choose Temporal LoRA. It demonstrates dynamic context switching in GPT-2.
```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Run the GPT-2 experiment
cd 02-temporal-lora-gpt2
python temporal_lora.py
```

Watch as the model automatically learns to route "To code or not to code" to the Shakespeare adapter and "import torch" to the Python adapter.
We proved that even when model accuracy on Task A drops to 0.00% after training on Task B, knowledge remains encoded in the backbone.
Recovery: 94.65% accuracy recovered with just 50 examples.
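This latent-memory effect can be illustrated with a toy reproduction: drive Task A accuracy toward zero by training on conflicting labels, then recover it by refitting only the head on 50 examples. The model, tasks, and schedule below are assumptions for illustration, not Experiment 04's actual setup:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
w_true = torch.randn(10)

def task_a(n):  # Task A: a linearly separable toy labeling
    x = torch.randn(n, 10)
    return x, (x @ w_true > 0).long()

backbone = nn.Sequential(nn.Linear(10, 32), nn.ReLU())
head = nn.Linear(32, 2)
model = nn.Sequential(backbone, head)

def fit(params, data, steps=300):
    opt = torch.optim.Adam(params, lr=1e-2)
    x, y = data
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()

def acc(data):
    x, y = data
    with torch.no_grad():
        return (model(x).argmax(1) == y).float().mean().item()

test_a = task_a(1000)
fit(model.parameters(), task_a(1000))      # learn Task A
xb, yb = task_a(1000)
fit(model.parameters(), (xb, 1 - yb))      # "Task B": conflicting labels overwrite A
acc_forgotten = acc(test_a)                # collapses toward 0%
fit(head.parameters(), task_a(50))         # recovery: update the head only, 50 examples
acc_recovered = acc(test_a)
print(f"after forgetting: {acc_forgotten:.2%}, after recovery: {acc_recovered:.2%}")
```

The backbone is never touched during recovery, so the large accuracy jump can only come from knowledge still encoded in its features.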
In our Temporal LoRA experiment, the gating network successfully learned to distinguish semantic epochs.
Router accuracy: 100.0% after contrastive calibration.
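A sketch of how such a gating network can route inputs to epoch-specific LoRA adapters. The `TemporalLoRALayer` class, its shapes, and the synthetic two-cluster "epochs" below are hypothetical stand-ins, not the repo's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalLoRALayer(nn.Module):
    """Base linear layer plus per-epoch low-rank adapters and a gating
    router ("Time Mixer") that picks the adapter from the input itself."""
    def __init__(self, d, rank=4, n_epochs=2):
        super().__init__()
        self.base = nn.Linear(d, d)
        self.A = nn.Parameter(torch.randn(n_epochs, d, rank) * 0.02)
        self.B = nn.Parameter(torch.zeros(n_epochs, rank, d))
        self.router = nn.Linear(d, n_epochs)

    def forward(self, x):
        gate = torch.softmax(self.router(x), dim=-1)           # (batch, n_epochs)
        delta = torch.einsum('bd,edr,erk->bek', x, self.A, self.B)
        return self.base(x) + torch.einsum('be,bek->bk', gate, delta), gate

# Two synthetic "epochs": well-separated input clusters
# (stand-ins for Shakespeare vs Python text embeddings).
torch.manual_seed(0)
d = 16
layer = TemporalLoRALayer(d)
x = torch.cat([torch.randn(32, d) + 3.0, torch.randn(32, d) - 3.0])
y = torch.tensor([0] * 32 + [1] * 32)

# Train only the router to identify each input's epoch.
opt = torch.optim.Adam(layer.router.parameters(), lr=1e-1)
for _ in range(100):
    opt.zero_grad()
    F.cross_entropy(layer.router(x), y).backward()
    opt.step()

with torch.no_grad():
    _, gate = layer(x)
router_acc = (gate.argmax(1) == y).float().mean().item()
print(f"router accuracy: {router_acc:.0%}")
```

Because the epochs occupy distinct regions of input space, even a linear router separates them cleanly, which is the intuition behind the 100% routing result.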
We demonstrated that Frozen Mask stability optimization allows for massive compression without retraining.
Result: +1.62% accuracy gain on an 80% pruned model using the Lazarus Protocol.
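The Frozen Mask rule can be sketched as: compute a magnitude mask once, then re-apply it after every optimizer step so pruned connections never regrow. The layer, objective, and hyperparameters below are illustrative assumptions, not the experiment's configuration:

```python
import torch
import torch.nn as nn

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask keeping the largest-magnitude (1 - sparsity) weights."""
    k = int(weight.numel() * sparsity)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

torch.manual_seed(0)
layer = nn.Linear(64, 64)
mask = magnitude_prune(layer.weight.data, sparsity=0.8)   # remove 80% of weights
with torch.no_grad():
    layer.weight.mul_(mask)                               # initial surgical cut

opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
for _ in range(10):
    x = torch.randn(8, 64)
    loss = layer(x).pow(2).mean()     # placeholder objective for the sketch
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        layer.weight.mul_(mask)       # frozen mask: the sparse topology never changes

print(f"mask density: {mask.mean().item():.2f}")
```

Keeping the mask frozen preserves the "skeleton"; the contrast experiment (Regrow) drops this final `mul_` and lets gradients repopulate pruned positions.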
```
D:\new\
├── README.md                          # This file
├── requirements.txt                   # Common dependencies
│
├── 07-stability-first-cifar10/        # The Lazarus Project
│   ├── experiments/
│   │   ├── noise/
│   │   │   ├── experiment_cifar10.py                    # V-Shape Recovery
│   │   │   ├── experiment_analysis.py                   # Recovery Curve Analysis
│   │   │   └── experiment_statistical_significance.py
│   │   └── pruning/
│   │       ├── experiment_pruning.py                    # Pruning Recovery
│   │       └── experiment_pruning_curve.py              # Pruning Curve
│   ├── docs/
│   │   ├── LAZARUS_FINAL_MANIFESTO.md # Full Scientific Report
│   │   ├── LAZARUS_MANIFESTO.md       # Complete Documentation
│   │   └── RESULTS_VISUALIZATION.md   # Visualizations
│   ├── results/
│   │   ├── lazarus_recovery_curve.png       # Visual Proof
│   │   └── pruning_curve_comparison.png     # Frozen vs Regrow Chart
│   └── README.md
│
├── 02-temporal-lora-gpt2/             # Temporal LoRA (GPT-2)
│   ├── temporal_lora.py
│   └── README.md
│
├── 11-temporal-lora-large-model/      # Temporal LoRA (Mistral-7B)
│   ├── run_full_suite.py
│   ├── temporal_lora.py
│   ├── test_hysteresis.py
│   ├── test_fatigue.py
│   ├── RESULTS.md
│   └── results/
│
├── 06-subjective-time-critic/         # Metacognition
│   ├── demo_6_subjective_time.py
│   └── README.md
│
├── 10-recursive-time-depth/           # Recursive Time Depth
│   ├── recursive_time_depth.py        # Main experiment
│   ├── strict_validation_tests.py     # 5 validation tests
│   ├── TEST_RESULTS_FINAL.md          # Final results
│   └── README.md
│
└── docs/                              # Documentation
    └── RESULTS_SUMMARY.md             # Final report
```
- `num_workers=0, pin_memory=False` in DataLoader
- Unicode symbols (Δ, λ) replaced with ASCII
- All scripts have an `if __name__ == "__main__"` guard
Dependencies: `torch`, `torchvision`, `numpy`, `transformers` (for project 2), `matplotlib`.
If you find this research useful, please use the following citation:
Published Paper:
```bibtex
@misc{sialedchyk2026stability,
  author    = {Sialedchyk, Vitali},
  title     = {Stability-First AI: Completed Experimental Studies and the Physics of Learning Time},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18148080},
  url       = {https://doi.org/10.5281/zenodo.18148080}
}
```

Repository:
```bibtex
@misc{stability_first_ai,
  author       = {Vitali Sialedchyk},
  title        = {Stability-First AI: Memory and Recursive Stability as System Time},
  year         = {2026},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/vitali-sialedchyk/stability-first-ai}}
}
```

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
- Free for: academic research, education, personal testing, and non-profit use.
- Not allowed: commercial products, paid services, or corporate R&D without a separate agreement.
We offer commercial licensing options including support and architectural consulting.
Contact: vitali@agdgroup.pl or via GitHub Issues.
See the LICENSE file for full terms and conditions.
Last updated: January 2026