Autonomous AI agent with neuro-chemical RL (dopamine/serotonin), dual-process reasoning, and meta-learning for AGI research.


Advanced-AI-Meta-Cognition-System

Research Note: This is an experimental simulation probing whether neuro-chemical modulation (Dopamine, Serotonin, Cortisol) can provide useful inductive biases for RL agents, or if it merely acts as stochastic noise.

Neuro-Chemical Reinforcement Learning Simulation

Python 3.10+ | License: MIT

This repository serves as a testbed for a "dual-process" cognitive architecture. It simulates internal hormonal dynamics to drive agent behavior, attempting to model intrinsic motivation and homeostasis rather than purely extrinsic reward maximization.


Hypothesis

Can an agent driven by internal homeostatic regulation (balancing "boredom" and "satisfaction") explore and learn in sparse-reward environments more effectively than standard epsilon-greedy or entropy-regularized baselines?

Current Status: The system demonstrates distinct behavioral modes (exploration vs. exploitation) driven by simulated hormones, but it remains unproven whether this complexity yields a statistically significant advantage over standard meta-learning approaches on general tasks.


Core Architecture (Experimental)

The system modularizes "cognition" into four interacting components. Note that the anthropomorphic naming conventions (Soul, Heart, etc.) are internal metaphors for the code modules, not claims of biological fidelity.

1. Action Decoder (action_decoder.py)

  • Function: Maps latent states to continuous action parameters.
  • Mechanism: Dual-head neural network outputting logits (discrete action type) and parameters (continuous x, y, scale).
  • Goal: Test end-to-end learning of dexterity without pre-defined action templates.
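The dual-head mapping can be sketched as follows. This is an illustrative standalone example, not the actual `action_decoder.py` API; the class name, layer sizes, and action-type count are assumptions:

```python
import torch
import torch.nn as nn

class DualHeadDecoder(nn.Module):
    """Maps a latent state to (discrete action logits, continuous parameters)."""
    def __init__(self, latent_dim=64, num_action_types=8):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU())
        self.type_head = nn.Linear(128, num_action_types)  # logits over action types
        self.param_head = nn.Linear(128, 3)                # continuous x, y, scale

    def forward(self, z):
        h = self.trunk(z)
        return self.type_head(h), self.param_head(h)

decoder = DualHeadDecoder()
logits, params = decoder(torch.randn(4, 64))
print(logits.shape, params.shape)  # torch.Size([4, 8]) torch.Size([4, 3])
```

At execution time, the discrete head would typically be sampled (or argmaxed) to pick the action type, while the parameter head supplies where and how to apply it.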

2. Graph Attention Manifold (manifold.py)

  • Function: Relational reasoning.
  • Mechanism: GAT (Graph Attention Network) processing object-oriented representations of the visual input.
  • Goal: Infer causal relationships between objects to inform decision-making.
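A minimal single-head version of the attention step, written in plain NumPy for clarity (names and dimensions are illustrative, not the real `manifold.py` implementation). The explicit pairwise loop makes the quadratic cost in object count visible:

```python
import numpy as np

def gat_attention(X, W, a):
    """Single-head graph attention over a fully connected object graph.

    X: (N, F) object features; W: (F, F') projection; a: (2*F',) attention vector.
    Returns updated features (N, F'). Cost is O(N^2) in the number of objects.
    """
    H = X @ W                                   # project node features
    N = H.shape[0]
    e = np.empty((N, N))
    for i in range(N):                          # pairwise logits: the O(N^2) part
        for j in range(N):
            s = a @ np.concatenate([H[i], H[j]])
            e[i, j] = s if s > 0 else 0.2 * s   # LeakyReLU
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)   # row-wise softmax
    return alpha @ H                            # attention-weighted aggregation

rng = np.random.default_rng(0)
out = gat_attention(rng.normal(size=(5, 4)), rng.normal(size=(4, 8)), rng.normal(size=16))
print(out.shape)  # (5, 8)
```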

3. Neuro-Chemical Engine (energy.py)

  • Function: Intrinsic reward shaping.
  • Mechanism: Simulated hormone levels acting as dynamic hyperparameters.
    • Dopamine: Correlates with prediction error/surprise. Modulates learning rate and exploration.
    • Serotonin: Correlates with stability/low-energy states. Modulates "satisfaction" (stopping criteria).
    • Cortisol: Correlates with stagnation or high entropy. Increases randomness/escape behavior.
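One way such modulation could look in code. This is a toy sketch with made-up update rules and coefficients, not the actual `energy.py` dynamics:

```python
class HormoneEngine:
    """Toy neuro-chemical state: three scalars used as dynamic hyperparameters.

    Update rules and coefficients here are illustrative only.
    """
    def __init__(self):
        self.dopamine = self.serotonin = self.cortisol = 0.5

    @staticmethod
    def _clip(x):
        return max(0.0, min(1.0, x))

    def update(self, prediction_error, energy, steps_since_improvement):
        # Dopamine tracks surprise; serotonin tracks low-energy stability;
        # cortisol accumulates under stagnation.
        self.dopamine = self._clip(0.9 * self.dopamine + 0.1 * prediction_error)
        self.serotonin = self._clip(0.9 * self.serotonin + 0.1 * (1.0 - energy))
        self.cortisol = self._clip(0.9 * self.cortisol + 0.01 * steps_since_improvement)

    def hyperparams(self, base_lr=1e-3):
        return {
            "lr": base_lr * (0.5 + self.dopamine),      # dopamine scales learning rate
            "explore_eps": 0.05 + 0.3 * self.cortisol,  # cortisol injects randomness
            "stop": self.serotonin > 0.9,               # serotonin gates stopping
        }

engine = HormoneEngine()
engine.update(prediction_error=0.8, energy=0.2, steps_since_improvement=0)
print(engine.hyperparams())
```

The point of this pattern is that exploration and plasticity become state-dependent rather than fixed schedules; whether that helps or merely adds noise is exactly what the hypothesis above questions.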

4. Elastic Weight Consolidation (automata.py)

  • Function: Mitigation of catastrophic forgetting.
  • Mechanism: Triggered when "Serotonin" thresholds are met (representing a stable, solved state), locking weights to preserve current capabilities.
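The consolidation penalty itself is the standard EWC quadratic term. A minimal sketch follows; the λ value, Fisher estimates, and trigger condition are illustrative, not taken from `automata.py`:

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=100.0):
    """Quadratic penalty anchoring weights to their consolidated values.

    L_total = L_task + (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2
    fisher approximates each parameter's importance for the consolidated task.
    """
    return 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)

theta_star = np.array([1.0, -2.0, 0.5])  # weights at consolidation time
fisher = np.array([4.0, 0.1, 1.0])       # per-parameter importance estimates
print(ewc_penalty(theta_star, theta_star, fisher))        # 0.0 at the anchor
print(ewc_penalty(theta_star + 0.1, theta_star, fisher))  # grows with drift
```

In this design, the "Serotonin" threshold would decide *when* to snapshot `theta_star` and estimate `fisher`, after which the penalty is added to the task loss.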

Experimental Observations

We evaluated configurations on a mini-ARC task suite.

Ablation Study (N=10 seeds):

| Config | Description | Energy Reduction | Notes |
| --- | --- | --- | --- |
| Baseline | Random policy | 0% | Reference |
| System-1 | Energy minimization | ~13% | Heuristic driven |
| System-1+2 | With planner rollouts | ~26% | Classical planning helps significantly |
| Full System | With meta-learner | ~30% | Marginal gain over planner, higher variance |

Observation: While the full system achieves the lowest energy state, the "Meta-Learning" component (System 3) adds significant complexity for a relatively small marginal gain (approx. 4 percentage points) over the standard Planner (System 2). This suggests the neuro-chemical modulation may be over-parameterized.


Limitations & Failure Modes

  1. Complexity Overhead: The interaction between three hormone signals creates a chaotic internal state space that is difficult to tune. The agent often oscillates between "panic" (high Cortisol) and "apathy" (low Dopamine) without finding a stable learning groove.
  2. Anthropomorphic Bias: The architecture assumes that biological metaphors (like "boredom") map cleanly to mathematical optimization. This assumption is strong and often leads to opaque failure modes where the agent "refuses" to act due to internal state rather than environmental constraints.
  3. Scalability: The Graph Attention Manifold (manifold.py) scales quadratically with the number of visual objects, making it slow for complex scenes.

Installation & Running

Prerequisites

pip install torch numpy pytest pytest-cov matplotlib

Execution

Run the basic simulation:

python main_system.py

Run deterministic experiment harness:

python experiments/run_experiment.py --steps 50 --seed 1

Citation

If you use this code for research into intrinsic motivation or bio-inspired RL, please cite:

@software{kwag2024advanced,
  author = {Kwag, Sung Hun},
  title = {Advanced AI Meta-Cognition System: Experimental Neuro-Chemical RL},
  year = {2024},
  url = {https://github.com/sunghunkwag/Advanced-AI-Meta-Cognition-System}
}

License

MIT License
