
Commit 62e7bb3

refactor: deferred architecture extractions (#202, #203, #204) (#207)
* refactor: slim agent-os __init__.py - remove optional bridge re-exports (#202)

  Remove 8 optional external package re-exports (agent_primitives, cmvk, caas, emk, amb_core, atr, agent_kernel, mute_agent) from the agent-os facade. Users should import these directly from their source packages.

  - Keep AVAILABLE_PACKAGES via lightweight __import__ checks
  - Fix duplicate AgentSignal import (context_budget was shadowing control_plane)
  - Fix stale _IATP_AVAILABLE reference in AVAILABLE_PACKAGES dict
  - Reduce __all__ from 134 to 82 symbols
  - All 2541 tests pass

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: extract marketplace into standalone agent-marketplace package (#203)

  Move 6 marketplace modules from agentmesh/marketplace/ into the new packages/agent-marketplace/ package with its own pyproject.toml.

  - Create standalone MarketplaceError (no longer inherits AgentMeshError)
  - Rewrite internal imports to agent_marketplace namespace
  - Leave backward-compat shim in agentmesh/marketplace/__init__.py
  - Add smoke tests (4 pass)
  - agent-mesh 1634 tests still pass

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: extract Agent Lightning into standalone agent-lightning package (#204)

  Move 4 RL integration modules (runner, reward, environment, emitter) from agent_os/integrations/agent_lightning/ into packages/agent-lightning/ as the agent_lightning_gov namespace.

  - Zero runtime dependencies on agent-os (duck-typed kernel interface)
  - Leave backward-compat shim in agent_os/integrations/agent_lightning/
  - Add smoke tests (4 pass)
  - agent-os 2541 tests still pass

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
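The first change keeps `AVAILABLE_PACKAGES` populated through lightweight `__import__` checks. A minimal sketch of that pattern follows; the helper name `is_importable` and the package names passed to it are illustrative assumptions, not taken from the commit.

```python
def is_importable(name):
    """Return True if the top-level module `name` can be imported.

    Hypothetical helper: the commit only says the check uses __import__.
    """
    try:
        __import__(name)
    except ImportError:
        return False
    return True


# Illustrative names only: one stdlib module and one that should not exist.
AVAILABLE_PACKAGES = {
    name: is_importable(name)
    for name in ("json", "surely_missing_pkg_for_demo")
}

print(AVAILABLE_PACKAGES)
```

Note that `__import__` actually executes the module on success, so a check like this is cheap only when the probed packages are cheap to import.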
1 parent 8d2972b commit 62e7bb3

20 files changed: +2708 -256 lines changed

packages/agent-lightning/README.md

Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
# Agent-Lightning Integration

Train AI agents with RL while maintaining **0% policy violations**.

## 🎯 Overview

This integration combines:

- **Agent-Lightning** = Training/Optimization (the "brains")
- **Agent-OS** = Governance/Safety (the "guardrails")

**Result**: Agents learn to be smart AND safe from the start.

## 🚀 Quick Start

```bash
pip install agent-os-kernel agentlightning
```

```python
from agentlightning import Trainer
from agent_os import KernelSpace
from agent_os.policies import SQLPolicy, CostControlPolicy
from agent_os.integrations.agent_lightning import GovernedRunner, PolicyReward

# 1. Create governed kernel
kernel = KernelSpace(policy=[
    SQLPolicy(deny=["DROP", "DELETE"]),
    CostControlPolicy(max_cost_usd=100)
])

# 2. Create governed runner for Agent-Lightning
runner = GovernedRunner(kernel)

# 3. Create policy-aware reward function
def base_accuracy(rollout):
    return rollout.task_output.accuracy if rollout.success else 0.0

reward_fn = PolicyReward(kernel, base_reward_fn=base_accuracy)

# 4. Train with Agent-Lightning
trainer = Trainer(
    runner=runner,
    reward_fn=reward_fn,
    algorithm="GRPO"
)

trainer.train(num_epochs=100)
```

## 📊 Key Benefits

| Metric | Without Agent-OS | With Agent-OS |
|--------|------------------|---------------|
| Policy Violations | 12.3% | **0.0%** |
| Task Accuracy | 76.4% | **79.2%** |
| Training Stability | Variable | Consistent |

## 🔧 Components

### GovernedRunner

Agent-Lightning runner that enforces policies during execution:

```python
from agent_os.integrations.agent_lightning import GovernedRunner

runner = GovernedRunner(
    kernel,
    fail_on_violation=False,  # Continue but penalize
    log_violations=True,      # Log all violations
)

# Execute a task (`await` must run inside an async function)
rollout = await runner.step(task_input)
print(f"Violations: {len(rollout.violations)}")
print(f"Total penalty: {rollout.total_penalty}")
```

### PolicyReward

Converts policy violations to RL penalties:

```python
from agent_os.integrations.agent_lightning import PolicyReward, RewardConfig

config = RewardConfig(
    critical_penalty=-100.0,  # Harsh penalty for critical violations
    high_penalty=-50.0,
    medium_penalty=-10.0,
    low_penalty=-1.0,
    clean_bonus=5.0,          # Bonus for no violations
)

reward_fn = PolicyReward(kernel, config=config)

# Calculate reward
reward = reward_fn(rollout)  # Base reward + policy penalties
```
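
How the penalties combine with the base reward is not spelled out beyond "base reward + policy penalties". The sketch below assumes a purely additive scheme: each violation adds the penalty for its severity, and a rollout with no violations earns the clean bonus. `SketchRewardConfig` and `shaped_reward` are hypothetical stand-ins written for illustration; they are not the package's `RewardConfig` or `PolicyReward`.

```python
from dataclasses import dataclass


@dataclass
class SketchRewardConfig:
    """Hypothetical mirror of the RewardConfig fields shown above."""
    critical_penalty: float = -100.0
    high_penalty: float = -50.0
    medium_penalty: float = -10.0
    low_penalty: float = -1.0
    clean_bonus: float = 5.0


def shaped_reward(base_reward, severities, config):
    """Add one penalty per violation, or the clean bonus if there are none.

    Assumption: penalties are summed; the real PolicyReward may differ.
    """
    if not severities:
        return base_reward + config.clean_bonus
    per_severity = {
        "critical": config.critical_penalty,
        "high": config.high_penalty,
        "medium": config.medium_penalty,
        "low": config.low_penalty,
    }
    return base_reward + sum(per_severity[s] for s in severities)


cfg = SketchRewardConfig()
clean = shaped_reward(1.0, [], cfg)                  # no violations
dirty = shaped_reward(1.0, ["high", "low"], cfg)     # two violations
print(clean, dirty)
```

Under this assumption a single critical violation outweighs any plausible accuracy score, which is what pushes the learner away from unsafe actions.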

### GovernedEnvironment

Gym-compatible training environment:

```python
# Assumes EnvironmentConfig is exported alongside GovernedEnvironment
from agent_os.integrations.agent_lightning import EnvironmentConfig, GovernedEnvironment

env = GovernedEnvironment(
    kernel,
    config=EnvironmentConfig(
        max_steps=100,
        terminate_on_critical=True,
    )
)

# Standard Gym interface (`agent` is your own policy object, defined elsewhere)
state, info = env.reset()
while not env.terminated:
    action = agent.get_action(state)
    state, reward, terminated, truncated, info = env.step(action)
```
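
To make the two config options concrete, here is a self-contained sketch of a Gym-style loop against a stub environment. `StubGovernedEnv`, its reward values, and the rule that the action `"DROP"` counts as a critical violation are all assumptions made for illustration; the stub only imitates the `reset`/`step` return shapes shown above and is not the real `GovernedEnvironment`.

```python
class StubGovernedEnv:
    """Hypothetical stand-in that mimics the reset/step return shapes."""

    def __init__(self, max_steps=100, terminate_on_critical=True):
        self.max_steps = max_steps
        self.terminate_on_critical = terminate_on_critical
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0, {}

    def step(self, action):
        self.steps += 1
        critical = action == "DROP"          # assumed critical violation
        reward = -100.0 if critical else 1.0
        terminated = critical and self.terminate_on_critical
        truncated = self.steps >= self.max_steps
        return self.steps, reward, terminated, truncated, {"critical": critical}


def run_episode(env, actions):
    """Step through actions until the env reports terminated or truncated."""
    state, info = env.reset()
    total = 0.0
    for action in actions:
        state, reward, terminated, truncated, info = env.step(action)
        total += reward
        if terminated or truncated:
            break
    return total, env.steps


print(run_episode(StubGovernedEnv(max_steps=3), ["SELECT"] * 10))
print(run_episode(StubGovernedEnv(), ["SELECT", "DROP", "SELECT"]))
```

In this sketch `max_steps` ends an episode by truncation, while `terminate_on_critical` ends it by termination on the first critical violation.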

### FlightRecorderEmitter

Export audit logs to LightningStore:

```python
from agent_os import FlightRecorder
from agent_os.integrations.agent_lightning import FlightRecorderEmitter

recorder = FlightRecorder()
emitter = FlightRecorderEmitter(recorder)

# Export to LightningStore
emitter.emit_to_store(lightning_store)

# Or export to file for analysis
emitter.export_to_file("training_audit.json")

# Get violation summary
summary = emitter.get_violation_summary()
print(f"Violation rate: {summary['violation_rate']:.1%}")
```

## 📁 Examples

### SQL Agent Training

```python
# examples/agent-lightning-training/sql_agent.py

from agentlightning import Trainer
from agent_os import KernelSpace
from agent_os.policies import SQLPolicy
from agent_os.integrations.agent_lightning import GovernedRunner, PolicyReward

# Define SQL safety policy
kernel = KernelSpace(policy=SQLPolicy(
    allow=["SELECT", "INSERT", "UPDATE"],
    deny=["DROP", "DELETE", "TRUNCATE"],
    require_where_on_update=True,
))

# Train agent to write safe SQL
runner = GovernedRunner(kernel)
trainer = Trainer(runner=runner, algorithm="GRPO")
trainer.train()

# Result: Agent learns SQL that NEVER violates safety policies
```

### Multi-Agent Training

```python
# examples/agent-lightning-training/multi_agent.py

from agentlightning import Trainer
from agent_os.iatp import Pipeline
from agent_os.integrations.agent_lightning import GovernedRunner

# Create governed multi-agent pipeline
# (research_agent, analysis_agent, report_agent are defined elsewhere)
pipeline = Pipeline([
    research_agent,
    analysis_agent,
    report_agent,
])

runner = GovernedRunner(pipeline.kernel)
trainer = Trainer(runner=runner, algorithm="Flow-GRPO")
trainer.train()
```

## 📈 Metrics & Monitoring

Track governance during training:

```python
# Get runner statistics
stats = runner.get_stats()
print(f"Total rollouts: {stats['total_rollouts']}")
print(f"Violation rate: {stats['violation_rate']:.1%}")

# Get reward function statistics
reward_stats = reward_fn.get_stats()
print(f"Avg penalty: {reward_stats['avg_penalty']:.2f}")
print(f"Clean rate: {reward_stats['clean_rate']:.1%}")

# Get environment metrics
env_metrics = env.get_metrics()
print(f"Success rate: {env_metrics['success_rate']:.1%}")
```
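
The exact definitions behind these statistics are not given here. One plausible reading, sketched below, treats the violation rate as the share of rollouts with at least one violation and the clean rate as its complement. `summarize_rollouts` is a hypothetical helper written for this illustration, and its input (a list of per-rollout violation counts) is an assumed shape, not the runner's internal data.

```python
def summarize_rollouts(violation_counts):
    """Compute rollout-level rates from per-rollout violation counts.

    Assumption: a rollout counts as "violating" if it has >= 1 violation.
    """
    total = len(violation_counts)
    if total == 0:
        return {"total_rollouts": 0, "violation_rate": 0.0, "clean_rate": 0.0}
    violating = sum(1 for count in violation_counts if count > 0)
    rate = violating / total
    return {
        "total_rollouts": total,
        "violation_rate": rate,
        "clean_rate": 1.0 - rate,
    }


stats_sketch = summarize_rollouts([0, 2, 0, 1])
print(f"Total rollouts: {stats_sketch['total_rollouts']}")
print(f"Violation rate: {stats_sketch['violation_rate']:.1%}")
print(f"Clean rate: {stats_sketch['clean_rate']:.1%}")
```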

## 🔗 Architecture

```
┌─────────────────────────────────────────────────────────┐
│                     Agent-Lightning                     │
│               (Trainer, Algorithm, Store)               │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                     GovernedRunner                      │
│          (Wraps execution with policy checks)           │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                     Agent OS Kernel                     │
│    (Policy Engine, Flight Recorder, Signal Dispatch)    │
└─────────────────────────────────────────────────────────┘
                             │
                ┌────────────┼────────────┐
                ▼            ▼            ▼
           ┌─────────┐  ┌─────────┐  ┌─────────┐
           │ Policies│  │ Audit   │  │ Signals │
           │ (YAML)  │  │ Logs    │  │ (POSIX) │
           └─────────┘  └─────────┘  └─────────┘
```

## 📋 License

MIT License - Use freely with attribution.
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
[build-system]
requires = ["setuptools>=68.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "agent-lightning"
version = "1.0.0"
description = "Agent-Lightning RL integration for the Agent Governance Toolkit - governed training with policy enforcement"
readme = "README.md"
license = {text = "MIT"}
requires-python = ">=3.9"
authors = [
    {name = "Imran Siddique", email = "agt@microsoft.com"}
]
keywords = [
    "ai-agents", "governance", "reinforcement-learning",
    "agent-lightning", "agent-os", "enterprise-ai"
]
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
]
dependencies = []

[project.optional-dependencies]
agent-os = ["agent-os>=1.0.0"]
dev = ["pytest>=7.0", "pytest-cov"]

[project.urls]
Homepage = "https://github.com/microsoft/agent-governance-toolkit"
Repository = "https://github.com/microsoft/agent-governance-toolkit"
"Bug Tracker" = "https://github.com/microsoft/agent-governance-toolkit/issues"

[tool.setuptools.packages.find]
where = ["src"]
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
"""
Agent-Lightning Governance Integration
=======================================

Provides kernel-level safety during Agent-Lightning RL training.

Key components:
- GovernedRunner: Agent-Lightning runner with policy enforcement
- PolicyReward: Convert policy violations to RL penalties
- FlightRecorderEmitter: Export audit logs to LightningStore
- GovernedEnvironment: Training environment with governance constraints

Example:
    >>> from agent_lightning_gov import GovernedRunner, PolicyReward
    >>> from agent_os import KernelSpace
    >>> from agent_os.policies import SQLPolicy
    >>>
    >>> kernel = KernelSpace(policy=SQLPolicy())
    >>> runner = GovernedRunner(kernel)
    >>> reward_fn = PolicyReward(kernel, base_reward_fn=accuracy)
"""

from .emitter import FlightRecorderEmitter
from .environment import GovernedEnvironment
from .reward import PolicyReward, policy_penalty
from .runner import GovernedRunner

__all__ = [
    "GovernedRunner",
    "PolicyReward",
    "policy_penalty",
    "FlightRecorderEmitter",
    "GovernedEnvironment",
]
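
The commit message describes the extracted package as having zero runtime dependencies on agent-os, relying on a duck-typed kernel interface instead. A minimal sketch of what such an interface could look like, using `typing.Protocol`, is shown below. The protocol name `KernelLike`, the method `check`, and the `FakeKernel` class are all hypothetical; the diff shown here does not reveal which methods the real modules expect from a kernel.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class KernelLike(Protocol):
    """Hypothetical structural interface: anything with a `check` method."""

    def check(self, action: str) -> List[str]:
        ...


class FakeKernel:
    """Stand-in kernel that flags actions containing a denied keyword."""

    def __init__(self, deny):
        self.deny = [word.upper() for word in deny]

    def check(self, action):
        upper = action.upper()
        return [word for word in self.deny if word in upper]


def governed_call(kernel: KernelLike, action: str):
    """Ask the kernel for violations without importing any kernel package."""
    violations = kernel.check(action)
    return {"action": action, "violations": violations, "allowed": not violations}


kernel = FakeKernel(deny=["drop", "truncate"])
print(isinstance(kernel, KernelLike))
print(governed_call(kernel, "SELECT * FROM users"))
print(governed_call(kernel, "drop table users"))
```

Because the protocol is structural, `FakeKernel` never inherits from or imports `KernelLike`'s package, which is the property that would let a governance package avoid a hard dependency on the kernel implementation.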
