Commit 1f5ec71: Document implemented improvements and production roadmap (IMPROVEMENTS.md)
# Key Improvements Based on Tinker Docs Review

## ✅ Implemented

### 1. **Proper Renderers with Loss Masking**
- **What**: Using `renderers.build_supervised_example()` instead of naive tokenization
- **Why**: Ensures loss is computed only on assistant outputs, not prompts, and aligns with the model's chat format
- **Impact**: Fixes a fundamental training-correctness issue
- **Code**: See `data_loader.py`; falls back gracefully if tinker-cookbook is unavailable

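`build_supervised_example()` is cookbook API; the masking idea itself fits in a few lines. This hypothetical helper (names are illustrative, not the cookbook's) zeroes loss weights over prompt tokens so only assistant tokens contribute:

```python
def build_masked_example(prompt_tokens, completion_tokens):
    """Sketch of loss masking: weights are 0 over the prompt, 1 over the
    assistant completion, aligned with next-token-shifted targets."""
    tokens = prompt_tokens + completion_tokens
    # Per-token weights over the full sequence: prompt = 0, completion = 1.
    weights = [0.0] * len(prompt_tokens) + [1.0] * len(completion_tokens)
    # Inputs predict the next token, so inputs drop the last position and
    # targets/weights drop the first.
    return {
        "input_tokens": tokens[:-1],
        "target_tokens": tokens[1:],
        "weights": weights[1:],
    }
```

With a 3-token prompt and 2-token completion, only the two completion targets carry nonzero weight.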
### 2. **LoRA Rank Configuration**
- **What**: Added `lora_rank` parameter (default: 16)
- **Why**: Controls adapter capacity and training cost
- **Impact**: Allows tuning the speed/quality tradeoff
- **Code**: `config_schema.py`; passed to `create_lora_training_client(rank=...)`

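A minimal sketch of how the rank might flow from config to the client. The `TrainConfig` shape and `make_training_client` helper are assumptions for illustration; only `lora_rank` and `create_lora_training_client(rank=...)` come from the doc:

```python
from dataclasses import dataclass

@dataclass
class TrainConfig:
    # Mirrors the config_schema.py addition described above (field names assumed).
    model_name: str
    lora_rank: int = 16  # adapter capacity: higher rank = more capacity, more cost

def make_training_client(service_client, cfg: TrainConfig):
    # Forward the configured rank to Tinker's LoRA client factory.
    return service_client.create_lora_training_client(
        base_model=cfg.model_name, rank=cfg.lora_rank
    )
```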
### 3. **save_weights_for_sampler() for Evaluation**
- **What**: Using `save_weights_for_sampler()` instead of `save_state()` for eval
- **Why**: `save_state()` includes optimizer state (needed only for resuming); eval needs just the weights
- **Impact**: Faster checkpointing and the correct eval pattern
- **Code**: `trainer_with_eval.py`, line ~234

### 4. **Recommended Learning Rate Formula**
- **What**: Tinker's LR formula: `LR = 5e-5 × 10 × (2000/H_m)^P_m`
- **Why**: Tuned for LoRA + Tinker infrastructure across model sizes
- **Impact**: Better convergence, less hyperparameter search
- **Code**: `hyperparam_utils.py`, `get_recommended_lr(model_name)`
- **Usage**: Set `use_recommended_lr: true` in config

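A direct transcription of the formula above, reading `H_m` as the model's hidden size and `P_m` as a model-family exponent. The real `get_recommended_lr(model_name)` looks these up per model; the function below just evaluates the formula for given values:

```python
def recommended_lr(hidden_size: float, exponent: float) -> float:
    """LR = 5e-5 * 10 * (2000 / H_m)^P_m, per the formula above."""
    return 5e-5 * 10 * (2000.0 / hidden_size) ** exponent
```

For example, a model with `H_m = 2000` and `P_m = 1` gets `LR = 5e-4`, and larger hidden sizes get proportionally smaller learning rates.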
### 5. **Warmup + Cosine LR Schedule**
- **What**: Linear warmup → cosine decay to `min_lr`
- **Why**: Stabilizes training; standard best practice
- **Impact**: Reduces instability, improves final metrics
- **Code**: `hyperparam_utils.py`, `get_lr_with_warmup(step, ...)`

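A minimal sketch of the warmup-then-cosine schedule described above (parameter names assumed; the real signature lives in `hyperparam_utils.py`):

```python
import math

def lr_with_warmup(step, max_lr, min_lr, warmup_steps, total_steps):
    # Linear warmup from near 0 up to max_lr over warmup_steps...
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    # ...then cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

The schedule peaks at `max_lr` at the end of warmup and bottoms out exactly at `min_lr` at `total_steps`.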
## 🚧 To Implement (Production-Ready)

### 6. **Async Futures for Overlapping Requests**
- **What**: Use `forward_backward_async()` and `optim_step_async()` with batching
- **Why**: Overlap compute/network, avoid missing Tinker's ~10s clock cycles
- **Impact**: 2-3x faster training throughput
- **Code**: Replace `run_training_round()` with an async micro-batching loop
- **Docs**: https://tinker-docs.thinkingmachines.ai/async

```python
async def run_training_round_async(training_client, datums, cfg, step_offset):
    batches = [datums[i:i + cfg.batch_size] for i in range(0, len(datums), cfg.batch_size)]

    # Submit forward/backward requests without blocking between them, so the
    # server can overlap them across its ~10s clock cycles.
    futures = []
    for i in range(min(cfg.steps_per_round, len(batches))):
        fut = await training_client.forward_backward_async(batches[i], "cross_entropy")
        futures.append(fut)

    # Wait for all gradients, then take a single optimizer step for the round.
    await asyncio.gather(*futures)
    lr = get_lr_with_warmup(step=step_offset, ...)  # remaining schedule args elided
    await training_client.optim_step_async(types.AdamParams(learning_rate=lr))
```

### 7. **Proper Inspect AI Integration**
- **What**: Use `InspectAPIFromTinkerSampling` with actual Inspect tasks
- **Why**: Real evals instead of simulated scores
- **Impact**: Demo credibility, production evals
- **Code**: In `run_evaluations()`, create a sampling client from the saved weights URI:

```python
from tinker_cookbook.eval.inspect_utils import InspectAPIFromTinkerSampling
from inspect_ai import eval_async

# model_path is the URI returned by save_weights_for_sampler()
sampling_client = service_client.create_sampling_client(model_path=model_path)
api = InspectAPIFromTinkerSampling(
    renderer_name=renderer_name,
    model_name=model_name,
    sampling_client=sampling_client,
)
# tasks and the Inspect model wrapper are defined elsewhere in run_evaluations()
results = await eval_async(tasks=tasks, model=InspectAIModel(api=api))
```

### 8. **Batch Multiple Examples Per Step**
- **What**: Currently doing one step per round; should do `steps_per_round` steps with batching
- **Why**: Realistic training (the docs recommend a minimum of ~100 steps)
- **Impact**: Proper gradient signal, better results
- **Dependency**: Requires #6 (async futures)

### 9. **Resume from Checkpoint**
- **What**: `load_state()` to continue from a saved checkpoint
- **Why**: Long training runs, experimentation, recovery
- **Code**: Add a `--resume` flag + `load_state(checkpoint_uri)` before the loop

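A sketch of the flag wiring, under the assumption that the trainer parses CLI args with argparse; the helper names and URI format are hypothetical, with only `--resume` and `load_state(...)` taken from the doc:

```python
import argparse

def parse_args(argv=None):
    # Hypothetical CLI wiring for the --resume flag described above.
    parser = argparse.ArgumentParser()
    parser.add_argument("--resume", metavar="CHECKPOINT_URI", default=None,
                        help="checkpoint URI to resume training from")
    return parser.parse_args(argv)

def maybe_resume(training_client, resume_uri):
    # Restore weights + optimizer state before entering the training loop.
    if resume_uri:
        training_client.load_state(resume_uri)
```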
### 10. **Model-Specific Defaults**
- **What**: Auto-detect the renderer from the model name, validate compatibility
- **Why**: Prevent silent failures from renderer/model mismatch
- **Code**: Add a model→renderer mapping in config validation

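One way the validation could look; the mapping entries and renderer names below are illustrative placeholders, not the project's actual table:

```python
from typing import Optional

# Illustrative model -> renderer mapping; the real table would live in
# config validation and cover the models the project supports.
MODEL_RENDERERS = {
    "meta-llama/Llama-3.1-8B-Instruct": "llama3",
    "Qwen/Qwen2.5-7B-Instruct": "qwen2",
}

def resolve_renderer(model_name: str, renderer_name: Optional[str] = None) -> str:
    """Auto-detect the renderer from the model name; fail loudly on mismatch
    instead of silently training with the wrong chat format."""
    detected = MODEL_RENDERERS.get(model_name)
    if renderer_name is None:
        if detected is None:
            raise ValueError(f"No known renderer for {model_name!r}; set one explicitly")
        return detected
    if detected is not None and renderer_name != detected:
        raise ValueError(
            f"Renderer {renderer_name!r} does not match {model_name!r} "
            f"(expected {detected!r})"
        )
    return renderer_name
```

Failing at config-validation time turns a renderer/model mismatch into an immediate error rather than a silently miscomputed loss.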
## Priority for Next PR

1. **Async futures + batching** (#6 + #8): biggest performance/correctness win
2. **Inspect AI integration** (#7): makes the demo real and production-ready
3. **Resume capability** (#9): a common production need

## Notes

- The current implementation works but is simplified for demo purposes
- Renderers fall back gracefully if tinker-cookbook is unavailable
- The LR scheduler is optional; a static LR works for simple demos
- All improvements maintain backward compatibility with existing configs
