Commit ac0071f
Dev (#154)
* Fix(analysis): fix analysis on single-card and multi-card setups
* Revert TransformerLens submodule to previous commit
* feat(generate): add override_dtype setting to control activation dtype in GenerateActivationsSettings
* feat(distributed): add get_process_group utility, trying to fix checkpoint saving during sweeps.
* Feat(examples): add training examples
* refactor: use TensorSpecs rather than logging method dispatch for logging with different SAE variants (#130)
* misc: cleanup circuit tracing mode and some distributed utils
* refactor: use TensorSpecs rather than logging method dispatch for logging with different SAE variants
* misc: clean up logging method dispatch
* misc: rename tensor_specs to specs
* style: simplify conditions on optional field
* misc(generate): override_dtype for GenerateActivationsSettings must default to None, otherwise pydantic complains about it
* fix: replace `.item()` with `item(...)` to ensure distributed consistency.
See: pytorch/pytorch#152406
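The idea behind this fix can be sketched in a few lines: calling `.item()` directly can desynchronize ranks when the value is a sharded tensor, so every scalar read is funneled through one helper that is safe to call on all ranks. The helper name `item(...)` follows the commit message; its body below is an illustrative assumption, not the repository's actual code.

```python
def item(value):
    """Return a Python scalar from `value` consistently on every rank."""
    if isinstance(value, (int, float)):
        return value  # plain numbers pass through unchanged
    # Distributed tensors (e.g. DTensor) must be materialized as a full
    # replicated tensor before the scalar is read, so that all ranks
    # observe the same value.
    if hasattr(value, "full_tensor"):
        value = value.full_tensor()
    return value.item()
```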
* refactor: use Metric classes for disentangled metric computation
* refactor: use Metric classes to run evaluation (#133)
* feat: support resuming wandb run from training checkpoint
- Add wandb_run_id and wandb_resume config options
- Save wandb run id when saving checkpoint
- Load trainer from checkpoint when from_pretrained_path is set
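A minimal sketch of how the two new config options could feed `wandb.init`; only the field names `wandb_run_id` and `wandb_resume` come from the commit message, while the class and helper names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WandbResumeSettings:
    wandb_run_id: Optional[str] = None  # run id saved in the checkpoint
    wandb_resume: Optional[str] = None  # wandb resume policy, e.g. "must"

def wandb_init_kwargs(settings: WandbResumeSettings) -> dict:
    """Build wandb.init keyword arguments for a fresh or resumed run."""
    if settings.wandb_run_id is None:
        return {}  # fresh run: let wandb generate a new id
    return {
        "id": settings.wandb_run_id,
        "resume": settings.wandb_resume or "allow",
    }
```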
* fix(activation): make mask/attention_mask on the correct device
* fix(activation): use local_map for mask computation on DTensor to ensure correct device placement
* fix(trainer): use ctx.get() for optional coefficients to prevent KeyError
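The pattern behind this fix is simply `dict.get` with a default; the helper name below is a hypothetical illustration, not the trainer's actual API.

```python
def get_coefficient(ctx: dict, name: str, default: float = 0.0) -> float:
    """Read an optional loss coefficient without risking a KeyError."""
    # dict.get returns the default when the key is absent, so SAE
    # variants that never register this coefficient fall back safely.
    return ctx.get(name, default)
```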
* feat(train): add checkpoint resume support for crosscoder, clt, lorsa and molt runners
* fix(trainer): correct token count calculation for 2D activation in LORSA training
* chore: remove default extra
* misc(evaluator): add type annotation
* docs: walkthrough (WIP)
* ci: install with all extras
* fix: type errors due to torch updates on local_map
* fix(TL): add support for whole qwen3 family & fix inconsistency in tie-word-embed
* feat: conversion methods between lm-saes and saelens
Co-authored-by: Guancheng Zhou <[email protected]>
* Fix(examples): fix the activation_factory settings of lorsa examples
* feat(autointerp): refactor to async & support lorsa
* feat(autointerp): better parallelization with async
* feat(database): show progress for database operations (add analysis & update feature)
* misc(ruff): fix ruff & typecheck errors
* feat(autointerp): update ui to support autointerp wo verification
* fix(format): fix pyright issues
* fix(misc): remove try-except logics for progress measure in autointerp
* feat(autointerp): support max suppressing logits in autointerp
* feat(autointerp): improve autointerp prompts and support lorsa autointerp with z pattern
* fix(misc): ruff for autointerp
* refactor: use tanstack start for frontend; make a more neuronpedia-like ui (#146)
* fix(database): deal with none value
* feat(ui): support paged queries of samples
* feat(ui): set scrollbar-gutter to ensure space reserved for scrollbar to prevent layout shifting
* style(ui): fix eslint & prettier
* feat(ui): interpretation with real data
* ci(ui): add eslint & prettier check
* chore: fix pre-commit for both python and typescript
* format(ui): fix eslint & prettier
* format(ui): adjust eslint rules
* fix: pin torch==2.8.0 for dtensor compatibility.
- Pin torch version to 2.8.0 to avoid dtensor-related errors in 2.9.0
- Remove unused d_model field from LanguageModelConfig
- Add GPU memory usage display in training progress bar
- Move batch refresh to end of training loop iteration
* feat(ui): dictionary page (WIP)
* feat(ui): dictionary page
* feat(ui): feature list in feature page (WIP)
* fix(ui): feature list loading previous page causes wrong scroll position
* fix(ui): reinitialize useFeatures hook when concerned feature index out of range
* fix(ui): fix feature list height
* fix(metric): support inconsistent batch size
* perf(ui): fetch sample range on demand
* feat(server): support preloading models/saes
* feat(ui): remove in card progress bar
* fix(ui): fix visible range comparison
* feat(ui): adjust accent color
* fix(optim): add DTensor support for SparseAdam, redistribute grad to match parameter's placements when grad is DTensor
* feat(circuit): Major revision. 1. Support circuit tracing with plt+lorsa and with plt only; wrap lists of plts into a Transcoder Set, following circuit-tracer. 2. Update QK tracing: we can now see feature-feature pairwise attribution; efficiency might require revisiting. 3. Refactor the attribution structure, breaking down several heavy files. Ready to be further improved, mainly by reducing the numerous if use_lorsa branches
* feat(backend): add DTensor support to TransformerLensLanguageModel
- Add device_mesh parameter to support distributed inference
- Implement forward() with local_map for DTensor inputs
- Add run_with_hooks() that wraps hooks to convert between DTensor and local tensor
- Update to_activations() to return DTensor when device_mesh is set
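The hook-wrapping step above can be sketched in pure Python: user hooks keep seeing ordinary local tensors while the model passes distributed tensors internally. Here `to_local`/`from_local` stand in for `DTensor.to_local()`/`DTensor.from_local()`; everything in this sketch is an illustrative assumption, not the repository's actual code.

```python
def wrap_hook(hook, to_local, from_local):
    """Wrap a user hook so it operates on local tensors even when the
    model internally passes distributed (DTensor-like) values."""
    def wrapped(value, *args, **kwargs):
        out = hook(to_local(value), *args, **kwargs)
        # A hook may return None to leave the activation unchanged.
        return None if out is None else from_local(out)
    return wrapped
```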
* fix(backend): convert placements list to tuple for DTensor comparison
* feat(backend): add run_with_cache and hooks context manager with DTensor support
* fix(backend): skip n_context sync when already provided in to_activations
* fix(attribution): fix missing gradient flow configuration for lorsa QKnorm (#149)
* fix(server): expose lru_cache ability from synchronized decorator
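A hedged sketch of the fix: a lock-based `synchronized` wrapper would otherwise hide `lru_cache`'s `cache_info`/`cache_clear`, so they are forwarded onto the wrapper explicitly. The decorator name comes from the commit message; the body is an assumption.

```python
import functools
import threading

def synchronized(fn):
    """Serialize calls to `fn` while re-exposing lru_cache's interface."""
    lock = threading.Lock()

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        with lock:  # serialize concurrent calls
            return fn(*args, **kwargs)

    # Forward the lru_cache attributes from the wrapped function, since
    # the lock wrapper would otherwise swallow them.
    for attr in ("cache_info", "cache_clear"):
        if hasattr(fn, attr):
            setattr(wrapper, attr, getattr(fn, attr))
    return wrapper
```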
* fix(lorsa): fix lorsa init
* fix(lorsa): fix set decoder norm for lorsa
* feat(ui): simply move the original circuit page to ui-ssr
* misc(ui): remove comments
* refactor(ui): split data and visual states; move up feature fetching logic
* refactor(ui): remove standalone CircuitVisualization component
* chore(dependencies): update torch and torch-npu versions to 2.9.0
* fix(lorsa): avoid triggering DTensor bug in torch==2.9.0
* feat(lorsa): Init lorsa with the active subspace of V.
* feat(metrics): add GradientNormMetric and extend Record with reduction modes
* feat(training): support training lorsa with varying lengths of training sequences. This makes the total number of training tokens inaccurate (#150)
* fix(attribution): fix missing gradient flow configuration for lorsa QK norm
* feat(config): remove all instances of use_batch_norm_mse. We do not want this from now on
* fix(activation): load saved mask and attention_mask
* feat(training): support training lorsa with varying lengths of training sequences. This makes the total number of training tokens inaccurate.
* misc(lorsa): use abstract_sae compute_loss; put l_rec.mean() after the loss dict
* fix(training): also transform batch['mask'] to Tensor from DTensor in… (#152)
* fix(training): also transform batch['mask'] to Tensor from DTensor in distributed scenarios
* feat(optim): add custom gradient norm computation and clipping for distributed training
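The gradient-norm step above can be illustrated with a small sketch: each rank contributes the squared norm of its local gradients, the sum (an `all_reduce` in real distributed code) yields the global norm, and from it a single clipping scale is derived that every rank applies. The function below is a hypothetical illustration, not the repository's implementation.

```python
import math

def global_clip_scale(local_sq_norms, max_norm, eps=1e-6):
    """Return (global_norm, scale) from per-shard squared grad norms."""
    total = math.sqrt(sum(local_sq_norms))  # stands in for all_reduce(SUM)
    # Clip only when the global norm exceeds max_norm; otherwise keep
    # gradients unchanged (scale of 1.0).
    scale = min(1.0, max_norm / (total + eps))
    return total, scale
```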
* fix: compute_loss DTensor loss shape
* fix(misc): we do not want to filter out EOS tokens; they might mark the end of chats and carry useful information
* feat: circuit tracing with backend interaction
* fix(server): synchronized decorator type issue
* fix(backend): use TokenizerFast for trace token origins
* fix(ui): better display dead feature
* fix(ui): correctly display truncated z pattern
* fix(ui): minor layout issues
* fix(attribution): add details to some comments
* perf(ui): better visual display for circuit (WIP)
* fix(trainer): remove assertion for clip_grad_norm in distributed training
* feat(transcoder): init transcoder with MLP.
* fix(tc): fix type problem
* feat(ui): hover & click nearest node
* feat(analyze): make FeatureAnalyzer aware of mask
* docs: update installation instructions and example README
* fix(runner): type mismatch
* chore: bump version to 2.0.0b4
---------
Co-authored-by: Junxuan Wang <[email protected]>
Co-authored-by: frankstein <[email protected]>
Co-authored-by: Jiaxing Wu <[email protected]>
Co-authored-by: Zhengfu He@SII <[email protected]>
Co-authored-by: Guancheng Zhou <[email protected]>
Co-authored-by: Guancheng Zhou <[email protected]>
Co-authored-by: StarDust73 <[email protected]>

File tree
159 files changed (+13359, -3449 lines)

- .github/workflows
- docs
- examples
- reproduce_evolution_of_concepts
- server
- src/lm_saes
- activation/processors
- analysis
- autointerp
- post_analysis
- backend
- circuit
- utils
- runners
- utils
- distributed
- tests/unit
- ui-ssr
- .vscode
- public
- src
- api
- components
- app
- circuits
- link-graph
- dictionary
- feature
- ui
- hooks
- integrations/tanstack-query
- lib
- routes
- types
- utils
- ui
- src
- components
- app
- feature
- types