Commit 3042d88

add referencce to code path
1 parent 41fae95 commit 3042d88

1 file changed (+19, -7 lines changed)

_posts/2025-09-12-debugging-numeric-comparisons-llms.md

Lines changed: 19 additions & 7 deletions
@@ -95,9 +95,10 @@ Gemma-2-2B-IT internally represent the correct comparator but the last-layer MLP
 - [3. Part I — Geometry & Emergence](#3-part-i--geometry--emergence)
 - [4. Part II — Readout vs Representation](#4-part-ii--readout-vs-representation)
 - [5. Part III — Causal Edits (Patching & Ablations)](#5-part-iii--causal-edits-patching--ablations)
-- [6. Discussion](#6-discussion)
-- [7. Repro Notes](#7-repro-notes)
-- [8. Limitations & Next Steps](#8-limitations--next-steps)
+- [6. Limitations & Possible Next steps](#6-limitations--possible-next-steps)
+- [7. Appendix](#7-appendix)
+- [8. References](#8-references)
+- [9. Disclaimer](#9-disclaimer)
 
 ---
 
@@ -388,7 +389,7 @@ Results are not very different from the previous approach with similar trends.
 - (e) Work currently analyse only one model **Gemma-2-2b-it**. And also only instruction tuned model. A good study could have been how the last layer of non instruction tuned behaved vs the instruction tuned.
 - (f) Doesn't analyse negative numbers in comparison.
 
-## Appendix
+## 7. Appendix
 
 ### A. Harmful neurons
 
@@ -482,16 +483,27 @@ Basically trying out to see, h_j how strongly neuron j is firing and gradient pr
 
 Next step would be to multiply it by (1 if Yes else -1) to align by truth.
 
+### D. Code & Data Availability
 
-## References {#references}
+All code, notebooks, and datasets used in this analysis are available in the `sprint1` branch of the **som_numeric_comparison** repository on GitHub:
+[divyanshsinghvi/som_numeric_comparison · sprint1](https://github.com/divyanshsinghvi/som_numeric_comparison/tree/dsinghvi/sprint1).
+
+Key items include:
+- `experiment.ipynb` — main notebook with experiments, visualizations, probing & ablation code
+- `dataset_gen1.py` — script to generate numeric comparison datasets
+- `gemma_numeric_ab_dataset.jsonl`, `gemma_string_ab_dataset.jsonl` — datasets for numeric vs. string comparisons
+- Supporting helper scripts and requirements files for reproducibility
+
+
+## 8. References {#references}
 
 - Nanda, N., & Bloom, J. (2022). *TransformerLens*. GitHub repository. https://github.com/TransformerLensOrg/TransformerLens
 - Alain, G., & Bengio, Y. (2016). Understanding intermediate layers using linear classifier probes. arXiv:1610.01644. https://arxiv.org/abs/1610.01644
 
 
 
-## Disclaimer
-I only did this research in ~15 hours so there are lot of things unexplored and the quality of work can be significantly improved. Took a lot more time in writing than I expected (probably around 7 hours to refine ) .
+## 9. Disclaimer
+I only did this research in ~15 hours so there are lot of things unexplored and the quality of work can be significantly improved. Took a lot more time in writing than I expected (probably around 7 hours to refine ) .
 
 
 