
Commit 421b429

Fixes
1 parent 3c8ec55 commit 421b429

File tree

1 file changed (+6, -6 lines)

_posts/2025-09-12-debugging-numeric-comparisons-llms.md

Lines changed: 6 additions & 6 deletions
@@ -1,6 +1,6 @@
 ---
 title: "Debugging Numeric Comparisons in LLMs"
-subtitle: "Why models like Gemma-2-2B-IT fail at `9.8 < 9.11`"
+subtitle: "Why models like Gemma-2-2B-IT fail at `9.11 > 9.8`"
 date: 2025-09-12
 description: "Layerwise geometry shows the model internally separates Yes/No, but a last-layer readout corrupts the decision—especially for decimals."
 tags: [LLM, Mechanistic Interpretability, Numeric Reasoning, Debugging]
@@ -14,14 +14,14 @@ author: divyansh singhvi and LLMs
 Study on Gemma-2-2B-IT

 > **TL;DR:**
-> LLMs *internally* represent the correct way to compare numbers (80–90% accuracy in penultimate layers) but the **final layer corrupts this knowledge**, causing simple failures like `9.8 < 9.11`.
+> LLMs *internally* represent the correct way to compare numbers (80–90% accuracy in penultimate layers) but the **final layer corrupts this knowledge**, causing simple failures like `9.11 > 9.8`.

 ---

 # Executive Summary

 ## The Problem: A paradox in LLM Capabilities
-LLMs can ace complex reasoning yet still fail at simple numeric comparisons like 9.8 < 9.11. Using Gemma-2-2B-IT, I ask: **Does the model internally represent the correct Yes/No decision, and if so, where does the failure happen?** This matters because robust numeric comparison is a prerequisite for any downstream task that relies on arithmetic or ordering.
+LLMs can ace complex reasoning yet still fail at simple numeric comparisons like 9.11 > 9.8. Using Gemma-2-2B-IT, I ask: **Does the model internally represent the correct Yes/No decision, and if so, where does the failure happen?** This matters because robust numeric comparison is a prerequisite for any downstream task that relies on arithmetic or ordering.

 ## High-level takeaways

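For context on the TL;DR claim about penultimate-layer accuracy, here is a minimal sketch of how a layerwise Yes/No readout could be measured on Gemma-2-2B-IT. The prompt wording, the label construction, and the per-layer logistic-regression probe are illustrative assumptions, not the post's exact setup:

```python
# Minimal sketch (assumed setup): linearly probe each layer's hidden state for
# the correct Yes/No answer to a decimal comparison prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_ID = "google/gemma-2-2b-it"  # model studied in the post
tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
model.eval()

def layerwise_states(a: str, b: str):
    """Hidden state of the last prompt token at every layer (embeddings + blocks)."""
    prompt = f"Is {a} greater than {b}? Answer Yes or No. Answer:"  # assumed prompt
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return [h[0, -1].float().numpy() for h in out.hidden_states]

# Small illustrative set of decimal pairs; label = 1 when the correct answer is "Yes".
pairs = [("9.8", "9.11"), ("9.11", "9.8"), ("7.4", "7.39"),
         ("3.05", "3.5"), ("2.71", "2.7"), ("1.85", "1.9")]
labels = [int(float(a) > float(b)) for a, b in pairs]

per_layer = None
for a, b in pairs:
    states = layerwise_states(a, b)
    if per_layer is None:
        per_layer = [[] for _ in states]
    for i, vec in enumerate(states):
        per_layer[i].append(vec)

# One linear probe per layer; on a real dataset, held-out accuracy in the
# penultimate layers is what the 80-90% figure would refer to.
for i, X in enumerate(per_layer):
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    print(f"layer {i:2d}: train accuracy {probe.score(X, labels):.2f}")
```

A probe like this peaking in the penultimate layers while the final logits still prefer the wrong answer would match the claim that the last-layer readout corrupts an otherwise correct internal decision.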
@@ -103,7 +103,7 @@ Gemma-2-2B-IT internally represent the correct comparator but the last-layer MLP
 ---

 ## 1. Motivation
-Large Language Models (LLMs) have demonstrated Olympiad-level performance in complex reasoning, yet they paradoxically stumble on fundamental operations like basic numeric comparisons `9.8 < 9.11`. If a model cannot reliably perform basic numeric comparisons, its utility in downstream tasks that depend on such reasoning is severely compromised. This study investigates why a model like Gemma-2-2B-IT fails at these seemingly simple evaluations.
+Large Language Models (LLMs) have demonstrated Olympiad-level performance in complex reasoning, yet they paradoxically stumble on fundamental operations like basic numeric comparisons `9.11 > 9.8`. If a model cannot reliably perform basic numeric comparisons, its utility in downstream tasks that depend on such reasoning is severely compromised. This study investigates why a model like Gemma-2-2B-IT fails at these seemingly simple evaluations.

 ---

@@ -384,10 +384,10 @@ Results are not very different from the previous approach with similar trends.

 - (a) **Prompt diversity**: Only a single prompt was analysed; different types of prompts could be tried.
 - (b) **String Analysis**: More exploration is needed into why strings were biased towards *No*.
-- (c) Tokenization was not deeply analysed. The only observation was that every number and comparator operator was tokenized individually.
+- (c) Tokenization was not deeply analysed.
 - (d) Didn't try **PCA post ablation** to see how the geometry changes.
 - (e) This work currently analyses only one model, **Gemma-2-2b-it**, and only an instruction-tuned one. A good study would be how the last layer of a non-instruction-tuned model behaves vs the instruction-tuned one.
-- (f) Doesn't analyse negative numbers in comparison.
+- (f) Doesn't analyse negative numbers in comparison.

 ## 7. Appendix

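Limitation (c) originally noted that every number and the comparator operator get tokenized individually. A quick way to check that observation with the Gemma-2-2B-IT tokenizer (the example strings are assumptions, not taken from the post):

```python
# Sketch: inspect how the Gemma-2-2B-IT tokenizer splits a decimal comparison.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
for text in ["9.11 > 9.8", "Is 9.8 greater than 9.11?"]:
    print(f"{text!r} -> {tok.tokenize(text)}")
```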