Sanity Check Fails #1

@good-epic

Description

I cloned the repo and am trying to run it locally to figure out some dimension issues I'm having in my refactor. In order to run it locally, I changed the model to "gpt2-small", and changed:

device = "cuda"
layers = [7, 14, 21, 40]
l0s = [92, 67, 129, 125]
saes = [SAE.from_pretrained(release="gemma-scope-9b-pt-res",
                            sae_id=f"layer_{layers[i]}/width_16k/average_l0_{l0s[i]}",
                            device=device)[0] for i in range(len(layers))]

to

device = "cuda"
layers = [3, 5, 7, 9]
saes = [SAE.from_pretrained(release="jbloom/GPT2-Small-SAEs-Reformatted",
                            sae_id=f"blocks.{layer}.hook_resid_pre",
                            device=device)[0] for layer in layers]

The sanity check runs the first ten batches of clean_tokens through the model twice: once directly through the forward function, and once with the hooks from build_hooks_list passed as fwd_hooks. These give very different values; the only similarity is that the signs match. Any idea why this might not be working? I tried limiting it to just two layers, both with 99.9% of variance explained by their SAEs, but the results are still not close.
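For what it's worth, sign agreement alone is a weak criterion: two output tensors can match in sign everywhere while being far apart in magnitude. A quick generic check (synthetic data only, not the repo's code) that reports both metrics:

```python
import numpy as np

def compare_outputs(baseline, hooked):
    """Fraction of matching signs vs. relative L2 error between two outputs."""
    sign_match = float(np.mean(np.sign(baseline) == np.sign(hooked)))
    rel_err = float(np.linalg.norm(baseline - hooked) / np.linalg.norm(baseline))
    return sign_match, rel_err

# Synthetic illustration: scale each entry by a random positive factor,
# so signs are preserved but magnitudes diverge.
rng = np.random.default_rng(0)
baseline = rng.normal(size=1000)
hooked = baseline * rng.uniform(1.0, 10.0, size=1000)

sign_match, rel_err = compare_outputs(baseline, hooked)
```

Here sign_match is 1.0 while the relative error is large, which is exactly the failure pattern described above.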
