Commit fd0b661

fix: update image filenames in bias visualization documentation for clarity
1 parent ea96134 commit fd0b661

2 files changed: +10 -3 lines changed

.github/workflows/docs.yml

Lines changed: 7 additions & 0 deletions
@@ -6,11 +6,18 @@ on:
       - 'docs/**'
       - 'mkdocs.yml'

+permissions:
+  contents: write
+  pages: write
+  id-token: write
+
 jobs:
   deploy:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
       - uses: actions/setup-python@v4
         with:
           python-version: 3.x

docs/bias_visualization.md

Lines changed: 3 additions & 3 deletions
@@ -153,19 +153,19 @@ The metrics returned by `visualize_bias` include:

 ### Mean Differences

-![Mean Differences Example](images/mean_diff_example.png)
+![Mean Differences Example](images/mean_image_differences.png)

 This visualization shows how the magnitude of activation differences varies across layers. Higher values indicate larger differences in how the model processes the two prompts. Increasing values in deeper layers often indicate bias amplification through the network.

 ### Heatmaps

-![Heatmap Example](images/heatmap_example.png)
+![Heatmap Example](images/activation_differences_layer.png)

 Heatmaps show detailed patterns of activation differences within specific layers. Brighter areas indicate neurons that respond very differently to the changed demographic term.

 ### PCA Analysis

-![PCA Example](images/pca_example.png)
+![PCA Example](images/pca_analysis.png)

 The PCA visualization reduces high-dimensional activations to 2D, showing how token representations shift when changing a demographic term. Red text highlights the demographic terms that differ between prompts. Arrows connect corresponding tokens across the two prompts.
