See the [Package API](#package-api) section for all available functions.

## Analysis Variants

The project analyzes three linguistic variants to understand what stylistic features models learn (Supplemental Figures S1-S8):

**Content-Only** (`-co`, `--content-only`): Masks function words with `<FUNC>`, preserving only content words (nouns, verbs, adjectives). Tests vocabulary and word choice (Supp. Figs. S1, S4, S7A, S8A).

**Function-Only** (`-fo`, `--function-only`): Masks content words with `<CONTENT>`, preserving only function words (articles, prepositions, conjunctions). Tests grammatical structure.

**Part-of-Speech** (`-pos`, `--part-of-speech`): Replaces words with POS tags (Universal Dependencies tagset). Tests syntactic patterns.

All CLI commands accept variant flags. Without a flag, the baseline condition is used. Each variant trains 80 models (8 authors × 10 seeds). See [Training Models from Scratch](#training-models-from-scratch) for training details.
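
For intuition, here is a minimal sketch of what the three transformations produce. It assumes spaCy and its Universal Dependencies POS tags purely for illustration; the project's actual tokenizer, tagger, and word lists may differ.

```python
# Illustrative sketch only: a simple content/function split based on
# Universal Dependencies POS tags (assumed; the repo's preprocessing may differ).
import spacy

nlp = spacy.load("en_core_web_sm")

# Assumed set of content-word POS tags (nouns, verbs, adjectives, adverbs, ...).
CONTENT_POS = {"NOUN", "PROPN", "VERB", "ADJ", "ADV", "NUM"}

def make_variants(text: str):
    doc = nlp(text)
    content_only = " ".join(
        tok.text if tok.pos_ in CONTENT_POS else "<FUNC>" for tok in doc
    )
    function_only = " ".join(
        "<CONTENT>" if tok.pos_ in CONTENT_POS else tok.text for tok in doc
    )
    part_of_speech = " ".join(tok.pos_ for tok in doc)
    return content_only, function_only, part_of_speech

co, fo, pos = make_variants("The quick brown fox jumps over the lazy dog.")
# co  ≈ "<FUNC> quick brown fox jumps <FUNC> <FUNC> lazy dog <FUNC>"
# fo  ≈ "The <CONTENT> <CONTENT> <CONTENT> <CONTENT> over the <CONTENT> <CONTENT> ."
# pos ≈ "DET ADJ ADJ NOUN VERB ADP DET ADJ NOUN PUNCT"
```
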
Variant models converge much faster than baseline models (all cross 3.0 loss by epochs 15-16) and may converge to different final losses. To ensure fair comparison, **fairness-based loss thresholding** is automatically applied to variant figures (1A, 1B, 3, 4, 5):

1. **Compute threshold**: Maximum of all models' minimum training losses within 500 epochs
2. **Truncate data**: Keep all epochs up to and including the first epoch where training loss ≤ threshold
3. **Fair comparison**: All models compared at the same training loss level (the fairness threshold)
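
The thresholding logic is roughly the following; a sketch that assumes per-model loss curves stored as plain lists of per-epoch training losses, not the package's actual data structures:

```python
# Sketch of fairness-based loss thresholding. Assumes loss_curves maps a model id
# to its list of per-epoch training losses (illustrative; not the package's API).
def fairness_truncate(loss_curves: dict[str, list[float]]):
    # 1. Threshold: the maximum, across models, of each model's minimum loss.
    threshold = max(min(losses) for losses in loss_curves.values())

    truncated = {}
    for model_id, losses in loss_curves.items():
        # 2. Keep epochs up to and including the first one at or below the
        #    threshold (one always exists: every model's minimum is <= threshold).
        cutoff = next(i for i, loss in enumerate(losses) if loss <= threshold)
        truncated[model_id] = losses[: cutoff + 1]

    # 3. All truncated curves now end at or below the same loss level.
    return threshold, truncated
```
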
This ensures models are not unfairly compared when some converged to higher losses than others. The feature is enabled by default for variants and can be disabled (with the `--no-fairness` flag, or by passing `apply_fairness=False`):

```python
from llm_stylometry.visualization import generate_all_losses_figure

fig = generate_all_losses_figure(
    data_path='data/model_results_function.pkl',
    variant='function',
    apply_fairness=True  # default for variants
)
```
**Statistical analysis:**

```bash
./run_stats.sh # All variants (default)
```

**Note**: T-test figures (2A, 2B) never apply fairness thresholding since they require all 500 epochs for statistical calculations.