```

See main README for full training documentation.

## HuggingFace Model Training (High-Quality Public Models)

Train high-quality single models (one per author) for public release on HuggingFace.

### Training Commands

**Local training:**
```bash
./train_hf_models.sh --author baum   # Single author, 50k epochs
./train_hf_models.sh --all           # All 8 authors in parallel
```

**Remote GPU training:**
```bash
./remote_train_hf.sh --cluster mycluster --all                       # All 8 authors
./remote_train_hf.sh --cluster mycluster --author baum               # Single author
./remote_train_hf.sh --cluster mycluster --all --max-epochs 100000   # Higher limit
```

**Monitor training:**
```bash
./check_hf_status.sh --cluster mycluster
# Shows current epoch, loss, progress, ETA per author
```

**Download completed models:**
```bash
./sync_hf_models.sh --cluster mycluster --all
# Downloads to models_hf/{author}_tokenizer=gpt2/
```

### Uploading to HuggingFace

**Prerequisites:**
- Credentials file: `.huggingface/credentials.json`
- Format: `{"username": "contextlab", "token": "hf_..."}`
- Completed models in `models_hf/` directory
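
For reference, the credentials file is plain JSON; a sketch with the token left as a placeholder:

```json
{
  "username": "contextlab",
  "token": "hf_..."
}
```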

**Generate model cards:**
```bash
python code/generate_model_card.py --author baum --model-dir models_hf/baum_tokenizer=gpt2
```
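
HuggingFace model cards are `README.md` files that open with a YAML metadata header. The exact fields emitted by `generate_model_card.py` may differ, but a typical header looks like this (values illustrative):

```yaml
language: en
license: mit
tags:
  - text-generation
  - gpt2
pipeline_tag: text-generation
```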

**Upload models:**
```bash
./upload_to_huggingface.sh --author baum --dry-run   # Test
./upload_to_huggingface.sh --author baum             # Upload
./upload_to_huggingface.sh --all                     # Upload all
```

Models published to: `contextlab/gpt2-{author}` (e.g., `contextlab/gpt2-baum`)
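
Once uploaded, a model can be loaded like any ordinary GPT-2 checkpoint via the `transformers` library. A minimal sketch (the helper names here are illustrative, and the repo must already exist on the Hub):

```python
def author_repo_id(author: str) -> str:
    """Build the Hub repo id used for a given author, e.g. 'contextlab/gpt2-baum'."""
    return f"contextlab/gpt2-{author}"


def load_author_model(author: str):
    """Download and return (tokenizer, model) for one author's published model."""
    # Lazy import so the helper can be defined without transformers installed.
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    repo_id = author_repo_id(author)
    tokenizer = GPT2TokenizerFast.from_pretrained(repo_id)
    model = GPT2LMHeadModel.from_pretrained(repo_id)
    return tokenizer, model


# Example (requires network access and a published model):
# tokenizer, model = load_author_model("baum")
# inputs = tokenizer("Dorothy lived in the midst of", return_tensors="pt")
# output = model.generate(**inputs, max_new_tokens=40, do_sample=True)
# print(tokenizer.decode(output[0], skip_special_tokens=True))
```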

### HuggingFace vs Paper Models

**Paper models:** 320 models (8 authors × 10 seeds × 4 conditions)
- Used for all figures and statistical analysis
- Trained to loss ≤ 3.0 for consistent comparison
- Available via Dropbox download

**HuggingFace models:** 8 models (1 per author)
- For public use and text generation
- Trained for 50,000 additional epochs beyond the paper models
- Much lower loss (~1.3-1.6) for better generation quality
- Will be available at https://huggingface.co/contextlab