**Note**: Training requires a CUDA-enabled GPU and takes significant time (80 models in total).

### Local Training

```bash
# Using the CLI (recommended - handles all steps automatically)
./run_llm_stylometry.sh --train

# Limit GPU usage if needed
./run_llm_stylometry.sh --train --max-gpus 4
```

This command will:
@@ -179,6 +186,57 @@ This command will:
The training pipeline automatically handles data preparation, model training across available GPUs, and result consolidation. Individual model checkpoints and loss logs are saved in the `models/` directory.

### Remote Training on GPU Server

For training on a remote GPU server, use the provided `remote_train.sh` script:

```bash
# Start remote training
./remote_train.sh

# You'll be prompted for:
# - Server address (hostname or IP)
# - Username
# - Password (for SSH)
```

This script will:
1. Connect to your GPU server via SSH
2. Clone or update the repository in `~/llm-stylometry`
3. Start training in a `screen` session that persists after disconnection
4. Allow you to safely disconnect while training continues

To monitor training progress:
```bash
ssh username@server
screen -r llm_training  # Reattach to training session
# Press Ctrl+A, then D to detach again
```
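As a rough illustration of steps 2 and 3 listed for `remote_train.sh` above, the sketch below builds the command such a script might send over SSH. It is not the actual contents of `remote_train.sh`: the function name `build_remote_command` and the repository URL argument are placeholders introduced here. Only the `~/llm-stylometry` path, the `llm_training` session name, and the `--train` flag come from this README.

```bash
#!/usr/bin/env bash
# Hypothetical sketch only; remote_train.sh itself may work differently.

# Print the commands to run on the server.
#   $1: repository URL (placeholder, supply your own)
#   $2: screen session name
build_remote_command() {
  local repo_url="$1" session="$2"
  # Unquoted heredoc: the two variables expand locally, while ~ is left
  # untouched so that the remote shell expands it.
  cat <<EOF
if [ -d ~/llm-stylometry/.git ]; then git -C ~/llm-stylometry pull; else git clone ${repo_url} ~/llm-stylometry; fi
cd ~/llm-stylometry && screen -dmS ${session} ./run_llm_stylometry.sh --train
EOF
}
```

The output could then be passed to SSH, for example `ssh username@server "$(build_remote_command "$REPO_URL" llm_training)"`. `screen -dmS` starts the session already detached, which is what would let training continue after the SSH connection closes.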

### Downloading Trained Models

After training completes on a remote server, use `sync_models.sh` to download the models:

```bash
# Download trained models from server
./sync_models.sh

# You'll be prompted for:
# - Server address
# - Username
# - Password
```

This script will:
1. Verify all 80 models are complete with weights
2. Create a compressed archive on the server
3. Download via rsync with progress indication
4. Extract to your local `~/llm-stylometry/models/` directory
5. Back up any existing local models
6. Also sync `model_results.pkl` if available

**Note**: The script will only download if all 80 models are complete. If training is still in progress, it will show which models are missing.
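The completeness check in step 1 can be pictured with a small sketch. It assumes that each model lives in its own subdirectory of `models/` and counts as complete once a weights file exists there. The function names and the weights file name (`model.safetensors` in the usage line below) are assumptions made for illustration, not details taken from `sync_models.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; sync_models.sh may verify models differently.

# Print the names of model directories under $1 that lack the weights file $2.
list_incomplete_models() {
  local models_dir="$1" weights_name="$2" dir
  for dir in "$models_dir"/*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    if [ ! -f "${dir}${weights_name}" ]; then
      basename "$dir"
    fi
  done
}

# Count model directories under $1 that contain the weights file $2.
count_complete_models() {
  local models_dir="$1" weights_name="$2" count=0 dir
  for dir in "$models_dir"/*/; do
    if [ -f "${dir}${weights_name}" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}
```

Under these assumptions, `count_complete_models ~/llm-stylometry/models model.safetensors` would need to print `80` before a download proceeds. Otherwise `list_incomplete_models` with the same arguments would show which directories are still missing weights.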