Commit fec8763
Reduce embedding encode batch size to prevent GPU OOM
The hardcoded batch_size=256 caused CUDA OOM on the 8GB Vast.ai
GPU when the backend already has models loaded for serving.

1 parent b48d733 commit fec8763
1 file changed, +1 -1 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 78 | 78 | |
| 79 | 79 | |
| 80 | 80 | |
| 81 | | - |
| | 81 | + |
| 82 | 82 | |
| 83 | 83 | |
| 84 | 84 | |
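
As a rough illustration of the idea in the commit message, the sketch below encodes texts in chunks with a configurable batch size instead of a hardcoded 256, and halves the chunk size whenever the encoder runs out of memory. It is not the code from this commit: `encode_in_batches`, the `EMBED_BATCH_SIZE` environment variable, and the default of 32 are all hypothetical names and values, and `MemoryError` stands in for the GPU out-of-memory exception so the example runs without a GPU.

```python
import os
from typing import Callable, List, Sequence

# Hypothetical setting name and default; the commit does not show the new value.
DEFAULT_BATCH_SIZE = int(os.environ.get("EMBED_BATCH_SIZE", "32"))


def encode_in_batches(
    texts: Sequence[str],
    encode_fn: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = DEFAULT_BATCH_SIZE,
    oom_error: type = MemoryError,  # stand-in for a GPU OOM exception
) -> List[List[float]]:
    """Encode texts in chunks, halving the chunk size when encode_fn hits OOM."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    vectors: List[List[float]] = []
    start = 0
    while start < len(texts):
        chunk = texts[start:start + batch_size]
        try:
            vectors.extend(encode_fn(chunk))
        except oom_error:
            if batch_size == 1:
                raise  # cannot shrink the chunk any further
            batch_size = max(1, batch_size // 2)
            continue  # retry from the same position with a smaller chunk
        start += len(chunk)
    return vectors
```

With a real model, `encode_fn` would wrap the model's encode call, and `oom_error` could be set to the framework's own exception (in recent PyTorch versions, `torch.cuda.OutOfMemoryError`). Lowering the starting batch size, as this commit does, avoids the failure up front; the halving loop is only a fallback for cases where free GPU memory varies because other models are already loaded.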