
Commit d8abfd6

Update README to remove data efficiency details
Removed specific data efficiency metric from the README.
1 parent f30e668 commit d8abfd6

File tree

1 file changed: +1 addition, −2 deletions


README.md

Lines changed: 1 addition & 2 deletions
```diff
@@ -86,13 +86,12 @@ A family of fully open-source large multimodal models demonstrating
 - outperforming **Qwen2.5-VL** in most evaluation tasks.
 
 - **High-Quality Data at Scale**
-  Meticulously curated **pre-training and SFT data** with rigorous filtering and quality control, achieving **superior data efficiency** with only **64B tokens**.
+  Meticulously curated **pre-training and SFT data** with rigorous filtering and quality control.
   - Concept-balanced, highly diverse, high-quality caption data
   - Comprehensive instruction fine-tuning data covering a wide range of tasks
 
 - **Ultra-Efficient Training Framework** Complete end-to-end training framework designed for maximum efficiency:
   - $16000 total budget for full model training on A100 GPUs ($0.6 per GPU/Hour)
-  - 45% HFU efficiency in 8k context length
   - Built on **MegatronLM** with support for **MoE**, **FP8**, and **long sequence parallelization**
   - Optimized codebase for cost-effective scaling
 
```
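The budget bullet retained in the diff quotes $16,000 total on A100 GPUs at $0.6 per GPU-hour. A quick back-of-the-envelope sketch of what those two figures imply — the GPU count and wall-clock numbers below are hypothetical examples, not stated in the README:

```python
# Illustrative arithmetic from the README's quoted figures:
# $16,000 total budget at $0.6 per A100 GPU-hour.
total_budget_usd = 16_000
rate_usd_per_gpu_hour = 0.6

# Total GPU-hours the stated budget buys.
gpu_hours = total_budget_usd / rate_usd_per_gpu_hour
print(f"{gpu_hours:,.0f} GPU-hours")  # → 26,667 GPU-hours

# On a hypothetical 256-GPU cluster (not a figure from the README),
# that corresponds to roughly this much wall-clock time:
wall_clock_days = gpu_hours / 256 / 24
print(f"~{wall_clock_days:.1f} days on 256 GPUs")  # → ~4.3 days on 256 GPUs
```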
0 commit comments
