@ver217 (Contributor) commented Feb 6, 2025

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs
  • I have installed pre-commit: pip install pre-commit && pre-commit install

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?

Summarize your work here.
If you have any plots/diagrams/screenshots/tables, please attach them here.

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests.
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

@ver217 requested a review from a team as a code owner February 6, 2025 08:53
@ver217 force-pushed the feature/deepseek-v3 branch from ab91a06 to 3f84584 February 6, 2025 09:22
@ver217 merged commit 2b415e5 into hpcaitech:main Feb 11, 2025
6 checks passed
@ver217 deleted the feature/deepseek-v3 branch February 11, 2025 08:11
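@xs1997zju

@ver217 great job! A question: for full-parameter bf16 training of V3-671B, how many nodes did you use, and what is the maximum sequence length (in k tokens) you were able to train?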

@Issues-translate-bot

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


@ver217 great job. I would like to ask: for V3-671B with full bf16 training, how many machines were used, and how many k tokens can the maximum training length reach?

4 participants