[TRANSLATION] Initial version for vietnamese#223

Merged
burtenshaw merged 15 commits into huggingface:main from ngxson:xsn/vi_translation
Mar 5, 2025

Conversation

@ngxson
Member

@ngxson ngxson commented Feb 21, 2025

WIP, currently all files are generated by scripts/vi-translation.py without any review.

TODO:

  • @xrsrke and I will review it
  • After that @burtenshaw makes a final approval then we can merge

@HuggingFaceDocBuilderDev
Collaborator

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ngxson changed the title from "add vietnamese translation (wip)" to "[TRANSLATION] Initial version for vietnamese (wip)" on Feb 21, 2025
Member Author

@ngxson ngxson left a comment

Hmm OK, interestingly it writes Vietnamese more grammatically than I do lol

2. **Hướng dẫn chu kỳ** (Thought → Action → Observation)

```
Answer the following questions as best you can. You have access to the following tools:
```
Member Author

Maybe provide a translated version of this whole code block

Member Author

Added the translation inside a <details> collapsible block

ngxson changed the title from "[TRANSLATION] Initial version for vietnamese (wip)" to "[TRANSLATION] Initial version for vietnamese" on Feb 23, 2025
ngxson marked this pull request as ready for review on February 23, 2025, 13:05
@burtenshaw
Collaborator

Thanks @ngxson !

Do you think it's possible to parameterise scripts/vi-translation.py so that we could use it for other languages? For example, by adding a param like --language="vi" that selects the prompt. Other translators would then only need to convert the prompt.
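
A minimal sketch of what this suggestion could look like, assuming a single script with a per-language prompt registry. The `PROMPTS` dict, `build_prompt`, and `parse_args` are hypothetical names for illustration; they are not taken from scripts/vi-translation.py, and the prompt texts are placeholders.

```python
import argparse

# Hypothetical registry: one prompt template per language code.
PROMPTS = {
    "vi": "Translate the text into Vietnamese, keeping the original formatting.\n\n{content}",
    "fr": "Translate the text into French, keeping the original formatting.\n\n{content}",
}


def build_prompt(language: str, content: str) -> str:
    """Pick the template registered for `language` and fill in the text to translate."""
    try:
        template = PROMPTS[language]
    except KeyError:
        raise ValueError(f"no prompt registered for language {language!r}") from None
    return template.format(content=content)


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the (assumed) --language flag; only registered languages are accepted."""
    parser = argparse.ArgumentParser(description="Translate course files (sketch).")
    parser.add_argument("--language", default="vi", choices=sorted(PROMPTS))
    return parser.parse_args(argv)


# Pass None instead of a list to read the real command line.
args = parse_args(["--language", "fr"])
print(build_prompt(args.language, "Hello, agents!"))
```

With this shape, adding a language means adding one entry to the registry, which is the "other translators just convert the prompt" workflow described above.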

@ngxson
Member Author

ngxson commented Feb 24, 2025

Yup, I generalized the script. Users can create a new Python file for each language, for example fr.py, then import the auto_translate function:

from translation import auto_translate

output_lang = "fr"

prompt = lambda content: f'''
The prompt here........

=== BEGIN OF TEXT ===
{content}
=== END OF TEXT ===
'''.strip()

auto_translate(
    prompt=prompt,
    output_lang=output_lang,
)
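
The helper behind `auto_translate` is not shown in this thread, so the following is only a rough sketch of how such a function might be structured, not the actual implementation. It assumes the course files are `.mdx` files under a source directory and that the model call is injected as a plain callable; `auto_translate_sketch`, `source_dir`, `target_root`, and `call_llm` are all hypothetical names.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def auto_translate_sketch(
    prompt: Callable[[str], str],
    output_lang: str,
    source_dir: Path,
    target_root: Path,
    call_llm: Callable[[str], str],
) -> List[Path]:
    """Translate every .mdx file under source_dir into target_root/output_lang.

    The directory layout and the call_llm hook are assumptions for illustration.
    """
    written = []
    for src in sorted(source_dir.rglob("*.mdx")):
        # Mirror the source tree under the language folder.
        dst = target_root / output_lang / src.relative_to(source_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        translated = call_llm(prompt(src.read_text(encoding="utf-8")))
        dst.write_text(translated, encoding="utf-8")
        written.append(dst)
    return written


# Usage with a throwaway directory and a stand-in for the model call.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "en" / "unit1").mkdir(parents=True)
    (root / "en" / "unit1" / "intro.mdx").write_text("# Hello", encoding="utf-8")
    outputs = auto_translate_sketch(
        prompt=lambda content: f"Translate to French:\n{content}",
        output_lang="fr",
        source_dir=root / "en",
        target_root=root,
        call_llm=lambda text: text.upper(),  # placeholder, not a real LLM
    )
    print([p.relative_to(root).as_posix() for p in outputs])
```

Keeping the prompt as a callable, as in the fr.py example above, means the shared helper never needs to know anything language-specific.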

- title: Khi nào các bước tiếp theo được công bố?
  sections:
  - local: communication/next-units
    title: Các Chương tiếp theo
No newline at end of file
Contributor

Maybe don't capitalize the second word?

Member Author

Yeah maybe I did a Ctrl+F replace all and forgot about this. Thanks for reminding!

Member Author

Hmm OK, it seems that wasn't the cause; the LLM just copied the capitalization from my example in the prompt.

Co-authored-by: XλRI-U5 <phucnh791@gmail.com>
@burtenshaw
Collaborator

Hey @xrsrke and @ngxson 👋

Where are we on this?

@ngxson
Member Author

ngxson commented Mar 4, 2025

Yeah sorry, I think we can merge if @xrsrke is OK with it. Resolving the conflicts now.

For unit 2, I'll make a follow-up PR.

Contributor

@honghanhh honghanhh left a comment

Coucou! Big thanks for translating the course into Vietnamese 🙌 ! As a native speaker, I’ve got a few small tweaks to make it even smoother. Overall, great job; just a couple of adjustments and it'll be perfect! Keep up the awesome work!

Comment on lines +7 to +12
- Translate the text into Vietnamese, while keeping the original formatting (either Markdown, MDX or HTML)
- Inside code blocks, translate the comments but leave the code as-is ; If the code block contains quite plain texts, you MUST provide the translation in <details> tag.
- Do not translate inline code, the URLs and file paths
- If the term is abbreviated, keep the original term and provide the translation in parentheses for the first time it appears in the text.
- If there are any slag or funny joke in english, keep it (do not translate) and give an explanation so vietnamese reader can understand.
- Use "ta", "chúng ta", "chúng mình", "các bạn" as pronouns.
Contributor

Full stops aren’t a must, but they do keep things tidy 👀!
Having either every bullet end with a full stop or none of them would be much easier on the eyes of the OCD folks.

Member Author

Yeah, I'd say the full stops are redundant here and should be removed to save some tokens.
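
Since all files were generated without review, a rule such as "Do not translate inline code, the URLs and file paths" could also be spot-checked mechanically. The sketch below is one possible check, not part of the PR: `PROTECTED` and `missing_protected_spans` are hypothetical names, and the regex is a simplification that only catches single-backtick inline code and bare http(s) URLs.

```python
import re

# Inline code spans and http(s) URLs; a deliberate simplification of Markdown parsing.
PROTECTED = re.compile(r"`[^`\n]+`|https?://[^\s)>\]]+")


def missing_protected_spans(original: str, translated: str) -> list:
    """Return protected spans from `original` that do not appear verbatim in `translated`."""
    spans = dict.fromkeys(PROTECTED.findall(original))  # de-duplicate, keep order
    return [span for span in spans if span not in translated]


original = "Run `ollama serve`, see [ollama](https://ollama.com)."
good = "Chạy `ollama serve`, xem [ollama](https://ollama.com)."
bad = "Chạy `máy chủ ollama`, xem [ollama](https://ollama.com)."

print(missing_protected_spans(original, good))
print(missing_protected_spans(original, bad))
```

A non-empty result only flags a file for human review; it cannot tell whether the rest of the translation is good.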


KEEP THESE TERMS (DO NOT TRANSLATE, do NOT add translation in parentheses): model, API, SDK, CLI, HTML, GGUF, AI, training, inference, server, client, notebook, python, Hugging Face, transformers, diffusion, diffuser, data, function, LangGraph, LangChain, Llama, Gemma, token, Unit, pretrain, Live (live stream), form, format, certificate, Space, CodeAgent

Also KEEP these terms but PROVIDE TRANSLATION in parentheses for the first time it appears in the text: alignment (cân chỉnh), LLM, RAG (Tìm kiếm và tạo ra câu trả lời), Agent (tác nhân), Tools (công cụ), "Special Token" (Token đặc biệt), "chain-of-thought" (luồng suy luận), fine-tuning (tinh chỉnh), Thought-Action-Observation
Contributor

Maybe either capitalize the first letter inside the brackets everywhere or nowhere, for consistency 👀
Also, why does Thought-Action-Observation have no translation (e.g., Suy nghĩ - Hành động - Quan sát)?
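
One way to get the consistency asked for here would be to generate the term list from a small glossary instead of typing it by hand. This is only a sketch under that assumption: `GLOSSARY` and `render_glossary` are hypothetical, the glosses are the ones quoted in this thread, and `None` marks a term that is kept without a gloss.

```python
# Terms kept in English, mapped to the Vietnamese gloss shown on first use.
GLOSSARY = {
    "alignment": "cân chỉnh",
    "Agent": "tác nhân",
    "Tools": "công cụ",
    "fine-tuning": "tinh chỉnh",
    "Thought-Action-Observation": "Suy nghĩ - Hành động - Quan sát",
    "LLM": None,  # kept as-is, no gloss
}


def render_glossary(glossary: dict, capitalize: bool = False) -> str:
    """Render 'term (gloss)' pairs with one capitalization rule applied to every gloss."""
    parts = []
    for term, gloss in glossary.items():
        if gloss is None:
            parts.append(term)
            continue
        gloss = gloss.lower()
        if capitalize:
            gloss = gloss[0].upper() + gloss[1:]
        parts.append(f"{term} ({gloss})")
    return ", ".join(parts)


print(render_glossary(GLOSSARY))
print(render_glossary(GLOSSARY, capitalize=True))
```

Whichever style is chosen, the prompt then shows the LLM a single pattern to copy.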

- Module: Mô-đun
- Lesson ...: Bài ...
- Course: Khóa học
- state-of-the-art: nổi tiếng
Contributor

hiện đại nhất, tân tiến nhất, etc. might better describe the meaning of this word. Wdyt?

Member Author

Yes, that could be better. I tend to use short words because the text always gets longer when translated to Vietnamese; "hiện đại nhất" seems fine in this case.


Here is an example:
- Original text: To run the models, we will use [ollama](https://ollama.com), a command line tool that allows you to run LLMs and embedding models from Hugging Face. With ollama, you **don't need** to have access to a server or cloud service to run the models. You can run the models directly **on your computer**.
- Translation: Để chạy các model, ta sẽ sử dụng [ollama](https://ollama.com), một công cụ dòng lệnh cho phép bạn chạy LLMs và embedding models từ Hugging Face. Với ollama, bạn **không cần** phải tạo server hay truy cập API bên thứ 3. Bạn có thể chạy các model trực tiếp **trên máy tính của bạn**.
Contributor

I think "embedding models" and "server" can be translated into Vietnamese as well.

Member Author

server: no, because it's far too common a word in Vietnam haha

embedding: I'm not sure how to translate this word without losing its meaning. Many words are non-translatable, so I include an explanation the first time it appears in the document.


Here is another example:
- Original text: The model can then be **aligned** to the creator's preferences. For instance, a customer service chat model that must never be impolite to customers.
- Translation: Model sau đó có thể được **alignment** (cân chỉnh) theo mong muốn của người tạo. Ví dụ: model chat hỗ trợ khách hàng không bao giờ được bất lịch sự.
Contributor

model can be translated into Vietnamese (mô hình) as well.

Member Author

I'm not sure it's worth translating this word, as it's also very common among Vietnamese tech people.

Kết thúc chương bổ trợ này, bạn sẽ có thể:

- **Hiểu** cách hoạt động nội bộ của APIs khi sử dụng Tools
- **Tinh chỉnh** model bằng kỹ thuật LoRA
Contributor

"model" is kind of standard already, but for a translation, "mô hình" is still the native Vietnamese term. This applies to all files.


- **Hiểu** cách hoạt động nội bộ của APIs khi sử dụng Tools
- **Tinh chỉnh** model bằng kỹ thuật LoRA
- **Triển khai** và **tùy chỉnh** chu trình Thought → Act → Observe để tạo workflow Function-calling mạnh mẽ
Contributor

Suy nghĩ → Hành động → Quan sát

@@ -0,0 +1,378 @@
# Tin nhắn và Special Token

Giờ ta đã hiểu cách LLM (Language Model) hoạt động, hãy cùng xem **cách chúng tổ chức các phản hồi thông qua chat templates**.
Contributor

LLM (Mô hình ngôn ngữ lớn)


## Large Language Model là gì?

LLM (Large Language Model) là một loại model AI **giỏi hiểu và tạo ra ngôn ngữ con người**. Chúng được training trên lượng lớn data văn bản, cho phép học các mẫu, cấu trúc và sắc thái ngôn ngữ. Các model này thường có hàng triệu parameters.
Contributor

Large Language Model = mô hình ngôn ngữ lớn
Các mô hình này thường có hàng triệu tham số.


Gì nữa? Lại Quiz á? Chúng mình biết, chúng mình biết... 😅 Nhưng bài kiểm tra ngắn không chấm điểm này giúp bạn **củng cố các khái niệm quan trọng vừa học**.

Quiz này bao gồm Large Language Models (LLMs), hệ thống tin nhắn và tools - những thành phần thiết yếu để hiểu và xây dựng AI Agent.
Contributor

Mô hình ngôn ngữ lớn (LLM)

@burtenshaw
Collaborator

Hey Son, could you take a look at @honghanhh's comments?

Collaborator

@burtenshaw burtenshaw left a comment

This looks good to me!

@honghanhh Thanks for your improvements. Could you please take these up in a new PR? That way, you'll get authorship.

@burtenshaw burtenshaw merged commit f1be499 into huggingface:main Mar 5, 2025
1 check passed
@ngxson
Member Author

ngxson commented Mar 5, 2025

Hi @honghanhh, and thanks for taking the time to review this. Sorry, I don't have much time this week, so I can't process all of your suggestions.

Would you mind opening a dedicated PR? That will be easier for me to review. Thanks!

lexaneon pushed a commit to lexaneon/hugging-face-agents-course that referenced this pull request Mar 19, 2025
[TRANSLATION] Initial version for vietnamese
giacomosansoni pushed a commit to giacomosansoni/agents-course that referenced this pull request May 17, 2025
[TRANSLATION] Initial version for vietnamese
richtunnel added a commit to richtunnel/agents-course that referenced this pull request Feb 7, 2026
[TRANSLATION] Initial version for vietnamese