4 changes: 2 additions & 2 deletions requirements.txt
@@ -1,6 +1,6 @@
 torch>=1.13.1
-huggingface-hub==0.29.1
-transformers==4.49.0
+huggingface-hub>=0.34.0,<1.0
+transformers>=4.51.0
⚠️ Potential issue

🧩 Analysis chain

Add an upper bound to transformers to prevent future major‑version breakage.

Guard against a v5 jump changing APIs.

Apply:

-transformers>=4.51.0
+transformers>=4.51.0,<5.0

Also, verify 4.51.x remains compatible with your Torch floor (torch>=1.13.1) and accelerate (>=0.27.2); Transformers' optional PyTorch features may expect a newer Torch than that floor guarantees.


🌐 Web query:

What PyTorch versions are officially supported by Transformers 4.51.x, and does it require Python >=3.9?

💡 Result:

Short answer: Transformers 4.51.x requires Python >= 3.9 and is intended to be used with PyTorch 2.0 or newer (i.e., PyTorch 2.0+). [1][2]

Sources:

  • transformers 4.51.3 (PyPI) — Requires: Python >=3.9; "Transformers works with ... PyTorch 2.0+." [1]
  • Transformers installation docs (nearby 4.x releases) confirming PyTorch 2.x compatibility and Python 3.9+ testing. [2]

Add upper bound to transformers and resolve PyTorch/Python compatibility

Apply:

-transformers>=4.51.0
+transformers>=4.51.0,<5.0
  • File: requirements.txt (line 3).
  • Rationale: Transformers 4.51.x requires Python >=3.9 and PyTorch 2.0+, so the current floor torch>=1.13.1 admits Torch versions it cannot run on. Either raise the torch floor to >=2.0 (and confirm accelerate compatibility) or pin transformers to a 4.x release that supports torch 1.13.1. Verify the CI/runtime Python version is >=3.9.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
transformers>=4.51.0
transformers>=4.51.0,<5.0
🤖 Prompt for AI Agents
In requirements.txt around line 3, transformers is pinned to >=4.51.0 which
requires Python >=3.9 and PyTorch 2.0+, causing incompatibility with the repo's
torch>=1.13.1; either (A) upgrade torch to >=2.0.0 (and ensure accelerate is
compatible) and confirm CI/runtime Python >=3.9, or (B) constrain transformers
to a 4.x release compatible with torch 1.13.1 (e.g., <=4.50.*) and verify
CI/runtime Python is still supported; update the requirements.txt accordingly
and run CI to ensure all dependency combinations pass.
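The runtime side of this check can be sketched as a hypothetical startup guard (not part of this PR; names are illustrative) that fails fast when the installed Torch predates the 2.0 floor Transformers 4.51.x expects:

```python
# Hypothetical startup guard: fail fast if the installed torch is older
# than the 2.0 floor that transformers 4.51.x is documented to expect.
from importlib.metadata import PackageNotFoundError, version


def torch_meets_floor(installed: str, floor: tuple = (2, 0)) -> bool:
    """Compare only the leading major.minor numbers of a version string."""
    parts = installed.split("+")[0].split(".")  # drop local tags like "+cu121"
    major, minor = int(parts[0]), int(parts[1])
    return (major, minor) >= floor


def check_torch() -> None:
    try:
        installed = version("torch")
    except PackageNotFoundError:
        raise RuntimeError("torch is not installed; transformers 4.51.x needs torch>=2.0")
    if not torch_meets_floor(installed):
        raise RuntimeError(
            f"torch {installed} is too old for transformers 4.51.x (needs >=2.0)"
        )
```

Calling `check_torch()` once at import time surfaces the mismatch as a clear error instead of an obscure failure deep inside Transformers.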

datasets>=2.14.3
accelerate>=0.27.2
loguru==0.7.0
57 changes: 57 additions & 0 deletions src/core/constant.py
@@ -14,6 +14,7 @@
"Qwen/Qwen2.5-32B-Instruct",
"Qwen/Qwen2.5-72B",
"Qwen/Qwen2.5-72B-Instruct",
"Qwen/Qwen3-4B-Instruct-2507",
# yi 1.5
"01-ai/Yi-1.5-6B",
"01-ai/Yi-1.5-6B-Chat",
@@ -50,3 +51,59 @@
"microsoft/Phi-4-mini-instruct",
"microsoft/phi-4",
]

MODEL_TEMPLATE_MAP = {
# Qwen
"Qwen/Qwen2.5-0.5B": "qwen1.5",
"Qwen/Qwen2.5-0.5B-Instruct": "qwen1.5",
"Qwen/Qwen2.5-1.5B": "qwen1.5",
"Qwen/Qwen2.5-1.5B-Instruct": "qwen1.5",
"Qwen/Qwen2.5-3B": "qwen1.5",
"Qwen/Qwen2.5-3B-Instruct": "qwen1.5",
"Qwen/Qwen2.5-7B": "qwen1.5",
"Qwen/Qwen2.5-7B-Instruct": "qwen1.5",
"Qwen/Qwen2.5-14B": "qwen1.5",
"Qwen/Qwen2.5-14B-Instruct": "qwen1.5",
"Qwen/Qwen2.5-32B": "qwen1.5",
"Qwen/Qwen2.5-32B-Instruct": "qwen1.5",
"Qwen/Qwen2.5-72B": "qwen1.5",
"Qwen/Qwen2.5-72B-Instruct": "qwen1.5",
"Qwen/Qwen3-4B-Instruct-2507": "qwen3",
# Yi
"01-ai/Yi-1.5-6B": "yi",
"01-ai/Yi-1.5-6B-Chat": "yi",
"01-ai/Yi-1.5-9B": "yi",
"01-ai/Yi-1.5-9B-Chat": "yi",
"01-ai/Yi-1.5-34B": "yi",
"01-ai/Yi-1.5-34B-Chat": "yi",
# Mistral
"mistralai/Mistral-7B-v0.3": "mistral",
"mistralai/Mistral-7B-Instruct-v0.3": "mistral",
"mistralai/Ministral-8B-Instruct-2410": "mistral",
# Mixtral
"mistralai/Mixtral-8x7B-v0.1": "mixtral",
"mistralai/Mixtral-8x7B-Instruct-v0.1": "mixtral",
# Gemma 2
"google/gemma-2-2b": "gemma",
"google/gemma-2-9b": "gemma",
"google/gemma-2-27b": "gemma",
"google/gemma-2-2b-it": "gemma",
"google/gemma-2-9b-it": "gemma",
"google/gemma-2-27b-it": "gemma",
# LLaMA 3 + 3.1
"meta-llama/Meta-Llama-3-8B": "llama3",
"meta-llama/Meta-Llama-3-8B-Instruct": "llama3",
"meta-llama/Meta-Llama-3-70B": "llama3",
"meta-llama/Meta-Llama-3-70B-Instruct": "llama3",
"meta-llama/Meta-Llama-3.1-8B": "llama3",
"meta-llama/Meta-Llama-3.1-8B-Instruct": "llama3",
"meta-llama/Meta-Llama-3.1-70B": "llama3",
"meta-llama/Meta-Llama-3.1-70B-Instruct": "llama3",
# Phi 3
"microsoft/Phi-3.5-mini-instruct": "phi3",
"microsoft/Phi-3-mini-4k-instruct": "phi3",
"microsoft/Phi-3-medium-4k-instruct": "phi3",
# Phi 4
"microsoft/Phi-4-mini-instruct": "phi4",
"microsoft/phi-4": "phi4",
}
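A minimal sketch (hypothetical helper over a trimmed copy of the map, not repo code) of how this mapping might be consulted when resolving a model's chat template:

```python
# Trimmed stand-in for the MODEL_TEMPLATE_MAP above, so this runs standalone.
MODEL_TEMPLATE_MAP = {
    "Qwen/Qwen3-4B-Instruct-2507": "qwen3",
    "meta-llama/Meta-Llama-3-8B": "llama3",
}


def resolve_template_name(model_id: str) -> str:
    """Map a Hub model id to its registered template name, failing loudly."""
    try:
        return MODEL_TEMPLATE_MAP[model_id]
    except KeyError:
        raise ValueError(f"no chat template registered for model {model_id!r}")
```

Raising a descriptive `ValueError` for unmapped models keeps a typo in a model id from surfacing as a bare `KeyError` far from its cause.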
23 changes: 23 additions & 0 deletions src/core/template.py
@@ -1,5 +1,6 @@
from dataclasses import dataclass
from typing import Dict
from constant import MODEL_TEMPLATE_MAP

🛠️ Refactor suggestion

Use package-relative import to avoid import errors with src/ layout.

Direct from constant import ... will fail when imported as a package. Prefer a relative import (optionally keep a fallback for ad-hoc scripts).

-from constant import MODEL_TEMPLATE_MAP
+try:
+    from .constant import MODEL_TEMPLATE_MAP  # package-relative
+except ImportError:  # pragma: no cover
+    from constant import MODEL_TEMPLATE_MAP   # fallback for direct execution
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from constant import MODEL_TEMPLATE_MAP
try:
    from .constant import MODEL_TEMPLATE_MAP  # package-relative
except ImportError:  # pragma: no cover
    from constant import MODEL_TEMPLATE_MAP   # fallback for direct execution
🤖 Prompt for AI Agents
In src/core/template.py around line 3, the direct top-level import "from
constant import MODEL_TEMPLATE_MAP" will break when the package is imported;
change to a package-relative import and optionally provide a fallback for ad-hoc
execution: use "from .constant import MODEL_TEMPLATE_MAP" as the primary import
and wrap it in a try/except that falls back to "from constant import
MODEL_TEMPLATE_MAP" if the relative import fails, so both package usage and
standalone/script runs work.



@dataclass
@@ -67,6 +68,25 @@ def register_template(
stop_word="<|im_end|>",
)

register_template(
template_name="qwen3",
system_format="<|im_start|>system\n{content}<|im_end|>\n",
user_format="<|im_start|>user\n{content}<|im_end|>\n<|im_start|>assistant\n",
assistant_format="{content}<|im_end|>\n",
tool_format=(
"# Tools\n\n"
"You may call one or more functions to assist with the user query.\n\n"
"You are provided with function signatures within <tools></tools> XML tags:\n"
"<tools>\n{content}\n</tools>\n\n"
"For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n"
'<tool_call>\n{"name": <function-name>, "arguments": <args-json-object>}\n</tool_call>'
),
function_format="<tool_call>\n{content}\n</tool_call><|im_end|>\n",
observation_format="<|im_start|>user\n<tool_response>\n{content}\n</tool_response><|im_end|>\n<|im_start|>assistant\n",
system="You are a helpful assistant.",
stop_word="<|im_end|>",
)
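As a standalone illustration (assuming plain `str.format` substitution, which is how the `{content}` placeholders above read; not repo code), the qwen3 format strings compose into a prompt like this:

```python
# The qwen3 system/user format strings from the registration above.
system_format = "<|im_start|>system\n{content}<|im_end|>\n"
user_format = "<|im_start|>user\n{content}<|im_end|>\n<|im_start|>assistant\n"


def build_prompt(system: str, user: str) -> str:
    """Render one system message plus one user turn, ready for generation."""
    return system_format.format(content=system) + user_format.format(content=user)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to generate the assistant turn, which the `assistant_format` then closes with `<|im_end|>`.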

register_template(
template_name="yi",
system_format="<|im_start|>system\n{content}<|im_end|>\n",
@@ -182,3 +202,6 @@ def register_template(
system=None,
stop_word="<|end|>",
)

for model_name, template_name in MODEL_TEMPLATE_MAP.items():
template_dict[model_name] = template_dict[template_name]
Comment on lines +206 to +207

🛠️ Refactor suggestion

Copy and validate when aliasing templates to avoid KeyError and shared-state bugs.

  • Guard against mappings to unknown template names.
  • Create a new Template instance per alias to avoid accidental cross-mutation.
-for model_name, template_name in MODEL_TEMPLATE_MAP.items():
-    template_dict[model_name] = template_dict[template_name]
+missing = []
+for model_name, template_name in MODEL_TEMPLATE_MAP.items():
+    base = template_dict.get(template_name)
+    if base is None:
+        missing.append((model_name, template_name))
+        continue
+    # copy to avoid shared object between aliases
+    template_dict[model_name] = replace(base)
+if missing:
+    raise KeyError(f"MODEL_TEMPLATE_MAP points to unknown template(s): {missing}")

Add import (outside the changed hunk) to support replace:

-from dataclasses import dataclass
+from dataclasses import dataclass, replace
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for model_name, template_name in MODEL_TEMPLATE_MAP.items():
template_dict[model_name] = template_dict[template_name]
# (update import at top of file)
from dataclasses import dataclass, replace
Suggested change
for model_name, template_name in MODEL_TEMPLATE_MAP.items():
template_dict[model_name] = template_dict[template_name]
missing = []
for model_name, template_name in MODEL_TEMPLATE_MAP.items():
    base = template_dict.get(template_name)
    if base is None:
        missing.append((model_name, template_name))
        continue
    # copy to avoid shared object between aliases
    template_dict[model_name] = replace(base)
if missing:
    raise KeyError(f"MODEL_TEMPLATE_MAP points to unknown template(s): {missing}")
🤖 Prompt for AI Agents
In src/core/template.py around lines 206-207, the aliasing loop blindly assigns
references causing KeyError for unknown targets and shared-state bugs; change it
to first check that template_name exists in template_dict and raise or log a
clear KeyError with context if not, then assign a fresh copy for the alias
(e.g., template_dict[model_name] = deepcopy(template_dict[template_name])) or
construct a new Template instance from the source so the alias does not share
mutable state; also add the needed import at the top (from copy import deepcopy)
to support the replacement.
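The shared-state risk the reviewer describes can be demonstrated in isolation (assumed minimal Template shape, not the repo's actual dataclass):

```python
# Demo: aliasing a dataclass template by reference shares mutable state,
# while deepcopy gives each alias its own independent object.
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Template:
    stop_word: str
    extra_stop_words: list = field(default_factory=list)


template_dict = {"qwen3": Template(stop_word="<|im_end|>")}

# Shared reference: mutating the alias also mutates the base template.
template_dict["model-a"] = template_dict["qwen3"]
template_dict["model-a"].extra_stop_words.append("<eos>")

# Independent copy: later mutations of the alias leave the base untouched.
template_dict["model-b"] = deepcopy(template_dict["qwen3"])
```

After the shared-reference append, the "qwen3" base already carries `"<eos>"`; only the `deepcopy` alias can be mutated afterwards without bleeding into it.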