Merged
@@ -116,7 +116,7 @@ def per_token_group_quant_fp8(
     if HAS_SGL_KERNEL:
         finfo = torch.finfo(dtype)
         fp8_max, fp8_min = finfo.max, finfo.min
-        sgl_ops.sgl_per_token_group_quant_fp8(x, x_q, x_s, group_size, 1e-10, fp8_min, fp8_max)
+        sgl_ops.sgl_per_token_group_quant_fp8(x, x_q, x_s, group_size, 1e-10, fp8_min, fp8_max, False)
Contributor comment (medium):
The eps value is hardcoded as 1e-10 in this call even though the function already receives an eps parameter. Using the provided parameter keeps the function flexible and respects its contract. The trailing False argument is also a "magic value" that is hard to understand without context; please add a comment explaining its purpose, or pass it as a named argument if the sgl_ops API supports that.

sgl_ops.sgl_per_token_group_quant_fp8(x, x_q, x_s, group_size, eps, fp8_min, fp8_max, False) # eps from function param

     else:
         lightllm_per_token_group_quant_fp8(x, group_size, x_q, x_s, eps=1e-10, dtype=torch.float8_e4m3fn)

3 changes: 3 additions & 0 deletions lightllm/utils/config_utils.py
@@ -44,6 +44,9 @@ def get_model_architectures(model_path: str):
 def get_vocab_size(model_path: str):
     try:
         config_json = get_config_json(model_path)
+        if "llm_config" in config_json:
+            vocab_size = int(config_json["llm_config"]["vocab_size"])
+            return vocab_size
Comment on lines +47 to +49
Contributor comment (medium):

This new block duplicates the vocab_size extraction and casting logic that already exists on lines 50-52. The duplication makes future maintenance harder, since any change would need to be applied in two places. Consider refactoring the function to first select the correct configuration dictionary and then apply the vocab_size extraction once:

config_json = get_config_json(model_path)

# Select the right config dictionary
if "llm_config" in config_json:
    config_json = config_json["llm_config"]

# Extract vocab_size from the selected config
vocab_size = config_json["vocab_size"]
if not isinstance(vocab_size, int):
    vocab_size = int(vocab_size)
return vocab_size

         vocab_size = config_json["vocab_size"]
         if not isinstance(vocab_size, int):
             vocab_size = int(vocab_size)
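The motivation for the new branch: some checkpoints (typically multimodal models) nest the language-model settings under an `llm_config` key instead of placing them at the top level of `config.json`, so `vocab_size` must be read from the nested dict. A small self-contained sketch of the lookup, decoupled from file I/O (the helper name and the sample config dicts are illustrative):

```python
def vocab_size_from_config(config_json: dict) -> int:
    # Multimodal configs may nest the LM settings under "llm_config".
    if "llm_config" in config_json:
        config_json = config_json["llm_config"]
    # Cast defensively: some configs store numbers as strings.
    return int(config_json["vocab_size"])

# Plain LM config vs. nested multimodal-style config.
plain = {"vocab_size": 32000}
nested = {"llm_config": {"vocab_size": "151936"}, "vision_config": {}}
print(vocab_size_from_config(plain))   # 32000
print(vocab_size_from_config(nested))  # 151936
```

This mirrors the reviewer's suggested refactor: pick the right dict once, then run a single extraction path.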