About training setting of Llama3-8B #33

@HXuan-Wang

Description

Thank you for open-sourcing such interesting and useful work. While reproducing the quantization results for the Llama models, I found that the Llama-2 results are reproducible, but I cannot reproduce the Llama-3 ones. I tried training Llama3-8B with the Llama-2 training settings and got the following results:
(main_block_ap.py 39): INFO wikitext2 perplexity: 17.49
(main_block_ap.py 39): INFO c4 perplexity: 20.56
(main_block_ap.py 58): INFO Average Acc: 52.33%
These results are far from the ones reported in the paper, but you did not provide the training configuration for Llama-3. Could you share it?
