add check total_max_length for generate_func #9338
Conversation
Thanks for your contribution!
    total_max_length = None
    names = [
        "total_max_length",
        "max_seq_len",
        "max_position_embeddings",
        "max_sequence_length",
        "seq_length",
    ]
    for name in names:
        total_max_length = self.config.get(name, None)
        if total_max_length is not None:
            break
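The diff above implements a first-match lookup over several possible config keys. A minimal standalone sketch of that pattern, using a plain dict in place of the model's `self.config` (an assumption for illustration; the function name is hypothetical):

```python
# Hedged sketch of the first-match config lookup from the diff.
# `config` is a plain dict standing in for the model config object.
def resolve_total_max_length(config: dict):
    names = [
        "total_max_length",
        "max_seq_len",
        "max_position_embeddings",
        "max_sequence_length",
        "seq_length",
    ]
    for name in names:
        value = config.get(name, None)
        if value is not None:
            return value  # the first key present in the config wins
    return None  # no length-related key found

print(resolve_total_max_length({"max_position_embeddings": 8192}))  # → 8192
print(resolve_total_max_length({}))  # → None
```

The key order matters: an explicit `total_max_length` takes precedence over model-architecture fields such as `max_position_embeddings`.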
Could this block simply call the llm_utils.get_model_max_position_embedding function directly?
To fix the lint issues, install pre-commit and format the code; the steps are:

# Install
pip install pre-commit
# Register pre-commit in the project folder; the code will be formatted on every commit
pre-commit install
# Run it separately on existing code files
pre-commit run --file XXXX.py
Please prioritize fixing the unit test failures. Failing unit tests block overall project development, and this PR cannot be merged until they are resolved.
Many thanks for pointing out the unit tests; they helped me better reproduce the problem I had encountered. The issue first appeared with the chatglm_v2 model: when input_len + max_length > max_sequence_length, the input cannot be truncated to max_sequence_length, so once the generated length exceeds max_sequence_length the error below occurs (error output omitted in the original). Some time after the error appears, a CUDA error follows. To reproduce it, add max_sequence_length=self.seq_length in chatglm_v2's get_config function instead of using the default 2048, and set the sequence_length variable in _get_input_ids_and_config of the ChatGLMv2Test class directly to the maximum value (screenshot omitted in the original).

I suspect this is caused by insufficient test coverage for chatglm_v2. I further traced it to the CoreAttention class in the model file, but when the error surfaces depends on when query_layer and key_layer are operated on, or even on operations on earlier related tensors; printing those tensors is enough to trigger it.

The main cause of the unit test failure is that for the chatglm model this out-of-bounds access does not raise an error, so the results of the generate function and the sample function diverge: the truncated length is shorter than in the untruncated sample case. For other models, the handling of sequence_length in the unit test functions is also somewhat inconsistent.
This Pull Request is stale because it has been open for 60 days with no activity.
Automatically closed by Paddle-bot. |



PR types
Others
PR changes
Others
Description
This PR adds a check on total_max_length in the generate function, fixing the error that can occur when total_max_length < input_len + max_new_tokens:

Error: ../paddle/phi/kernels/funcs/gather.cu.h:60 Assertion `index_value >= 0 && index_value < input_dims[j]` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be less than [8192] and greater than or equal to 0, but received [8192].