Does peft provide the position of the model input? #1570
Replies: 1 comment
-
On the question of why ChatGLM uses Rotary Position Embedding (RoPE) for positional encoding: since RoPE is a form of relative position embedding, as long as tokens are fed into the model in a continuous manner (i.e., positions follow a natural order such as 0, 1, 2, ..., n), the model can compute positional information internally from the RoPE formulation, so there is no need to explicitly supply position IDs.

Regarding how the positions are generated, the modeling code simply builds them from the shape of the input:

```python
def get_position_ids(self, input_ids, device):
    batch_size, seq_length = input_ids.shape
    position_ids = torch.arange(seq_length, dtype=torch.long, device=device).unsqueeze(0).repeat(batch_size, 1)
    return position_ids
```

In summary: with RoPE, explicit position IDs are optional as long as token positions are consistent; the model can infer them internally.
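To make the relative-position claim concrete, here is a minimal RoPE sketch (my own illustration, not ChatGLM's actual attention code): it rotates a query and a key by their absolute positions and checks that their dot product depends only on the offset between them.

```python
import torch

def rope_rotate(x, pos, base=10000.0):
    # Standard RoPE: rotate consecutive dimension pairs by angles that
    # scale with the absolute position `pos` and a per-pair frequency.
    d = x.shape[-1]
    inv_freq = 1.0 / (base ** (torch.arange(0, d, 2).float() / d))
    angles = pos * inv_freq                      # (d/2,)
    cos, sin = torch.cos(angles), torch.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    rotated = torch.empty_like(x)
    rotated[..., 0::2] = x1 * cos - x2 * sin
    rotated[..., 1::2] = x1 * sin + x2 * cos
    return rotated

torch.manual_seed(0)
q, k = torch.randn(8), torch.randn(8)

# Positions (5, 2) and (105, 102) both differ by 3, so the scores match.
score_a = rope_rotate(q, 5.0) @ rope_rotate(k, 2.0)
score_b = rope_rotate(q, 105.0) @ rope_rotate(k, 102.0)
print(torch.allclose(score_a, score_b, atol=1e-5))  # True: only the offset matters
```

Because the attention score only sees the offset, shifting every position by the same amount changes nothing, which is why feeding naturally ordered positions (0, 1, 2, ...) is all the model needs.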
-
I went to the official repository of the ChatGLM fine-tuning code and asked why position_ids were not provided to the model. Their reply was that peft provides the positions, so they no longer need to be passed. Then I saw the following code in peft's source.
It seems that it processes the position, but after reading the code I still don't understand how it generates the positions and provides them to the model.
Help, thank you.
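For context, here is a hedged sketch of what a PEFT fine-tuning call can look like when no position_ids are passed at all (the checkpoint name THUDM/chatglm2-6b and the LoRA settings below are illustrative assumptions, not taken from the ChatGLM fine-tuning repo): the base model's forward builds the positions itself when it receives None.

```python
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

# Illustrative names and settings; adjust to match your actual fine-tuning setup.
model_name = "THUDM/chatglm2-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
base_model = AutoModel.from_pretrained(model_name, trust_remote_code=True)

peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # ChatGLM's fused attention projection (assumed name)
)
model = get_peft_model(base_model, peft_config)

inputs = tokenizer("你好", return_tensors="pt")
# No position_ids are passed here: when ChatGLM's forward receives
# position_ids=None, it calls get_position_ids(input_ids, ...) and fills
# in 0, 1, 2, ... itself, so neither the training script nor peft has to
# construct them explicitly.
outputs = model(input_ids=inputs["input_ids"])
print(outputs.logits.shape)
```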