During training, the LlamaBiModel class is used, which overrides the _update_causal_mask method of LlamaModel.
However, I noticed that the public model on HuggingFace is loaded for inference with the LlamaEncoderModel class from modeling_llama_encoder.py, which instead overrides the forward method of LlamaModel.
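For context, here is a minimal sketch of how I currently picture the two approaches. The class names and bodies below are my own simplification for illustration, not the actual code from the repo, and the 4D-mask passthrough in the second class is the part I'm least sure about:

```python
# My own simplified sketch, NOT the actual LlamaBiModel / LlamaEncoderModel code.
import torch
from transformers.models.llama.modeling_llama import LlamaModel


class SketchBiLlama(LlamaModel):
    """Roughly how I read the training-side approach: suppress the causal mask
    at the point where it is built."""

    def _update_causal_mask(self, attention_mask, input_tensor, *args, **kwargs):
        # Returning None (ignoring padding for simplicity) means the decoder
        # layers receive no causal mask, i.e. attention becomes bidirectional.
        return None


class SketchEncoderLlama(LlamaModel):
    """Roughly how I imagine the inference-side approach: intervene in forward
    and hand the model a ready-made non-causal 4D mask."""

    def forward(self, input_ids=None, attention_mask=None, **kwargs):
        if attention_mask is not None and attention_mask.dim() == 2:
            batch, seq_len = attention_mask.shape
            dtype = self.embed_tokens.weight.dtype
            min_val = torch.finfo(dtype).min
            # Build an additive (batch, 1, seq, seq) mask that only masks padding
            # and has no causal component; recent transformers versions use an
            # already-4D mask as-is instead of adding causal masking on top.
            pad = attention_mask[:, None, None, :].to(dtype)   # (batch, 1, 1, seq)
            additive = (1.0 - pad) * min_val
            attention_mask = additive.expand(batch, 1, seq_len, seq_len)
        return super().forward(input_ids=input_ids, attention_mask=attention_mask, **kwargs)
```

If my reading is right, both variants end up removing the causal mask, just at different points in the call stack.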
Why is there a difference between the training and inference code paths? Are the two classes functionally equivalent? I'm a bit confused.