Problem statement
Hi,
I have been using the Flair library for three years now and have trained several different model types on my custom datasets (RelationClassifier, RelationExtractor, SequenceTagger, TextClassifier, MultitaskModel). This year I performed layer-wise probing on all of them (following the early-exit research at https://arxiv.org/abs/2004.12993 and https://aclanthology.org/2021.eacl-main.8/) and realised that most of the models do not need all 12 encoder layers (or 28 for ModernBERT) to match the performance of the fully trained 12-layer model. As an optimisation, I have therefore been retraining all models with the embeddings' (TransformerDocumentEmbeddings or TransformerWordEmbeddings) "layers" parameter set to a value such as "-4" or "-6", so that the embeddings come from one of the lower layers instead of the last layer (the default "-1").
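To make the indexing concrete, here is a small hypothetical helper (not part of Flair) illustrating my understanding of the "layers" convention: the value indexes into the transformer's hidden_states, where index 0 is the embedding output and index i (1-based) is the output of encoder layer i, so hidden_states has num_layers + 1 entries.

```python
def resolved_layer(layers_value: str, num_encoder_layers: int) -> int:
    """Map a Flair-style "layers" value (e.g. "-4") to the 1-based
    encoder layer whose output is used. Index 0 would be the
    embedding output, before any encoder layer."""
    index = int(layers_value)
    if index < 0:
        # hidden_states has num_encoder_layers + 1 entries
        index += num_encoder_layers + 1
    return index

# On a 12-layer model, "-1" resolves to layer 12 (the last one),
# while "-4" resolves to layer 9, so layers 10-12 are never needed.
```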
Nevertheless, it seems that Flair's embedding objects still compute all transformer layers during inference, even when the "layers" parameter points to a lower layer. So my question is: is this intentional, and if so, why? I see no reason for layers that are never used (or even trained) to remain in the transformer and waste computation, so this looks like a bug to me.
To work around this, after training a model I had to cut the unused layers out of the checkpoint's state_dict and update the embeddings config accordingly.
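A minimal sketch of that pruning step, assuming HuggingFace-style state_dict keys of the form "encoder.layer.&lt;i&gt;.&lt;param&gt;" (exact key names vary by architecture, and the helper name is my own):

```python
def prune_state_dict(state_dict, num_encoder_layers, layer_index):
    """Drop weights of encoder layers above the one whose embeddings
    are used. E.g. layers="-4" on a 12-layer model means the output of
    encoder layer 9 (embeddings are hidden_states[0]), so the first
    num_encoder_layers + layer_index + 1 = 9 layers are kept."""
    keep = num_encoder_layers + layer_index + 1
    pruned = {}
    for key, tensor in state_dict.items():
        parts = key.split(".")
        if len(parts) > 2 and parts[0] == "encoder" and parts[1] == "layer":
            if int(parts[2]) >= keep:
                continue  # discard weights of unused upper layers
        pruned[key] = tensor
    return pruned
```

After pruning, the config (e.g. num_hidden_layers) has to be updated to match, otherwise loading the checkpoint fails with missing keys.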
Solution
If this is indeed a bug, I propose fixing it by removing the unused layers inside the embeddings object once the "layers" parameter has determined which encoder layer the embeddings are taken from. If the current behaviour is intentional, could you please explain why? Thanks!
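One possible shape for such a fix, sketched under the assumption of a BERT-like module layout where the encoder layers live in a ModuleList at model.encoder.layer (the attribute path differs per architecture, and this helper is hypothetical, not existing Flair API):

```python
def truncate_encoder(hf_model, keep_layers):
    """Drop encoder layers above keep_layers in place, so they are
    neither stored nor computed. Assumes a BERT-like layout; other
    architectures expose their layer stack under different names."""
    hf_model.encoder.layer = hf_model.encoder.layer[:keep_layers]
    # Keep the config consistent so save/load round-trips cleanly.
    hf_model.config.num_hidden_layers = keep_layers
    return hf_model
```

Slicing an nn.ModuleList returns a ModuleList, so the forward pass simply iterates over fewer layers afterwards; the trimmed layers also disappear from the state_dict automatically.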
Additional Context
No response