
`TransformerModel` currently doesn't support loading directly from a PyTorch checkpoint, because this functionality is not supported by the HuggingFace `transformers.AutoModel` class.
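For context, `AutoModel.from_pretrained()` only accepts a model identifier from the Hugging Face Hub or a local directory previously written with `save_pretrained()`, which bundles the weights together with the model config; a bare PyTorch state dict doesn't carry enough information on its own. A minimal sketch of the two supported paths (the local directory name is just an example):

from transformers import AutoModel

# Supported: a model identifier on the Hugging Face Hub
model = AutoModel.from_pretrained('xlm-roberta-base')

# Also supported: a local directory written with save_pretrained(),
# which contains both the weights and the model config
model.save_pretrained('./my-xlm-roberta')
model = AutoModel.from_pretrained('./my-xlm-roberta')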

One alternative would be to initialize the `TransformerModel` with the same architecture as your GPT model, deserialize the weights from the PyTorch checkpoint, and manually load them into the internal HuggingFace transformer model:

# Pseudo-code
import torch

# Initialize a TransformerModel with the same architecture as the checkpoint
model = TransformerModel(name='xlm-roberta-base', ...)

# Deserialize the checkpoint and load its weights into the internal
# HuggingFace transformer, which is the model's first layer
checkpoint = torch.load(PATH_TO_CHECKPOINT)
hf_transformer = model.layers[0]
hf_transformer.load_state_dict(checkpoint['model_state_dict'])
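If the parameter names in the checkpoint don't line up exactly with the HuggingFace model (for instance because the checkpoint was saved from a wrapping module), a non-strict load makes the mismatch visible instead of raising an error. A small sketch, reusing `hf_transformer` and `checkpoint` from above:

# strict=False returns the keys that didn't match, which helps diagnose
# naming differences between the checkpoint and the model
result = hf_transformer.load_state_dict(
    checkpoint['model_state_dict'], strict=False
)
print('Missing keys:', result.missing_keys)
print('Unexpected keys:', result.unexpected_keys)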
