
ValueError("[E203] If the tok2vec embedding layer is not updated during training, make sure to include it in 'annotating components'") #12821


Description

@NeoKun004

After the second batch of my data, I'm getting this error:
Aborting and saving the final best model. Encountered exception:
ValueError("[E203] If the tok2vec embedding layer is not updated during
training, make sure to include it in 'annotating components'")
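For reference, "annotating components" refers to the `training.annotating_components` setting in the spaCy v3 config. Below is a minimal sketch of where that setting sits; the config string is an assumed example (a frozen tok2vec feeding an ner listener), not the actual config used in this run.

```python
# Minimal sketch (assumed config, not the reporter's config.cfg) of the setting
# the E203 message points at: training.annotating_components.
from thinc.api import Config

cfg = Config().from_str("""
[nlp]
lang = "en"
pipeline = ["tok2vec","ner"]

[training]
# Components listed here run prediction over the training docs, so an 'ner'
# listener can still read tok2vec features even when 'tok2vec' itself is
# frozen / not updated by the optimizer.
annotating_components = ["tok2vec"]
frozen_components = ["tok2vec"]
""")

print(cfg["training"]["annotating_components"])  # ['tok2vec']
```

Whether that is actually the right fix here depends on why tok2vec stops being updated between the first and second batch of the loop.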

My output:
Error: /content/model - No such file or directory.
index batch : 0
epoch : 0
epoch : 1
epoch : 2
epoch : 3
epoch : 4
epoch : 5
epoch : 6
epoch : 7
epoch : 8
epoch : 9
f_score : 0.6586826347305389
✔ Created output directory: /content/model/output
ℹ Saving to output directory: /content/model/output
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0

=========================== Initializing pipeline ===========================
[2023-07-13 09:28:30,226] [WARNING] [W112] The model specified to use for initial vectors (en_core_web_sm) has no vectors. This is almost certainly a mistake.
WARNING:spacy:[W112] The model specified to use for initial vectors (en_core_web_sm) has no vectors. This is almost certainly a mistake.
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.001
E # LOSS TOK2VEC LOSS NER ENTS_F ENTS_P ENTS_R SCORE


0 0 0.00 69.53 10.17 7.89 14.29 0.10
16 50 53.74 2167.21 54.55 75.00 42.86 0.55
33 100 10.56 192.28 65.00 68.42 61.90 0.65
50 150 0.00 0.01 70.00 73.68 66.67 0.70
✔ Saved pipeline to output directory
/content/model/output/model-last
index_selected_k_most_informative_documents : {202: 0.1586983323890392, 88: 0.18991682159109857, 84: 0.19565444310182115, 172: 0.20040347767343344, 78: 0.2049782126110353, 236: 0.20923753382027174, 111: 0.21172295024438473, 54: 0.21987526527066636, 114: 0.22812226993688478, 212: 0.2429483873771859, 165: 0.25431208587777293, 55: 0.25711627274050625, 186: 0.2578904935852005, 118: 0.2733757439610342, 83: 0.27877870089441625, 228: 0.2866323902366378, 124: 0.2882972092524785, 218: 0.2888313589428746, 196: 0.2892895183606026, 160: 0.2929971692417957, 36: 0.3039209393249116, 45: 0.317380786937083, 204: 0.3210188284673752, 149: 0.32142007621087004, 32: 0.32399186786160844}
length predicted docs : 25
most informative docs : 25
update unlabeled docs :
before update : 221
index_selected_k_most_informative_documents : [202, 88, 84, 172, 78, 236, 111, 54, 114, 212, 165, 55, 186, 118, 83, 228, 124, 218, 196, 160, 36, 45, 204, 149, 32]
length index_selected_k_most_informative_documents : 25
length index_selected_k_most_informative_documents : 25
after update : 196
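
For context, here is a hypothetical reconstruction of what the selection log above appears to do; the function names, the "lower score = more informative" assumption, and the pool update are guesses based on the printed numbers (25 selected, 221 → 196 unlabeled), not the actual code.

```python
# Hypothetical sketch of the active-learning selection step suggested by the log.
def select_k_most_informative(scores: dict[int, float], k: int = 25) -> list[int]:
    # scores: {doc_index: informativeness score}; assumed lower = more informative
    return [idx for idx, _ in sorted(scores.items(), key=lambda kv: kv[1])[:k]]

def update_unlabeled_pool(unlabeled: list[int], selected: list[int]) -> list[int]:
    chosen = set(selected)
    # 221 docs before, 25 removed, 196 left afterwards (matches the log)
    return [idx for idx in unlabeled if idx not in chosen]
```
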
index batch : 1
epoch : 0
epoch : 1
epoch : 2
epoch : 3
epoch : 4
epoch : 5
epoch : 6
epoch : 7
epoch : 8
epoch : 9
f_score : 0.6815415821501014
ℹ Saving to output directory: /content/model/output
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0

=========================== Initializing pipeline ===========================
[2023-07-13 09:28:55,197] [WARNING] [W112] The model specified to use for initial vectors (en_core_web_sm) has no vectors. This is almost certainly a mistake.
WARNING:spacy:[W112] The model specified to use for initial vectors (en_core_web_sm) has no vectors. This is almost certainly a mistake.
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.001
E # LOSS TOK2VEC LOSS NER ENTS_F ENTS_P ENTS_R SCORE


ERROR:__main__:Failed to execute task: [E203] If the tok2vec embedding layer is not updated during training, make sure to include it in 'annotating components'
Traceback (most recent call last):
File "", line 179, in handler
model_path = train_model_spacy(training_data_path=output_moodel_path,
File "", line 77, in train_model_spacy
train(config_path, output_path=model_path, overrides={"paths.train": training_data_path+"/training.spacy", "paths.dev": training_data_path+"/testing.spacy"})
File "/usr/local/lib/python3.10/dist-packages/spacy/cli/train.py", line 75, in train
train_nlp(nlp, output_path, use_gpu=use_gpu, stdout=sys.stdout, stderr=sys.stderr)
File "/usr/local/lib/python3.10/dist-packages/spacy/training/loop.py", line 124, in train
raise e
File "/usr/local/lib/python3.10/dist-packages/spacy/training/loop.py", line 107, in train
for batch, info, is_best_checkpoint in training_step_iterator:
File "/usr/local/lib/python3.10/dist-packages/spacy/training/loop.py", line 209, in train_while_improving
nlp.update(
File "/usr/local/lib/python3.10/dist-packages/spacy/language.py", line 1163, in update
proc.update(examples, sgd=None, losses=losses, **component_cfg[name]) # type: ignore
File "spacy/pipeline/transition_parser.pyx", line 405, in spacy.pipeline.transition_parser.Parser.update
File "/usr/local/lib/python3.10/dist-packages/thinc/model.py", line 309, in begin_update
return self._func(self, X, is_train=True)
File "/usr/local/lib/python3.10/dist-packages/spacy/ml/tb_framework.py", line 33, in forward
step_model = ParserStepModel(
File "spacy/ml/parser_model.pyx", line 213, in spacy.ml.parser_model.ParserStepModel.init
File "/usr/local/lib/python3.10/dist-packages/thinc/model.py", line 291, in call
return self._func(self, X, is_train=is_train)
File "/usr/local/lib/python3.10/dist-packages/thinc/layers/chain.py", line 55, in forward
Y, inc_layer_grad = layer(X, is_train=is_train)
File "/usr/local/lib/python3.10/dist-packages/thinc/model.py", line 291, in call
return self._func(self, X, is_train=is_train)
File "/usr/local/lib/python3.10/dist-packages/spacy/pipeline/tok2vec.py", line 292, in forward
raise ValueError(Errors.E203.format(name="tok2vec"))
ValueError: [E203] If the tok2vec embedding layer is not updated during training, make sure to include it in 'annotating components'
⚠ Aborting and saving the final best model. Encountered exception:
ValueError("[E203] If the tok2vec embedding layer is not updated during
training, make sure to include it in 'annotating components'")
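
The traceback shows the config being loaded through `spacy.cli.train.train` with an `overrides` dict, so one place the E203 setting could be injected is that same dict. A hedged sketch follows; the config path and data paths are placeholders standing in for `config_path` and `training_data_path` from the code in the traceback.

```python
from spacy.cli.train import train

train(
    "config.cfg",                        # placeholder for config_path
    output_path="/content/model/output",
    overrides={
        "paths.train": "/content/model/training.spacy",  # placeholder paths
        "paths.dev": "/content/model/testing.spacy",
        # The setting E203 asks for: keep annotating the docs with the
        # (non-updated) tok2vec so the ner listener still gets its features.
        "training.annotating_components": ["tok2vec"],
    },
)
```

On the command line the equivalent should be `--training.annotating_components '["tok2vec"]'`, since `spacy train` parses override values as JSON.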
