
ValueError("[E203] If the tok2vec embedding layer is not updated during training, make sure to include it in 'annotating components'") #12821


Description

@NeoKun004

After the second batch of my data, I'm getting this error:
Aborting and saving the final best model. Encountered exception:
ValueError("[E203] If the tok2vec embedding layer is not updated during
training, make sure to include it in 'annotating components'")
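For reference, "annotating components" refers to the `training.annotating_components` setting in the spaCy v3 config. Below is a minimal sketch of where that setting sits; the config string is an assumed example (a frozen tok2vec feeding an ner listener), not the actual config used in this run.

```python
# Minimal sketch (assumed config, not the reporter's config.cfg) of the setting
# the E203 message points at: training.annotating_components.
from thinc.api import Config

cfg = Config().from_str("""
[nlp]
lang = "en"
pipeline = ["tok2vec","ner"]

[training]
# Components listed here run prediction over the training docs, so an 'ner'
# listener can still read tok2vec features even when 'tok2vec' itself is
# frozen / not updated by the optimizer.
annotating_components = ["tok2vec"]
frozen_components = ["tok2vec"]
""")

print(cfg["training"]["annotating_components"])  # ['tok2vec']
```

Whether that is actually the right fix here depends on why tok2vec stops being updated between the first and second batch of the loop.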

My output:
Error: /content/model - No such file or directory.
index batch : 0
epoch : 0
epoch : 1
epoch : 2
epoch : 3
epoch : 4
epoch : 5
epoch : 6
epoch : 7
epoch : 8
epoch : 9
f_score : 0.6586826347305389
✔ Created output directory: /content/model/output
ℹ Saving to output directory: /content/model/output
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0

=========================== Initializing pipeline ===========================
[2023-07-13 09:28:30,226] [WARNING] [W112] The model specified to use for initial vectors (en_core_web_sm) has no vectors. This is almost certainly a mistake.
WARNING:spacy:[W112] The model specified to use for initial vectors (en_core_web_sm) has no vectors. This is almost certainly a mistake.
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.001
E # LOSS TOK2VEC LOSS NER ENTS_F ENTS_P ENTS_R SCORE


0 0 0.00 69.53 10.17 7.89 14.29 0.10
16 50 53.74 2167.21 54.55 75.00 42.86 0.55
33 100 10.56 192.28 65.00 68.42 61.90 0.65
50 150 0.00 0.01 70.00 73.68 66.67 0.70
✔ Saved pipeline to output directory
/content/model/output/model-last
index_selected_k_most_informative_documents : {202: 0.1586983323890392, 88: 0.18991682159109857, 84: 0.19565444310182115, 172: 0.20040347767343344, 78: 0.2049782126110353, 236: 0.20923753382027174, 111: 0.21172295024438473, 54: 0.21987526527066636, 114: 0.22812226993688478, 212: 0.2429483873771859, 165: 0.25431208587777293, 55: 0.25711627274050625, 186: 0.2578904935852005, 118: 0.2733757439610342, 83: 0.27877870089441625, 228: 0.2866323902366378, 124: 0.2882972092524785, 218: 0.2888313589428746, 196: 0.2892895183606026, 160: 0.2929971692417957, 36: 0.3039209393249116, 45: 0.317380786937083, 204: 0.3210188284673752, 149: 0.32142007621087004, 32: 0.32399186786160844}
length predicted docs : 25
most informative docs : 25
update unlabeled docs :
before update : 221
index_selected_k_most_informative_documents : [202, 88, 84, 172, 78, 236, 111, 54, 114, 212, 165, 55, 186, 118, 83, 228, 124, 218, 196, 160, 36, 45, 204, 149, 32]
length index_selected_k_most_informative_documents : 25
length index_selected_k_most_informative_documents : 25
after update : 196
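
For context, here is a hypothetical reconstruction of what the selection log above appears to do; the function names, the "lower score = more informative" assumption, and the pool update are guesses based on the printed numbers (25 selected, 221 → 196 unlabeled), not the actual code.

```python
# Hypothetical sketch of the active-learning selection step suggested by the log.
def select_k_most_informative(scores: dict[int, float], k: int = 25) -> list[int]:
    # scores: {doc_index: informativeness score}; assumed lower = more informative
    return [idx for idx, _ in sorted(scores.items(), key=lambda kv: kv[1])[:k]]

def update_unlabeled_pool(unlabeled: list[int], selected: list[int]) -> list[int]:
    chosen = set(selected)
    # 221 docs before, 25 removed, 196 left afterwards (matches the log)
    return [idx for idx in unlabeled if idx not in chosen]
```
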
index batch : 1
epoch : 0
epoch : 1
epoch : 2
epoch : 3
epoch : 4
epoch : 5
epoch : 6
epoch : 7
epoch : 8
epoch : 9
f_score : 0.6815415821501014
ℹ Saving to output directory: /content/model/output
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0

=========================== Initializing pipeline ===========================
[2023-07-13 09:28:55,197] [WARNING] [W112] The model specified to use for initial vectors (en_core_web_sm) has no vectors. This is almost certainly a mistake.
WARNING:spacy:[W112] The model specified to use for initial vectors (en_core_web_sm) has no vectors. This is almost certainly a mistake.
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.001
E # LOSS TOK2VEC LOSS NER ENTS_F ENTS_P ENTS_R SCORE


ERROR:__main__:Failed to execute task: [E203] If the tok2vec embedding layer is not updated during training, make sure to include it in 'annotating components'
Traceback (most recent call last):
File "", line 179, in handler
model_path = train_model_spacy(training_data_path=output_moodel_path,
File "", line 77, in train_model_spacy
train(config_path, output_path=model_path, overrides={"paths.train": training_data_path+"/training.spacy", "paths.dev": training_data_path+"/testing.spacy"})
File "/usr/local/lib/python3.10/dist-packages/spacy/cli/train.py", line 75, in train
train_nlp(nlp, output_path, use_gpu=use_gpu, stdout=sys.stdout, stderr=sys.stderr)
File "/usr/local/lib/python3.10/dist-packages/spacy/training/loop.py", line 124, in train
raise e
File "/usr/local/lib/python3.10/dist-packages/spacy/training/loop.py", line 107, in train
for batch, info, is_best_checkpoint in training_step_iterator:
File "/usr/local/lib/python3.10/dist-packages/spacy/training/loop.py", line 209, in train_while_improving
nlp.update(
File "/usr/local/lib/python3.10/dist-packages/spacy/language.py", line 1163, in update
proc.update(examples, sgd=None, losses=losses, **component_cfg[name]) # type: ignore
File "spacy/pipeline/transition_parser.pyx", line 405, in spacy.pipeline.transition_parser.Parser.update
File "/usr/local/lib/python3.10/dist-packages/thinc/model.py", line 309, in begin_update
return self._func(self, X, is_train=True)
File "/usr/local/lib/python3.10/dist-packages/spacy/ml/tb_framework.py", line 33, in forward
step_model = ParserStepModel(
File "spacy/ml/parser_model.pyx", line 213, in spacy.ml.parser_model.ParserStepModel.init
File "/usr/local/lib/python3.10/dist-packages/thinc/model.py", line 291, in call
return self._func(self, X, is_train=is_train)
File "/usr/local/lib/python3.10/dist-packages/thinc/layers/chain.py", line 55, in forward
Y, inc_layer_grad = layer(X, is_train=is_train)
File "/usr/local/lib/python3.10/dist-packages/thinc/model.py", line 291, in call
return self._func(self, X, is_train=is_train)
File "/usr/local/lib/python3.10/dist-packages/spacy/pipeline/tok2vec.py", line 292, in forward
raise ValueError(Errors.E203.format(name="tok2vec"))
ValueError: [E203] If the tok2vec embedding layer is not updated during training, make sure to include it in 'annotating components'
⚠ Aborting and saving the final best model. Encountered exception:
ValueError("[E203] If the tok2vec embedding layer is not updated during
training, make sure to include it in 'annotating components'")
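
The traceback shows the config being loaded through `spacy.cli.train.train` with an `overrides` dict, so one place the E203 setting could be injected is that same dict. A hedged sketch follows; the config path and data paths are placeholders standing in for `config_path` and `training_data_path` from the code in the traceback.

```python
from spacy.cli.train import train

train(
    "config.cfg",                        # placeholder for config_path
    output_path="/content/model/output",
    overrides={
        "paths.train": "/content/model/training.spacy",  # placeholder paths
        "paths.dev": "/content/model/testing.spacy",
        # The setting E203 asks for: keep annotating the docs with the
        # (non-updated) tok2vec so the ner listener still gets its features.
        "training.annotating_components": ["tok2vec"],
    },
)
```

On the command line the equivalent should be `--training.annotating_components '["tok2vec"]'`, since `spacy train` parses override values as JSON.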
