During prediction batch seems to miss 'src_token_ids' key

I've tried to parse our test set of Spanish AMR with your xl-amr cross-lingual parser with this command:

 python -u -m xlamr_stog.commands.predict --archive-file C:/Users/user/Documents/AMR/SpanishAMR/xl-amr/models/xl-amr_bilingual_en_es_trans_amr --weights-file C:/Users/user/Documents/AMR/SpanishAMR/xl-amr/models/xl-amr_bilingual_en_es_trans_amr/best.th --input-file C:/Users/user/Documents/AMR/SpanishAMR/Training_t5wtense/test_es.txt.features.input_clean.recategorize --batch-size 32 --use-dataset-reader --output-file C:/Users/user/Documents/AMR/SpanishAMR/Training_t5wtense/test_output.txt --silent --beam-size 5 --predictor STOG

Unfortunately, I've got this Error:
Original exception was:
Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\commands\predict.py", line 275, in <module>
    _predict(args)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\commands\predict.py", line 227, in _predict
    manager.run()
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\commands\predict.py", line 200, in run
    for model_input_instance, result in zip(batch, self._predict_instances(batch)):
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\commands\predict.py", line 158, in _predict_instances
    results, encoder_last_state_seq = self._predictor.predict_batch_instance(batch_data)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\predictors\stog.py", line 39, in predict_batch_instance
    _outputs, encoder_last_state_seq = super(STOGPredictor, self).predict_batch_instance(instances)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\predictors\predictor.py", line 62, in predict_batch_instance
    outputs, encoder_last_state_seq = self._model.forward_on_instances(instances)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\models\model.py", line 149, in forward_on_instances
    encoder_outputs = self(model_input)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\models\stog.py", line 350, in forward
    encoder_outputs = self.encode(
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\models\stog.py", line 437, in encode
    bert_mask = bert_tokens.ne(0)
AttributeError: 'NoneType' object has no attribute 'ne'

It seems, that the bert_tokens are of type None, because the Batch tensor dicitonary is missing the "src_token_ids" key:
def prepare_batch_input(self, batch):
        # [batch, num_tokens]
        **bert_token_inputs = batch.get('src_token_ids', None)**
        if bert_token_inputs is not None:
            bert_token_inputs = bert_token_inputs.long()
        encoder_token_subword_index = batch.get('src_token_subword_index', None)
        if encoder_token_subword_index is not None:
            encoder_token_subword_index = encoder_token_subword_index.long()
        encoder_token_inputs = batch['src_tokens']['encoder_tokens']
        encoder_pos_tags = batch['src_pos_tags']
        encoder_must_copy_tags = batch['src_must_copy_tags']
        # [batch, num_tokens, num_chars]
        encoder_char_inputs = batch['src_tokens']['encoder_characters']
        # [batch, num_tokens]
        encoder_mask = get_text_field_mask(batch['src_tokens'])

How could I fix that?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

During prediction batch seems to miss 'src_token_ids' key #1

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

During prediction batch seems to miss 'src_token_ids' key #1

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions