This repository was archived by the owner on Nov 8, 2022. It is now read-only.

bug: zipfile.BadZipFile using pretrained BIST model #217

@mastreips

Describe the bug
Model/procedure: nlp_architect/models/absa/train/train.py

Running train.py fails with zipfile.BadZipFile: File is not a zip file when SpacyBISTParser() tries to download and unzip the pretrained BIST model.

Updating spaCy to 3.0 does not help: it fails with ImportError: cannot import name 'LEMMA_EXC'. That error comes from a change between spaCy v2.1 and v2.2, which moved the large lookup tables out of the main library. The lemmatizer data is now stored in the separate package spacy-lookups-data, and the Lemmatizer is initialized with a Lookups object instead of the individual variables.
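The version boundary described above can be checked before importing anything from spaCy. The following is a minimal sketch, not part of nlp-architect: the helper names `parse_version` and `has_bundled_lemma_tables` are hypothetical, and the v2.2 cutoff is taken only from the description in this issue.

```python
import re


def parse_version(version):
    """Return the leading (major, minor) pair of a version string such as '2.1.8'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def has_bundled_lemma_tables(spacy_version):
    """Assumption from this issue: names like LEMMA_EXC are only importable
    before spaCy v2.2, when the lookup tables still lived in the main library."""
    return parse_version(spacy_version) < (2, 2)


# Show which of a few candidate versions fall before the v2.2 boundary.
for candidate in ("2.1.8", "2.2.0", "3.0.0"):
    print(candidate, has_bundled_lemma_tables(candidate))
```

A guard like this would only turn the ImportError into an explicit version complaint. It does not make code written against the pre-2.2 lemmatizer work on newer spaCy releases.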

To Reproduce
Steps to reproduce the behavior:

  1. pip_packages = ['nlp-architect','spacy==2.1.8','numpy==1.19.5']

Expected behavior

**Environment setup:**

  • OS (Linux/Mac OS): Azure AML
  • Python version: 3.6.9
  • Backend:

Additional context

Log Output

You can now load the model via spacy.load('en')
Using pre-trained BIST model.
Downloading pre-trained BIST model...
Unable to determine total file size.
Downloading file to: /root/nlp-architect/cache/bist-pretrained/bist-pretrained.zip

0MB [00:00, ?MB/s]
1MB [00:00, 579.96MB/s]
Download Complete
Unzipping...

[2021-04-05T14:57:17.529886] The experiment failed. Finalizing run...
2021-04-05 14:57:17,535 INFO Exiting context: TrackUserError
2021-04-05 14:57:17,536 INFO Exiting context: RunHistory
Cleaning up all outstanding Run operations, waiting 900.0 seconds
1 items cleaning up...
Cleanup took 0.07420921325683594 seconds
2021-04-05 14:57:30,901 INFO Exiting context: ProjectPythonPath
Traceback (most recent call last):
  File "train.py", line 46, in <module>
    max_iter=args.max_iter)
  File "/azureml-envs/azureml_d664de2764d55f1b5c7b6f4fc0a2fd6b/lib/python3.6/site-packages/nlp_architect/models/absa/train/train.py", line 49, in __init__
    self.parser = SpacyBISTParser()
  File "/azureml-envs/azureml_d664de2764d55f1b5c7b6f4fc0a2fd6b/lib/python3.6/site-packages/nlp_architect/pipelines/spacy_bist.py", line 46, in __init__
    _download_pretrained_model()
  File "/azureml-envs/azureml_d664de2764d55f1b5c7b6f4fc0a2fd6b/lib/python3.6/site-packages/nlp_architect/pipelines/spacy_bist.py", line 170, in _download_pretrained_model
    uncompress_file(zip_path, outpath=str(SpacyBISTParser.dir))
  File "/azureml-envs/azureml_d664de2764d55f1b5c7b6f4fc0a2fd6b/lib/python3.6/site-packages/nlp_architect/utils/io.py", line 85, in uncompress_file
    with zipfile.ZipFile(filepath) as z:
  File "/azureml-envs/azureml_d664de2764d55f1b5c7b6f4fc0a2fd6b/lib/python3.6/zipfile.py", line 1108, in __init__
    self._RealGetContents()
  File "/azureml-envs/azureml_d664de2764d55f1b5c7b6f4fc0a2fd6b/lib/python3.6/zipfile.py", line 1175, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
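
The log shows the download finishing after about 1MB with no known total size, so the file written to bist-pretrained.zip may not be an archive at all (for example an error page). That is a guess, not something the log confirms. A minimal sketch for checking it, using only the standard library: `describe_download` and `extract_if_valid` are hypothetical helpers, not nlp-architect functions, and the demo uses throwaway temporary files rather than the real cache path.

```python
import tempfile
import zipfile
from pathlib import Path


def describe_download(path, head_bytes=64):
    """Report the size, zip validity and first bytes of a downloaded file."""
    path = Path(path)
    with path.open("rb") as handle:
        head = handle.read(head_bytes)
    return {
        "size_bytes": path.stat().st_size,
        "is_zip": zipfile.is_zipfile(str(path)),
        "head": head,
    }


def extract_if_valid(zip_path, out_dir):
    """Extract only when the file is a real zip; otherwise raise a descriptive error."""
    info = describe_download(zip_path)
    if not info["is_zip"]:
        raise ValueError(
            "%s is not a zip archive (%d bytes, starts with %r)"
            % (zip_path, info["size_bytes"], info["head"])
        )
    with zipfile.ZipFile(str(zip_path)) as archive:
        archive.extractall(str(out_dir))


# Demo on temporary files: a fake HTML "download" and a genuine archive.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "fake.zip"
    fake.write_bytes(b"<html><body>Not Found</body></html>")
    print(describe_download(fake)["is_zip"])

    real = Path(tmp) / "real.zip"
    with zipfile.ZipFile(str(real), "w") as archive:
        archive.writestr("model.txt", "weights")
    extract_if_valid(real, Path(tmp) / "out")
    print((Path(tmp) / "out" / "model.txt").read_text())
```

Pointing `describe_download` at the cached bist-pretrained.zip and printing the `head` bytes would show whether the server returned HTML or some other non-zip payload.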

Labels: bug
