@ajdapretnar
Collaborator

Issue

Implements #596.

Description of changes

Add spaCy, first as a POS tagger, because that is what is most sorely missed.

Later: implement spaCy for NER (also sorely missed) and for other NLP tasks (a standalone spaCy preprocessor).

Includes
  • Code changes
  • Tests
  • Documentation

@ajdapretnar
Collaborator Author

The only thing left to discuss is the problem of additional dependencies required by certain models (Chinese, Japanese, Russian and Ukrainian). Should we remove those models or handle them gracefully somehow?

@VesnaT
Contributor

VesnaT commented Jul 19, 2024

I get this if the model is not installed.
image

def __getitem__(self, language: str) -> str:
    model = find_model(language)
    if model not in self.installed_models:
        download(model)
Contributor

self.installed_models should be updated at this point. If it is not, the package is downloaded again on every lookup.
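
A minimal sketch of what that could look like, not the actual implementation in this PR. The `ModelRegistry` class, its constructor arguments, and the `"{lang}_model"` naming are hypothetical stand-ins so the example runs without spaCy; only `__getitem__`, `find_model`, `download` and `installed_models` come from the snippet above.

```python
from typing import Callable, List, Set


class ModelRegistry:
    """Hypothetical stand-in for the class under review."""

    def __init__(
        self,
        find_model: Callable[[str], str],
        download: Callable[[str], None],
        installed_models: Set[str],
    ) -> None:
        self._find_model = find_model
        self._download = download
        self.installed_models = set(installed_models)

    def __getitem__(self, language: str) -> str:
        model = self._find_model(language)
        if model not in self.installed_models:
            self._download(model)
            # Record the model so later lookups skip the download.
            self.installed_models.add(model)
        return model


# Stub download that only records which models were requested.
calls: List[str] = []
registry = ModelRegistry(
    find_model=lambda lang: f"{lang}_model",  # placeholder naming, not spaCy's
    download=calls.append,
    installed_models=set(),
)
registry["en"]
registry["en"]  # second lookup should not trigger another download
```

With the `add` call in place, `calls` holds a single entry after both lookups. An alternative would be to re-query the installed packages after each download instead of tracking them by hand.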

@ajdapretnar
Copy link
Collaborator Author

So, the downloaded models are indeed packages. We have to warn the user that selecting a given language will install additional dependencies into the Orange environment (the wording still needs some thought).
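
One possible shape for such a warning, sketched under assumptions: `ensure_model`, the `confirm` callback, and the message text are all hypothetical, and the wording is only a placeholder. The idea is simply to ask before anything is installed and to skip the download if the user declines.

```python
from typing import Callable, Set


def ensure_model(
    model: str,
    installed: Set[str],
    download: Callable[[str], None],
    confirm: Callable[[str], bool],
) -> bool:
    """Return True if `model` is available after the call."""
    if model in installed:
        return True
    # Placeholder wording; the final text is still to be decided.
    message = (
        f"Selecting this language will install the package '{model}' "
        "into the Orange environment. Continue?"
    )
    if not confirm(message):
        return False  # user declined, nothing is installed
    download(model)
    installed.add(model)
    return True
```

In a widget, `confirm` could be backed by a dialog; here it is a plain callable so the logic can be exercised without a GUI.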

@ajdapretnar marked this pull request as draft on August 29, 2024, 13:11