Remove Gensim #1127

@ajdapretnar

Is your feature request related to a problem? Please describe.
Gensim apparently causes quite a few issues during installation and in some widgets.
It is used in:

  1. Corpus and Preprocessing: gensim.corpora.Dictionary
  2. BOW: gensim.models.TfidfModel
  3. in util: Sparse2CorpusSliceable
  4. Topic Modeling: GensimWrapper
  5. Topic Modeling: CoherenceModel

Describe the solution you'd like

  1. Write our own Dictionary (or copy it from Gensim)
  2. Use sklearn.feature_extraction.text.TfidfTransformer
  3. Figure out Sparse2Corpus (minor)
  4. Use LDA and other decomposition methods from sklearn.
  5. Remove Coherence model. Apparently, there is no suitable alternative to run it with sklearn. I don't think it warrants keeping gensim just for this.
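
To make points 1, 2 and 4 a bit more concrete, here is a rough sketch of how the pieces could fit together without gensim. `SimpleDictionary` and `bow_matrix` are hypothetical names made up for this example, not existing code in the add-on, and the toy documents are invented. Only `token2id` and `doc2bow` mirror names from gensim's Dictionary; the rest of its API is not covered. Note that the tf-idf weights will most likely not match gensim's numerically, since the two libraries use different default formulas.

```python
from collections import Counter

import numpy as np
from scipy import sparse
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import TfidfTransformer


class SimpleDictionary:
    """Hypothetical minimal stand-in for gensim.corpora.Dictionary."""

    def __init__(self, documents=()):
        self.token2id = {}
        self.add_documents(documents)

    def add_documents(self, documents):
        # Assign consecutive integer ids to tokens in order of first appearance.
        for tokens in documents:
            for token in tokens:
                self.token2id.setdefault(token, len(self.token2id))

    def doc2bow(self, tokens):
        # Return sorted (token_id, count) pairs; unknown tokens are ignored.
        counts = Counter(self.token2id[t] for t in tokens if t in self.token2id)
        return sorted(counts.items())

    def __len__(self):
        return len(self.token2id)


def bow_matrix(dictionary, documents):
    """Build a documents x tokens sparse count matrix (hypothetical helper)."""
    rows, cols, vals = [], [], []
    for i, tokens in enumerate(documents):
        for j, count in dictionary.doc2bow(tokens):
            rows.append(i)
            cols.append(j)
            vals.append(count)
    shape = (len(documents), len(dictionary))
    return sparse.csr_matrix((vals, (rows, cols)), shape=shape)


# Toy, already tokenized corpus (invented for illustration).
docs = [
    ["human", "machine", "interface"],
    ["graph", "trees", "graph"],
    ["machine", "graph", "minors"],
]

dictionary = SimpleDictionary(docs)
counts = bow_matrix(dictionary, docs)

# Point 2: tf-idf weighting with sklearn instead of gensim.models.TfidfModel.
tfidf = TfidfTransformer().fit_transform(counts)

# Point 4: LDA from sklearn, fitted directly on the sparse count matrix.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # one row of topic weights per document
```

Since sklearn works directly on scipy sparse matrices, this route might also make most of the Sparse2Corpus conversion (point 3) unnecessary, but that would need checking against the widgets that currently rely on it.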

Describe alternatives you've considered
spaCy could also be used for topic modeling (TM) and similar tasks.
