keyphrase_vectorizers/keyphrase_count_vectorizer.py (10 additions & 6 deletions)
@@ -39,7 +39,7 @@ class KeyphraseCountVectorizer(_KeyphraseVectorizerMixin, BaseEstimator):
 must be customized accordingly.
 Additionally, the ``pos_pattern`` parameter has to be customized as the `spaCy part-of-speech tags`_ differ between languages.
 Without customizing, the words will be tagged with wrong part-of-speech tags and no stopwords will be considered.
-In addition, you have to exclude/include different pipeline components using the ``spacy_exclude`` parameter for the spaCy POS tagger to work properly.
+In addition, you may have to exclude/include different pipeline components using the ``spacy_exclude`` parameter for the spaCy POS tagger to work properly.
keyphrase_vectorizers/keyphrase_tfidf_vectorizer.py (1 addition & 1 deletion)
@@ -37,7 +37,7 @@ class KeyphraseTfidfVectorizer(KeyphraseCountVectorizer):
 must be customized accordingly.
 Additionally, the ``pos_pattern`` parameter has to be customized as the `spaCy part-of-speech tags`_ differ between languages.
 Without customizing, the words will be tagged with wrong part-of-speech tags and no stopwords will be considered.
-In addition, you have to exclude/include different pipeline components using the ``spacy_exclude`` parameter for the spaCy POS tagger to work properly.
+In addition, you may have to exclude/include different pipeline components using the ``spacy_exclude`` parameter for the spaCy POS tagger to work properly.

 Tf means term-frequency while tf-idf means term-frequency times inverse document-frequency.
 This is a common term weighting scheme in information retrieval,
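
For readers unfamiliar with the weighting mentioned in the docstring context above, here is a minimal sketch of "term-frequency times inverse document-frequency" in plain Python. It assumes raw counts for tf and the textbook form idf = ln(N / df). The function name `tf_idf` and the sample documents are hypothetical, and the smoothing and normalization actually applied by `KeyphraseTfidfVectorizer` may well differ from this.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Assumes raw counts for tf and the
    unsmoothed textbook idf = ln(N / df); illustration only, not the
    library's formula.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # term frequency within this document
        weights.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return weights


docs = [
    ["machine", "learning", "model"],
    ["machine", "translation"],
    ["learning", "rate", "learning"],
]
scores = tf_idf(docs)
```

Under these assumptions, a term that occurs in every document gets weight zero, while a term confined to one document is weighted most heavily relative to its count.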