Commit 9267fdd

fix document split bug
Signed-off-by: Tim Schopf <[email protected]>
1 parent 5e28116 commit 9267fdd

1 file changed, +1 -1 lines changed

keyphrase_vectorizers/keyphrase_vectorizer_mixin.py

Lines changed: 1 addition & 1 deletion
@@ -442,7 +442,7 @@ def _get_pos_keyphrases(self, document_list: List[str], stop_words: Union[str, L
         stop_words_list.add(doc_delimiter)
 
         # split processed documents by delimiter
-        processed_docs = list(filter(None, [doc.strip() for doc in processed_docs.split(doc_delimiter)]))
+        processed_docs = [doc.strip() for doc in processed_docs.split(doc_delimiter)][1:]
 
         if extract_keyphrases:
             # extract keyphrases that match the NLTK RegexpParser filter
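
The behavioral difference between the removed and added lines can be shown with a small standalone sketch. It is not taken from the repository: the delimiter value, the sample documents, and the assumption that the delimiter is prepended to every document before joining are all hypothetical and chosen only for illustration.

```python
# Hypothetical delimiter and inputs, used only for illustration.
doc_delimiter = "DOCDELIM"
documents = ["first document", "", "third document"]

# Assumption: the delimiter is placed in front of every document before the
# documents are joined into one string for processing.
joined = "".join(f" {doc_delimiter} {doc}" for doc in documents)

# Removed line: filter(None, ...) drops every empty string, so the empty
# second document disappears and the result no longer lines up with the input.
old_split = list(filter(None, [doc.strip() for doc in joined.split(doc_delimiter)]))

# Added line: only the piece before the first delimiter is dropped, so empty
# documents keep their position in the result.
new_split = [doc.strip() for doc in joined.split(doc_delimiter)][1:]

print(old_split)
print(new_split)
```

Under these assumptions the old expression returns two entries for three input documents, while the new one returns three entries with the empty document preserved in place, which keeps the split result aligned with the original document list.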
