I tried out using seeded topic modelling, using pre-embedded documents:
model = KeyNMF(10, top_n=15, encoder=trf, vectorizer=CountVectorizer(stop_words=stop_words, min_df=50, max_df=0.85), seed_phrase='bog teater literatur')
model.fit(cleaned_corpus, embeddings=embeddings_matrix)
However, this gave me the same results as when I left out the seed_phrase:
model = KeyNMF(10, top_n=15, encoder=trf, vectorizer=CountVectorizer(stop_words=stop_words, min_df=50, max_df=0.85))
model.fit(cleaned_corpus, embeddings=embeddings_matrix)
Is it possible that the seed_phrase gets overwritten as soon as embeddings is defined?
EDIT: When I leave out embeddings=embeddings_matrix, I do get a different result. (But I want to pass in my pre-embedded documents rather than re-encode them.)
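To make "the same results" checkable rather than a by-eye comparison of top words, here is a minimal sketch of how the two fits could be compared numerically. The helper name same_topics is hypothetical, and it only assumes that each fit yields a topic-term weight matrix that can be passed in as an array.

```python
import numpy as np


def same_topics(components_a, components_b, atol=1e-8):
    """Return True if two topic-term weight matrices are numerically identical.

    Hypothetical helper: it makes no assumption about the model itself,
    only that each fit yields a (n_topics, n_terms) array-like.
    """
    a = np.asarray(components_a, dtype=float)
    b = np.asarray(components_b, dtype=float)
    # Different shapes mean a different vocabulary or topic count,
    # so the fits cannot be identical.
    if a.shape != b.shape:
        return False
    return bool(np.allclose(a, b, atol=atol))
```

Assuming the fitted models expose a scikit-learn-style components_ attribute (an assumption, not something confirmed here), the check would be same_topics(seeded_model.components_, unseeded_model.components_). A True result would mean the seeded and unseeded fits produced identical weights, not just similar keyword lists.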