@@ -111,6 +111,20 @@ topic_model.visualize_topics()
111111
112112<img src="images/topic_visualization.gif" width="60%" height="60%" align="center" />
113113
114+ We can create an overview of the most frequent topics that is easy to interpret.
115+ Horizontal bar charts convey this information well and give an intuitive representation
116+ of the topics:
117+
118+ ``` python
119+ topic_model.visualize_barchart()
120+ ```
121+
122+ <img src="images/topics.png" width="70%" height="70%" align="center" />
123+
124+
125+ Find all possible visualizations with interactive examples in the documentation
126+ [here](https://maartengr.github.io/BERTopic/tutorial/visualization/visualization.html).
127+
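To make concrete what a horizontal bar chart of topic terms conveys, here is a minimal plain-text sketch. It is not BERTopic's plotting code: `text_barchart` is a hypothetical helper, and the scores in `example_topic` are made up for illustration. The only assumption taken from the library is that a topic is represented as a list of `(word, score)` pairs, as returned by `.get_topic()`.

```python
def text_barchart(terms, width=20):
    """Render (word, score) pairs as horizontal text bars, highest score first.

    Hypothetical helper for illustration; BERTopic's visualize_barchart()
    produces an interactive figure instead.
    """
    if not terms:
        return []
    ranked = sorted(terms, key=lambda pair: pair[1], reverse=True)
    top_score = ranked[0][1]
    label_width = max(len(word) for word, _ in ranked)
    lines = []
    for word, score in ranked:
        # Scale each bar relative to the highest score; avoid dividing by zero.
        filled = round(width * score / top_score) if top_score > 0 else 0
        lines.append(f"{word:<{label_width}} {'#' * filled} {score:.3f}")
    return lines


# Made-up scores, shaped like the (word, score) pairs of a single topic.
example_topic = [("drive", 0.04), ("windows", 0.06), ("dos", 0.03)]
for line in text_barchart(example_topic):
    print(line)
```

Longer bars mark the terms that weigh most in the topic, which is the same reading the interactive chart supports.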
114128## Embedding Models
115129BERTopic supports many embedding models that can be used to embed the documents and words:
116130* Sentence-Transformers
@@ -119,12 +133,12 @@ BERTopic supports many embedding models that can be used to embed the documents
119133* Gensim
120134* USE
121135
122- [**Sentence-Transformers**]() is typically used as it has shown great results embedding documents
136+ [**Sentence-Transformers**](https://github.com/UKPLab/sentence-transformers) is typically used as it has shown great results embedding documents
123137meant for semantic similarity. Simply select any model from their documentation
124138[here](https://www.sbert.net/docs/pretrained_models.html) and pass it to BERTopic:
125139
126140``` python
127- topic_model = BERTopic(embedding_model="paraphrase-MiniLM-L6-v2")
141+ topic_model = BERTopic(embedding_model="all-MiniLM-L6-v2")
128142```
129143
130144[**Flair**](https://github.com/flairNLP/flair) allows you to choose almost any 🤗 transformers model. Simply
@@ -185,34 +199,35 @@ For quick access to common functions, here is an overview of BERTopic's main met
185199
186200| Method | Code |
187201| -----------------------| ---|
188- | Fit the model | ` BERTopic().fit(docs) ` |
189- | Fit the model and predict documents | ` BERTopic().fit_transform(docs) ` |
190- | Predict new documents | ` BERTopic().transform([new_doc]) ` |
191- | Access single topic | ` BERTopic().get_topic(topic=12) ` |
192- | Access all topics | ` BERTopic().get_topics() ` |
193- | Get topic freq | ` BERTopic().get_topic_freq() ` |
194- | Get all topic information| ` BERTopic().get_topic_info() ` |
195- | Get topics per class | ` BERTopic().topics_per_class(docs, topics, classes) ` |
196- | Dynamic Topic Modeling | ` BERTopic().topics_over_time(docs, topics, timestamps) ` |
197- | Update topic representation | ` BERTopic().update_topics(docs, topics, n_gram_range=(1, 3)) ` |
198- | Reduce nr of topics | ` BERTopic().reduce_topics(docs, topics, nr_topics=30) ` |
199- | Find topics | ` BERTopic().find_topics("vehicle") ` |
200- | Save model | ` BERTopic().save("my_model") ` |
202+ | Fit the model | ` .fit(docs) ` |
203+ | Fit the model and predict documents | ` .fit_transform(docs) ` |
204+ | Predict new documents | ` .transform([new_doc]) ` |
205+ | Access single topic | ` .get_topic(topic=12) ` |
206+ | Access all topics | ` .get_topics() ` |
207+ | Get topic freq | ` .get_topic_freq() ` |
208+ | Get all topic information| ` .get_topic_info() ` |
209+ | Get representative docs per topic | ` .get_representative_docs() ` |
210+ | Get topics per class | ` .topics_per_class(docs, topics, classes) ` |
211+ | Dynamic Topic Modeling | ` .topics_over_time(docs, topics, timestamps) ` |
212+ | Update topic representation | ` .update_topics(docs, topics, n_gram_range=(1, 3)) ` |
213+ | Reduce nr of topics | ` .reduce_topics(docs, topics, nr_topics=30) ` |
214+ | Find topics | ` .find_topics("vehicle") ` |
215+ | Save model | ` .save("my_model") ` |
201216| Load model | ` BERTopic.load("my_model") ` |
202- | Get parameters | ` BERTopic() .get_params()` |
217+ | Get parameters | ` .get_params() ` |
203218
204219For an overview of BERTopic's visualization methods:
205220
206221| Method | Code |
207222| -----------------------| ---|
208- | Visualize Topics | ` BERTopic() .visualize_topics()` |
209- | Visualize Topic Hierarchy | ` BERTopic() .visualize_hierarchy()` |
210- | Visualize Topic Terms | ` BERTopic() .visualize_barchart()` |
211- | Visualize Topic Similarity | ` BERTopic() .visualize_heatmap()` |
212- | Visualize Term Score Decline | ` BERTopic() .visualize_term_rank()` |
213- | Visualize Topic Probability Distribution | ` BERTopic() .visualize_distribution(probs[0])` |
214- | Visualize Topics over Time | ` BERTopic() .visualize_topics_over_time(topics_over_time)` |
215- | Visualize Topics per Class | ` BERTopic() .visualize_topics_per_class(topics_per_class)` |
223+ | Visualize Topics | ` .visualize_topics() ` |
224+ | Visualize Topic Hierarchy | ` .visualize_hierarchy() ` |
225+ | Visualize Topic Terms | ` .visualize_barchart() ` |
226+ | Visualize Topic Similarity | ` .visualize_heatmap() ` |
227+ | Visualize Term Score Decline | ` .visualize_term_rank() ` |
228+ | Visualize Topic Probability Distribution | ` .visualize_distribution(probs[0]) ` |
229+ | Visualize Topics over Time | ` .visualize_topics_over_time(topics_over_time) ` |
230+ | Visualize Topics per Class | ` .visualize_topics_per_class(topics_per_class) ` |
216231
217232## Citation
218233To cite BERTopic in your work, please use the following bibtex reference:
@@ -223,7 +238,7 @@ To cite BERTopic in your work, please use the following bibtex reference:
223238 title = {BERTopic: Leveraging BERT and c-TF-IDF to create easily interpretable topics.},
224239 year = 2020,
225240 publisher = {Zenodo},
226- version = {v0.7.0},
241+ version = {v0.9.2},
227242 doi = {10.5281/zenodo.4381785},
228243 url = {https://doi.org/10.5281/zenodo.4381785}
229244}