Commit 8ccbab7

Author: Maarten Grootendorst
v0.11.0 (#578)
* Perform hierarchical topic modeling with `.hierarchical_topics`
* Visualize hierarchical topic representations with `.visualize_hierarchy`
* Extract a text-based hierarchical topic representation with `.get_topic_tree`
* Visualize 2D documents with `.visualize_documents()`
* Visualize 2D hierarchical documents with `.visualize_hierarchical_documents()`
* Create custom labels to the topics throughout most visualizations with `.generate_topic_labels` and `.set_topic_labels`
* Manually merge topics with `.merge_topics()`
* Added example for finding similar topics between two models in the tips & tricks page
* Add multi-modal example in the tips & tricks page
* Added native Hugging Face transformers support
1 parent 63fd2a2 commit 8ccbab7

35 files changed: 2878 additions and 103 deletions

README.md

Lines changed: 26 additions & 3 deletions
@@ -196,6 +196,10 @@ topic_model.visualize_topics_over_time(topics_over_time, top_n_topics=6)
 <img src="images/dtm.gif" width="80%" height="80%" align="center" />
 
 ## Overview
+BERTopic has quite a number of functions that quickly can become overwhelming. To alleviate this issue, you will find an overview
+of all methods and a short description of its purpose.
+
+### Common
 For quick access to common functions, here is an overview of BERTopic's main methods:
 
 | Method | Code |
@@ -208,21 +212,40 @@ For quick access to common functions, here is an overview of BERTopic's main met
 | Get topic freq | `.get_topic_freq()` |
 | Get all topic information| `.get_topic_info()` |
 | Get representative docs per topic | `.get_representative_docs()` |
-| Get topics per class | `.topics_per_class(docs, topics, classes)` |
-| Dynamic Topic Modeling | `.topics_over_time(docs, topics, timestamps)` |
 | Update topic representation | `.update_topics(docs, topics, n_gram_range=(1, 3))` |
+| Generate topic labels | `.generate_topic_labels()` |
+| Set topic labels | `.set_topic_labels(my_custom_labels)` |
+| Merge topics | `.merge_topics(docs, topics, topics_to_merge)` |
 | Reduce nr of topics | `.reduce_topics(docs, topics, nr_topics=30)` |
 | Find topics | `.find_topics("vehicle")` |
 | Save model | `.save("my_model")` |
 | Load model | `BERTopic.load("my_model")` |
 | Get parameters | `.get_params()` |
 
-For an overview of BERTopic's visualization methods:
+### Variations
+There are many different use cases in which topic modeling can be used. As such, a number of
+variations of BERTopic have been developed such that one package can be used across across many use cases:
+
+| Method | Code |
+|-----------------------|---|
+| (semi-) Supervised Topic Modeling | `.fit(docs, y=y)` |
+| Topic Modeling per Class | `.topics_per_class(docs, topics, classes)` |
+| Dynamic Topic Modeling | `.topics_over_time(docs, topics, timestamps)` |
+| Hierarchical Topic Modeling | `.hierarchical_topics(docs, topics)` |
+| Guided Topic Modeling | `BERTopic(seed_topic_list=seed_topic_list)` |
+
+### Visualizations
+Evaluating topic models can be rather difficult due to the somewhat subjective nature of evaluation.
+Visualizing different aspects of the topic model helps in understanding the model and makes it easier
+to tweak the model to your liking.
 
 | Method | Code |
 |-----------------------|---|
 | Visualize Topics | `.visualize_topics()` |
+| Visualize Documents | `.visualize_documents()` |
+| Visualize Document Hierarchy | `.visualize_hierarchical_documents()` |
 | Visualize Topic Hierarchy | `.visualize_hierarchy()` |
+| Visualize Topic Tree | `.get_topic_tree(hierarchical_topics)` |
 | Visualize Topic Terms | `.visualize_barchart()` |
 | Visualize Topic Similarity | `.visualize_heatmap()` |
 | Visualize Term Score Decline | `.visualize_term_rank()` |
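
The README tables in this diff list the new methods by signature only. As a rough sketch of how they might chain together, the function below fits a model and then calls `.hierarchical_topics`, `.get_topic_tree`, `.generate_topic_labels`, and `.set_topic_labels` with the argument shapes shown in the tables. The function name `explore_hierarchy` is hypothetical, default hyperparameters are assumed, and `bertopic>=0.11.0` is assumed to be installed; this is an illustration, not code from the commit.

```python
from typing import List


def explore_hierarchy(docs: List[str]):
    """Hypothetical helper: fit BERTopic, then call the methods added in v0.11.0."""
    # Imported lazily so this module can be loaded without BERTopic installed.
    # Assumption: bertopic >= 0.11.0 is available at call time.
    from bertopic import BERTopic

    topic_model = BERTopic()
    topics, probs = topic_model.fit_transform(docs)

    # "Hierarchical Topic Modeling" row: `.hierarchical_topics(docs, topics)`
    hierarchical_topics = topic_model.hierarchical_topics(docs, topics)

    # "Visualize Topic Tree" row: `.get_topic_tree(hierarchical_topics)`
    # yields a text-based representation of the hierarchy.
    tree = topic_model.get_topic_tree(hierarchical_topics)

    # "Generate topic labels" / "Set topic labels" rows: generate labels
    # with the default settings and attach them to the model.
    labels = topic_model.generate_topic_labels()
    topic_model.set_topic_labels(labels)

    return topic_model, hierarchical_topics, tree
```

Calling `explore_hierarchy(docs)` with a list of strings returns the fitted model, the hierarchy table, and the text tree. Very small corpora may fail during dimensionality reduction, so a few hundred documents or more is a safer input.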

bertopic/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 from bertopic._bertopic import BERTopic
 
-__version__ = "0.10.0"
+__version__ = "0.11.0"
 
 __all__ = [
     "BERTopic",
