MaartenGr
diff --git a/.gitattributes b/.gitattributes
Lines changed: 1 addition & 1 deletion
diff --git a/LICENSE b/LICENSE
Lines changed: 1 addition & 1 deletion
diff --git a/Makefile b/Makefile
Lines changed: 1 addition & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 37 additions & 55 deletions
diff --git a/bertopic/__init__.py b/bertopic/__init__.py
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-*.ipynb linguist-documentation
+*.ipynb linguist-documentation 
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2020, Maarten P. Grootendorst
+Copyright (c) 2022, Maarten P. Grootendorst
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 
@@ -6,7 +6,7 @@ install:
 
 install-test:
 	python -m pip install -e ".[test]"
-	python -m pip install -e ".[all]"
+	python -m pip install -e "."
 
 pypi:
 	python setup.py sdist
 
@@ -17,7 +17,8 @@ BERTopic supports
 [**guided**](https://maartengr.github.io/BERTopic/getting_started/guided/guided.html), 
 (semi-) [**supervised**](https://maartengr.github.io/BERTopic/getting_started/supervised/supervised.html), 
 [**hierarchical**](https://maartengr.github.io/BERTopic/getting_started/hierarchicaltopics/hierarchicaltopics.html), 
-and [**dynamic**](https://maartengr.github.io/BERTopic/getting_started/topicsovertime/topicsovertime.html) topic modeling. It even supports visualizations similar to LDAvis!
+[**dynamic**](https://maartengr.github.io/BERTopic/getting_started/topicsovertime/topicsovertime.html), and 
+[**online**](https://maartengr.github.io/BERTopic/getting_started/online/online.html) topic modeling. It even supports visualizations similar to LDAvis!
 
 Corresponding medium posts can be found [here](https://towardsdatascience.com/topic-modeling-with-bert-779f7db187e6?source=friends_link&sk=0b5a470c006d1842ad4c8a3057063a99) 
 and [here](https://towardsdatascience.com/interactive-topic-modeling-with-bertopic-1ea55e7d73d8?sk=03c2168e9e74b6bda2a1f3ed953427e4). For a more detailed overview, you can read the [paper](https://arxiv.org/abs/2203.05794). 
@@ -42,7 +43,7 @@ pip install bertopic[use]
 
 ## Getting Started
 For an in-depth overview of the features of BERTopic 
-you can check the full documentation [here](https://maartengr.github.io/BERTopic/) or you can follow along 
+you can check the [**full documentation**](https://maartengr.github.io/BERTopic/) or you can follow along 
 with one of the examples below:
 
 | Name  | Link  |
@@ -130,6 +131,7 @@ Find all possible visualizations with interactive examples in the documentation
 ## Embedding Models
 BERTopic supports many embedding models that can be used to embed the documents and words:
 * Sentence-Transformers
+* 🤗 Transformers
 * Flair
 * Spacy
 * Gensim
@@ -143,65 +145,24 @@ meant for semantic similarity. Simply select any from their documentation
 topic_model = BERTopic(embedding_model="all-MiniLM-L6-v2")
 ```
 
-[**Flair**](https://github.com/flairNLP/flair) allows you to choose almost any 🤗 transformers model. Simply 
-select any from [here](https://huggingface.co/models) and pass it to BERTopic:
+Similarly, you can choose any [**🤗 Transformers**](https://huggingface.co/models) model and pass it to BERTopic:
 
 ```python
-from flair.embeddings import TransformerDocumentEmbeddings
+from transformers.pipelines import pipeline
 
-roberta = TransformerDocumentEmbeddings('roberta-base')
-topic_model = BERTopic(embedding_model=roberta)
+embedding_model = pipeline("feature-extraction", model="distilbert-base-cased")
+topic_model = BERTopic(embedding_model=embedding_model)
 ```
 
 Click [here](https://maartengr.github.io/BERTopic/getting_started/embeddings/embeddings.html) 
 for a full overview of all supported embedding models. 
 
-## Dynamic Topic Modeling
-Dynamic topic modeling (DTM) is a collection of techniques aimed at analyzing the evolution of topics 
-over time. These methods allow you to understand how a topic is represented over time. 
-Here, we will be using all of Donald Trump's tweet to see how he talked over certain topics over time: 
-
-```python
-import re
-import pandas as pd
-
-trump = pd.read_csv('https://drive.google.com/uc?export=download&id=1xRKHaP-QwACMydlDnyFPEaFdtskJuBa6')
-trump.text = trump.apply(lambda row: re.sub(r"http\S+", "", row.text).lower(), 1)
-trump.text = trump.apply(lambda row: " ".join(filter(lambda x:x[0]!="@", row.text.split())), 1)
-trump.text = trump.apply(lambda row: " ".join(re.sub("[^a-zA-Z]+", " ", row.text).split()), 1)
-trump = trump.loc[(trump.isRetweet == "f") & (trump.text != ""), :]
-timestamps = trump.date.to_list()
-tweets = trump.text.to_list()
-```
-
-Then, we need to extract the global topic representations by simply creating and training a BERTopic model:
-
-```python
-topic_model = BERTopic(verbose=True)
-topics, probs = topic_model.fit_transform(tweets)
-```
-
-From these topics, we are going to generate the topic representations at each timestamp for each topic. We do this 
-by simply calling `topics_over_time` and pass in his tweets, the corresponding timestamps, and the related topics:
-
-```python
-topics_over_time = topic_model.topics_over_time(tweets, topics, timestamps, nr_bins=20)
-```
-
-Finally, we can visualize the topics by simply calling `visualize_topics_over_time()`: 
-
-```python
-topic_model.visualize_topics_over_time(topics_over_time, top_n_topics=6)
-```
-
-<img src="images/dtm.gif" width="80%" height="80%" align="center" />
-
 ## Overview
 BERTopic has quite a number of functions that quickly can become overwhelming. To alleviate this issue, you will find an overview 
 of all methods and a short description of its purpose. 
 
 ### Common
-For quick access to common functions, here is an overview of BERTopic's main methods:
+Below, you will find an overview of common functions in BERTopic. 
 
 | Method | Code  | 
 |-----------------------|---|
@@ -213,26 +174,46 @@ For quick access to common functions, here is an overview of BERTopic's main met
 | Get topic freq    |  `.get_topic_freq()` |
 | Get all topic information|  `.get_topic_info()` |
 | Get representative docs per topic |  `.get_representative_docs()` |
-| Update topic representation | `.update_topics(docs, topics, n_gram_range=(1, 3))` |
+| Update topic representation | `.update_topics(docs, n_gram_range=(1, 3))` |
 | Generate topic labels | `.generate_topic_labels()` |
 | Set topic labels | `.set_topic_labels(my_custom_labels)` |
-| Merge topics | `.merge_topics(docs, topics, topics_to_merge)` |
-| Reduce nr of topics | `.reduce_topics(docs, topics, nr_topics=30)` |
+| Merge topics | `.merge_topics(docs, topics_to_merge)` |
+| Reduce nr of topics | `.reduce_topics(docs, nr_topics=30)` |
 | Find topics | `.find_topics("vehicle")` |
 | Save model    |  `.save("my_model")` |
 | Load model    |  `BERTopic.load("my_model")` |
 | Get parameters |  `.get_params()` |
 
+
+### Attributes
+After having trained your BERTopic model, a number of attributes are saved within your model. These attributes, in part, 
+refer to how model information is stored on an estimator during fitting. The attributes that you see below all end in `_` and are 
+public attributes that can be used to access model information. 
+
+| Attribute | Description |
+|------------------------|---------------------------------------------------------------------------------------------|
+| topics_               | The topics that are generated for each document after training or updating the topic model. |
+| probabilities_ | The probabilities that are generated for each document if HDBSCAN is used. |
+| topic_sizes_           | The size of each topic                                                                      |
+| topic_mapper_          | A class for tracking topics and their mappings anytime they are merged/reduced.             |
+| topic_representations_ | The top *n* terms per topic and their respective c-TF-IDF values.                             |
+| c_tf_idf_              | The topic-term matrix as calculated through c-TF-IDF.                                       |
+| topic_labels_          | The default labels for each topic.                                                          |
+| custom_labels_         | Custom labels for each topic as generated through `.set_topic_labels`.                                                               |
+| topic_embeddings_      | The embeddings for each topic if `embedding_model` was used.                                                              |
+| representative_docs_   | The representative documents for each topic if HDBSCAN is used.                                                |
+
+
 ### Variations
 There are many different use cases in which topic modeling can be used. As such, a number of 
-variations of BERTopic have been developed such that one package can be used across across many use cases:
+variations of BERTopic have been developed such that one package can be used across across many use cases.
 
 | Method | Code  | 
 |-----------------------|---|
 | (semi-) Supervised Topic Modeling | `.fit(docs, y=y)` |
-| Topic Modeling per Class | `.topics_per_class(docs, topics, classes)` |
-| Dynamic Topic Modeling | `.topics_over_time(docs, topics, timestamps)` |
-| Hierarchical Topic Modeling | `.hierarchical_topics(docs, topics)` |
+| Topic Modeling per Class | `.topics_per_class(docs, classes)` |
+| Dynamic Topic Modeling | `.topics_over_time(docs, timestamps)` |
+| Hierarchical Topic Modeling | `.hierarchical_topics(docs)` |
 | Guided Topic Modeling | `BERTopic(seed_topic_list=seed_topic_list)` |
 
 ### Visualizations
@@ -254,6 +235,7 @@ to tweak the model to your liking.
 | Visualize Topics over Time   |  `.visualize_topics_over_time(topics_over_time)` |
 | Visualize Topics per Class | `.visualize_topics_per_class(topics_per_class)` | 
 
+
 ## Citation
 To cite the [BERTopic paper](https://arxiv.org/abs/2203.05794), please use the following bibtex reference:
 
 
@@ -1,6 +1,6 @@
 from bertopic._bertopic import BERTopic
 
-__version__ = "0.11.0"
+__version__ = "0.12.0"
 
 __all__ = [
     "BERTopic",
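
Note on the README.md hunks above: the method tables drop the `topics` argument from calls such as `.update_topics`, `.merge_topics`, `.reduce_topics`, and `.topics_per_class`, while the new Attributes table introduces `topics_`. Read together, this suggests the fitted model now keeps the per-document topic assignments itself. The sketch below illustrates that pattern only. `ToyTopicModel` and its parity-based "clustering" are hypothetical stand-ins written for this note. They are not BERTopic code and do not reproduce its behavior.

```python
from collections import Counter


class ToyTopicModel:
    """Hypothetical stand-in (not BERTopic) that stores fitted state on itself."""

    def __init__(self):
        # Trailing-underscore attributes are filled in during fitting.
        self.topics_ = None
        self.topic_sizes_ = None

    def fit_transform(self, docs):
        # Toy assignment: topic id is the parity of the document's word count.
        self.topics_ = [len(doc.split()) % 2 for doc in docs]
        self.topic_sizes_ = Counter(self.topics_)
        return self.topics_

    def topics_per_class(self, docs, classes):
        # No `topics` parameter: assignments are read from self.topics_.
        if self.topics_ is None:
            raise ValueError("call fit_transform before topics_per_class")
        if not (len(docs) == len(classes) == len(self.topics_)):
            raise ValueError("docs, classes, and fitted topics must align")
        # Count how many documents fall into each (class, topic) pair.
        return dict(Counter(zip(classes, self.topics_)))


docs = ["a b", "a b c", "d e", "f"]
model = ToyTopicModel()
topics = model.fit_transform(docs)
per_class = model.topics_per_class(docs, classes=["x", "x", "y", "y"])
```

With this design, callers that previously threaded a `topics` list through every follow-up call only pass the documents. The trade-off is that the documents must be the same ones, in the same order, that the model was fitted on.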