|Models for all Scenarios |:chart_with_upwards_trend:[Dynamic](https://x-tabdeveloping.github.io/turftopic/dynamic/), :ocean:[Online](https://x-tabdeveloping.github.io/turftopic/online/), :herb:[Seeded](https://x-tabdeveloping.github.io/turftopic/seeded/), and :evergreen_tree:[Hierarchical](https://x-tabdeveloping.github.io/turftopic/hierarchical/) topic modeling|
docs/dynamic.md: 43 additions & 45 deletions
@@ -2,11 +2,13 @@
 
 If you want to examine the evolution of topics over time, you will need a dynamic topic model.
 
-> Note that regular static models can also be used to study the evolution of topics and information dynamics, but they can't capture changes in the topics themselves.
+> You will need to install Plotly for plotting to work.
 
-## Models
+```bash
+pip install plotly
+```
 
-In Turftopic you can currently use three different topic models for modeling topics over time:
+You can currently use three different topic models for modeling topics over time:
 
 1. [ClusteringTopicModel](clustering.md), where an overall model is fitted on the whole corpus, and then term importances are estimated over time slices.
 2. [GMM](GMM.md), similarly to clustering models, term importances are reestimated per time slice
@@ -33,50 +35,46 @@ model = KeyNMF(5, top_n=5, random_state=42)
docs/hierarchical.md: 61 additions & 45 deletions
@@ -1,16 +1,27 @@
 # Hierarchical Topic Modeling
 
-> Note: Hierarchical topic modeling in Turftopic is still in its early stages, you can expect more visualization utilities, tools and models in the future :sparkles:
-
 You might expect some topics in your corpus to belong to a hierarchy of topics.
-Some models in Turftopic (currently only [KeyNMF](KeyNMF.md)) allow you to investigate hierarchical relations and build a taxonomy of topics in a corpus.
+Some models in Turftopic allow you to investigate hierarchical relations and build a taxonomy of topics in a corpus.
+
+Models in Turftopic that can model hierarchical relations will have a `hierarchy` property, that you can manipulate and print/visualize:
+
+```python
+from turftopic import ClusteringTopicModel
+
+model = ClusteringTopicModel(n_reduce_to=10).fit(corpus)
+# We cut at level 3 for plotting, since the hierarchy is very deep
+model.hierarchy.cut(3).plot_tree()
+```
+
+_Drag and click to zoom, hover to see word importance_
+
+<iframe src="../images/tree_plot.html", title="Topic hierarchy in a clustering model", style="height:800px;width:100%;padding:0px;border:none;"></iframe>
+
 
-## Divisive Hierarchical Modeling
+## 1. Divisive/Top-down Hierarchical Modeling
 
-Currently Turftopic, in contrast with other topic modeling libraries only allows for hierarchical modeling in a divisive context.
-This means that topics can be divided into subtopics in a **top-down** manner.
-[KeyNMF](KeyNMF.md) does not discover a topic hierarchy automatically,
-but you can manually instruct the model to find subtopics in larger topics.
+In divisive modeling, you start from larger structures, higher up in the hierarchy, and divide topics into smaller sub-topics on-demand.
+This is how hierarchical modeling works in [KeyNMF](keynmf.md), which, by default does not discover a topic hierarchy, but you can divide topics to as many subtopics as you see fit.
 
 As a demonstration, let's load a corpus, that we know to have hierarchical themes.
In other models, hierarchies arise from starting from smaller, more specific topics, and then merging them together based on their similarity until a desired number of top-level topics are obtained.
+
+This is how it is done in [clustering topic models](clustering.md) like BERTopic and Top2Vec.
+Clustering models typically find a lot of topics, and it can help with interpretation to merge topics until you gain 10-20 top-level topics.
+
+You can either do this by default on a clustering model by setting `n_reduce_to` on initialization or you can do it manually with `reduce_topics()`.
+For more details, check our guide on [Clustering models](clustering.md).
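
The bottom-up merging described in the added lines above can be illustrated with a small, library-independent sketch. This is not Turftopic's implementation: the function name `merge_topics`, the toy vectors, and the merge rule (cosine similarity between term-importance vectors, size-weighted averaging of the merged pair) are all assumptions chosen for illustration. It only shows the general idea of merging the most similar topics until a target count, analogous to `n_reduce_to`, is reached.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_topics(vectors, sizes, n_reduce_to):
    """Greedily merge the two most similar topics until n_reduce_to remain.

    vectors: one term-importance vector per topic (hypothetical toy input)
    sizes: number of documents assigned to each topic
    Returns the groups of original topic indices that ended up together.
    """
    if n_reduce_to < 1:
        raise ValueError("n_reduce_to must be at least 1")
    groups = [[i] for i in range(len(vectors))]
    vectors = [list(v) for v in vectors]  # copy so the input is not mutated
    sizes = list(sizes)
    while len(groups) > n_reduce_to:
        # Find the most similar pair among the current topics (i < j)
        i, j = max(
            combinations(range(len(groups)), 2),
            key=lambda pair: cosine(vectors[pair[0]], vectors[pair[1]]),
        )
        total = sizes[i] + sizes[j]
        # Assumed merge rule: size-weighted average of the two vectors
        vectors[i] = [
            (a * sizes[i] + b * sizes[j]) / total
            for a, b in zip(vectors[i], vectors[j])
        ]
        sizes[i] = total
        groups[i] = groups[i] + groups[j]
        # j > i always holds, so deleting j leaves index i valid
        del vectors[j], sizes[j], groups[j]
    return groups


toy_vectors = [
    [0.9, 0.1, 0.0],  # topic 0
    [0.8, 0.2, 0.0],  # topic 1, close to topic 0
    [0.0, 0.1, 0.9],  # topic 2
    [0.1, 0.0, 0.9],  # topic 3, close to topic 2
]
merged = merge_topics(toy_vectors, sizes=[10, 5, 8, 7], n_reduce_to=2)
print(merged)  # [[0, 1], [2, 3]]
```

Each returned group lists the original topics that were merged into one top-level topic, which is also the information a topic hierarchy would record at each merge step.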