@@ -14,9 +14,12 @@
 
 The core estimation code is based on the `onlineldavb.py script
 <https://github.com/blei-lab/onlineldavb/blob/master/onlineldavb.py>`_, by
-`Matthew D. Hoffman, David M. Blei, Francis Bach:
-Online Learning for Latent Dirichlet Allocation, NIPS 2010
-<https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf>`_.
+Matthew D. Hoffman, David M. Blei, Francis Bach:
+`'Online Learning for Latent Dirichlet Allocation', NIPS 2010`_.
+
+.. _'Online Learning for Latent Dirichlet Allocation', NIPS 2010: online-lda_
+.. _'Online Learning for LDA' by Hoffman et al.: online-lda_
+.. _online-lda: https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf
 
 The algorithm:
 
@@ -199,8 +202,7 @@ def blend(self, rhot, other, targetsize=None):
 
         The number of documents is stretched in both state objects, so that they are of comparable magnitude.
         This procedure corresponds to the stochastic gradient update from
-        `Hoffman et al. :"Online Learning for Latent Dirichlet Allocation"
-        <https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf>`_, see equations (5) and (9).
+        `'Online Learning for LDA' by Hoffman et al.`_, see equations (5) and (9).
 
         Parameters
         ----------
@@ -312,9 +314,7 @@ def load(cls, fname, *args, **kwargs):
 
 
 class LdaModel(interfaces.TransformationABC, basemodel.BaseTopicModel):
-    """Train and use Online Latent Dirichlet Allocation models as presented in
-    `Hoffman et al. :"Online Learning for Latent Dirichlet Allocation"
-    <https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf>`_.
+    """Train and use Online Latent Dirichlet Allocation models as presented in `'Online Learning for LDA' by Hoffman et al.`_
 
     Examples
     --------
@@ -396,13 +396,11 @@ def __init__(self, corpus=None, num_topics=100, id2word=None,
             * 'auto': Learns an asymmetric prior from the corpus.
         decay : float, optional
             A number between (0.5, 1] to weight what percentage of the previous lambda value is forgotten
-            when each new document is examined. Corresponds to Kappa from
-            `Hoffman et al. :"Online Learning for Latent Dirichlet Allocation"
-            <https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf>`_.
+            when each new document is examined.
+            Corresponds to :math:`\\kappa` from `'Online Learning for LDA' by Hoffman et al.`_
         offset : float, optional
             Hyper-parameter that controls how much we will slow down the first steps in the first few iterations.
-            Corresponds to Tau_0 from `Hoffman et al. :"Online Learning for Latent Dirichlet Allocation"
-            <https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf>`_.
+            Corresponds to :math:`\\tau_0` from `'Online Learning for LDA' by Hoffman et al.`_
         eval_every : int, optional
             Log perplexity is estimated every that many updates. Setting this to one slows down training by ~2x.
         iterations : int, optional
@@ -862,13 +860,15 @@ def update(self, corpus, chunksize=None, decay=None, offset=None,
 
         Notes
         -----
-        This update also supports updating an already trained model with new documents; the two models are then merged
-        in proportion to the number of old vs. new documents. This feature is still experimental for non-stationary
-        input streams. For stationary input (no topic drift in new documents), on the other hand, this equals the
-        online update of `Hoffman et al. :"Online Learning for Latent Dirichlet Allocation"
-        <https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf>`_
-        and is guaranteed to converge for any `decay` in (0.5, 1.0). Additionally, for smaller corpus sizes, an
-        increasing `offset` may be beneficial (see Table 1 in the same paper).
+        This update also supports updating an already trained model (`self`) with new documents from `corpus`;
+        the two models are then merged in proportion to the number of old vs. new documents.
+        This feature is still experimental for non-stationary input streams.
+
+        For stationary input (no topic drift in new documents), on the other hand,
+        this equals the online update of `'Online Learning for LDA' by Hoffman et al.`_
+        and is guaranteed to converge for any `decay` in (0.5, 1].
+        Additionally, for smaller corpus sizes,
+        an increasing `offset` may be beneficial (see Table 1 in the same paper).
 
         Parameters
         ----------
@@ -879,13 +879,11 @@ def update(self, corpus, chunksize=None, decay=None, offset=None,
             Number of documents to be used in each training chunk.
         decay : float, optional
             A number between (0.5, 1] to weight what percentage of the previous lambda value is forgotten
-            when each new document is examined. Corresponds to Kappa from
-            `Hoffman et al. :"Online Learning for Latent Dirichlet Allocation"
-            <https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf>`_.
+            when each new document is examined. Corresponds to :math:`\\kappa` from
+            `'Online Learning for LDA' by Hoffman et al.`_
         offset : float, optional
             Hyper-parameter that controls how much we will slow down the first steps in the first few iterations.
-            Corresponds to Tau_0 from `Hoffman et al. :"Online Learning for Latent Dirichlet Allocation"
-            <https://papers.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf>`_.
+            Corresponds to :math:`\\tau_0` from `'Online Learning for LDA' by Hoffman et al.`_
         passes : int, optional
             Number of passes through the corpus during training.
         update_every : int, optional
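
The `decay` and `offset` semantics that these docstrings describe boil down to a single step-size formula. Below is a minimal Python sketch, not gensim's actual implementation: the names `rho` and `blend_stats` and the list-based state are invented here for illustration; only the rule rho_t = (offset + t) ** (-decay) and the (1 - rho) * old + rho * new blend come from the Hoffman et al. paper (its equations (5) and (9)).

```python
def rho(t, decay=0.5, offset=1.0):
    """Step size for the t-th mini-batch: rho_t = (offset + t) ** (-decay).

    `decay` is kappa and `offset` is tau_0 from Hoffman et al.  For decay in
    (0.5, 1] the weights satisfy the Robbins-Monro conditions, which is why
    the online update is guaranteed to converge.  A larger offset shrinks
    the early rho_t values, slowing down the first steps.
    """
    return (offset + t) ** (-decay)


def blend_stats(old, new, t, decay=0.5, offset=1.0):
    """Blend old sufficient statistics with the estimate from chunk t."""
    r = rho(t, decay, offset)
    return [(1.0 - r) * o + r * n for o, n in zip(old, new)]
```

With `t = 0` and `offset = 1`, rho is 1, so the very first chunk replaces the uninformative initial statistics outright; raising `offset` damps that first step, which is the "slow down the first few iterations" behavior the `offset` docstring refers to.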