Commit 01f0fe6

fix spacing in doc string
1 parent 7120ce9 commit 01f0fe6

1 file changed: +2 -0 lines changed


src/bagofwords_transformer.jl

Lines changed: 2 additions & 0 deletions
@@ -1,12 +1,14 @@
 """
 BagOfWordsTransformer()
+
 Convert a collection of raw documents to matrix representing a bag-of-words structure.
 Essentially, a bag-of-words approach to representing documents in a matrix is comprised of
 a count of every word in the document corpus/collection for every document. This is a simple
 but often quite powerful way of representing documents as vectors. The resulting representation is
 a matrix with rows representing every document in the corpus and columns representing every word
 in the corpus. The value for each cell is the raw count of a particular word in a particular
 document.
+
 Similarly to the `TfidfTransformer`, the vocabulary considered can be restricted
 to words occuring in a maximum or minimum portion of documents.
 The parameters `max_doc_freq` and `min_doc_freq` restrict the vocabulary
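
For readers unfamiliar with the representation the docstring describes, the following is a minimal sketch in plain Julia of a document-by-word count matrix. The helper name `bag_of_words` and the whitespace tokenization are assumptions made for illustration. This is not the package's `BagOfWordsTransformer` implementation, and it does not apply the `max_doc_freq` or `min_doc_freq` restrictions.

```julia
# Hypothetical helper for illustration only; not part of the package.
# `docs` is a vector of already-tokenized documents (vectors of words).
function bag_of_words(docs)
    # Vocabulary: every distinct word in the corpus, sorted for a stable column order
    vocab = sort!(unique(Iterators.flatten(docs)))
    column = Dict(word => j for (j, word) in enumerate(vocab))

    # Rows are documents, columns are words, cells are raw counts
    counts = zeros(Int, length(docs), length(vocab))
    for (i, doc) in enumerate(docs), word in doc
        counts[i, column[word]] += 1
    end
    return vocab, counts
end

# Naive whitespace tokenization, assumed here for simplicity
docs = [String.(split("the cat sat on the mat")),
        String.(split("the dog sat"))]

vocab, counts = bag_of_words(docs)
```

Here `vocab` is `["cat", "dog", "mat", "on", "sat", "the"]`, and the first row of `counts` is `[1, 0, 1, 1, 1, 2]` because "the" appears twice in the first document.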
