Commit ff08326

Documentation
1 parent fde15c0 commit ff08326

1 file changed: +21 -12 lines changed

README.adoc

Lines changed: 21 additions & 12 deletions
@@ -145,6 +145,27 @@ and the last for the response variable. The above request returns the following
 }
 --------------------------------------------------
 
+=== Data conditions
+Due to algorithmic constraints, both aggregations return an empty response if
+* the search result size is less than or equal to the number of indicated explanatory variables, or
+* the values of the explanatory variables in the search result set are linearly dependent (that is,
+one column can be written as a linear combination of the other columns).
+
+
+## Algorithm
+This implementation is based on a new parallel, single-pass OLS estimation algorithm for multiple linear regression
+(not yet published). By aggregating
+over the data only once and in parallel the algorithm is ideally suited for large-scale, distributed data sets and
+in this respect surpasses the majority of existing multi-pass analytical OLS estimators or iterative optimization algorithms.
+
+The overall complexity of the implemented algorithm to estimate the regression coefficients is `O(N C² + C³)`, where
+`N` denotes the size of the training data set (the number of documents in the search result set) and `C` the number
+of the indicated explanatory variables (fields).
+
+## Examples
+...
+
+
 ## Installation
 
 ### Elasticsearch 5.x
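
The Data conditions and Algorithm sections added in the hunk above can be illustrated with a small sketch. The plugin's own algorithm is described as unpublished, so this is not that algorithm. It is the textbook normal-equations approach, which also reads each document once, merges per-shard partial results by addition, and costs `O(N C²)` for accumulation plus `O(C³)` for the final solve. The class `OlsAccumulator`, its methods, and the tolerance `tol` are hypothetical names chosen for illustration. The sketch also suggests where the two data conditions might come from: with `N <= C` documents, or with linearly dependent columns, the accumulated matrix is singular and no unique solution exists.

```python
from typing import List, Optional, Sequence


class OlsAccumulator:
    """Hypothetical helper: mergeable sufficient statistics for OLS with an intercept."""

    def __init__(self, num_vars: int) -> None:
        self.c = num_vars              # C: number of explanatory variables
        d = num_vars + 1               # +1 for the intercept column
        self.n = 0                     # N: documents seen so far
        self.xtx = [[0.0] * d for _ in range(d)]   # running X^T X
        self.xty = [0.0] * d                        # running X^T y

    def add(self, x: Sequence[float], y: float) -> None:
        """Fold one document into the statistics: O(C^2) work, single pass."""
        row = [1.0, *x]
        self.n += 1
        for i, ri in enumerate(row):
            self.xty[i] += ri * y
            for j, rj in enumerate(row):
                self.xtx[i][j] += ri * rj

    def merge(self, other: "OlsAccumulator") -> None:
        """Combine partial results (for example from another shard) by addition."""
        self.n += other.n
        for i in range(len(self.xty)):
            self.xty[i] += other.xty[i]
            for j in range(len(self.xty)):
                self.xtx[i][j] += other.xtx[i][j]

    def solve(self, tol: float = 1e-9) -> Optional[List[float]]:
        """Return [intercept, b1, ..., bC], or None when no unique solution exists."""
        if self.n <= self.c:
            return None                # too few documents
        d = self.c + 1
        # Crude relative threshold for "pivot is zero" (an assumption of this sketch).
        scale = max(abs(v) for r in self.xtx for v in r) or 1.0
        a = [self.xtx[i][:] + [self.xty[i]] for i in range(d)]   # augmented matrix
        for col in range(d):           # Gaussian elimination with partial pivoting: O(C^3)
            pivot = max(range(col, d), key=lambda r: abs(a[r][col]))
            if abs(a[pivot][col]) <= tol * scale:
                return None            # linearly dependent columns
            a[col], a[pivot] = a[pivot], a[col]
            for r in range(col + 1, d):
                f = a[r][col] / a[col][col]
                for k in range(col, d + 1):
                    a[r][k] -= f * a[col][k]
        beta = [0.0] * d
        for i in reversed(range(d)):   # back substitution
            s = sum(a[i][k] * beta[k] for k in range(i + 1, d))
            beta[i] = (a[i][d] - s) / a[i][i]
        return beta


# Usage: two "shards" of data generated from y = 1 + 2*x1 + 3*x2.
shard_a, shard_b = OlsAccumulator(2), OlsAccumulator(2)
for x, y in [((0, 0), 1), ((1, 0), 3), ((0, 1), 4)]:
    shard_a.add(x, y)
for x, y in [((1, 1), 6), ((2, 1), 8)]:
    shard_b.add(x, y)
shard_a.merge(shard_b)
coefficients = shard_a.solve()         # approximately [1.0, 2.0, 3.0]
```

Because only the sums in `xtx` and `xty` are kept, memory use is independent of `N`, and merging is order-independent, which is what makes this style of estimator convenient for distributed aggregation.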
@@ -165,18 +186,6 @@ Do not forget to restart the node after installing.
 | https://github.com/scaleborn/elasticsearch-linear-regression/releases/download/5.3.0.1/elasticsearch-linear-regression-5.3.0.1.zip[5.3.0.1] | 5.3.0 | Jun 1, 2017
 |===
 
-## Algorithm
-This implementation is based on a new parallel, single-pass OLS estimation algorithm for multiple linear regression
-(not yet published). By aggregating
-over the data only once and in parallel the algorithm is ideally suited for large-scale, distributed data sets and
-in this respect surpasses the majority of existing multi-pass analytical OLS estimators or iterative optimization algorithms.
-
-The overall complexity of the implemented algorithm to estimate the regression coefficients is `O(N C² + C³)`, where
-`N` denotes the size of the training data set (the number of documents in the search result set) and `C` the number
-of the indicated explanatory variables (fields).
-
-## Examples
-...
 
 ## License
 Copyright 2017 Scaleborn UG (haftungsbeschränkt).
