docs/Mathematical_Foundations.md
+40 -4 (40 additions & 4 deletions)
Original file line number
Diff line number
Diff line change
@@ -165,10 +165,6 @@ Further improvements to the implementation of the Theil-Sen is an active topic o
165
165
166
166
See [`/app/engine/utils/MovingWindowRegressor.js`](../app/engine/utils/MovingWindowRegressor.js)
167
167
168
-
## Open design issues
169
-
170
-
@@@@@
171
-
172
168
### Choices for the specific algorithms
173
169
174
170
#### Implementation choices in the Theil-Sen regression
@@ -198,6 +194,46 @@ Comparison across these tables shows that using the Goodness Of Fit is needed to
198
194
199
195
Finding a better approximation algorithm that ignores outlying values while maintaining the true data responsiveness is a subject for further improvement.
200
196
197
+
## Open issues, known problems and regrettable design decisions
198
+
199
+
### Using iteration instead of running sums for Theil-Sen Goodness Of Fit
200
+
201
+
Currently, both Theil-Sen regressors (i.e. `TSLinearSeries.js` and `TSQuadraticSeries.js`) iterate over the datapoints in the flank to determine the sum of squared errors (SSE) and the total sum of squares (SST) for the Goodness Of Fit calculation. An alternative approach is to use running sums. This has the major benefit that running sums are less CPU intensive to maintain and do not force a sudden large calculation. The key concern is the dragfactor calculation at the end of a recovery, which iterates over a large collection (often over 200 datapoints for a Concept2 RowErg) and thus causes a significant workload at one specific moment. Running sums that are maintained throughout the recovery should keep a lower profile, as much of the work is done in small pieces throughout the recovery phase.
Where $(x_i, y_i)$ is the i-th datapoint in the flank, and $weight_i$ its weight. $\overline{Y}$ is the weighted average of the entire flank along the y-axis. $a$ and $b$ are the coefficients in $y = a x + b$.
Where $(x_i, y_i)$ is the i-th datapoint in the flank, and $weight_i$ its weight. $\overline{Y}$ is the weighted average of the entire flank along the y-axis. $a$, $b$ and $c$ are the coefficients in $y = a x^2 + b x + c$.
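
As a minimal sketch of the two approaches for the linear case $y = a x + b$, the following compares an iterative Goodness Of Fit with one derived from running sums. The function names, the `{ x, y, weight }` datapoint shape and the fallback value of 0 for a flat flank are assumptions made for illustration; this is not the code in `TSLinearSeries.js` or `Series.js`.

```javascript
// Sketch only: helper names are hypothetical and not taken from OpenRowingMonitor.

// Iterative approach: all work happens at the moment the Goodness Of Fit is requested.
function goodnessOfFitIterative (points, a, b) {
  let sumWeight = 0
  let sumWeightedY = 0
  for (const p of points) {
    sumWeight += p.weight
    sumWeightedY += p.weight * p.y
  }
  const meanY = sumWeightedY / sumWeight // weighted average of y over the flank
  let sse = 0
  let sst = 0
  for (const p of points) {
    sse += p.weight * (p.y - (a * p.x + b)) ** 2
    sst += p.weight * (p.y - meanY) ** 2
  }
  return sst > 0 ? 1 - sse / sst : 0
}

// Running-sum approach: six sums are updated per datapoint,
// so the final calculation no longer depends on the flank length.
function createRunningSums () {
  const s = { w: 0, wx: 0, wy: 0, wxx: 0, wxy: 0, wyy: 0 }
  function update (p, sign) {
    s.w += sign * p.weight
    s.wx += sign * p.weight * p.x
    s.wy += sign * p.weight * p.y
    s.wxx += sign * p.weight * p.x * p.x
    s.wxy += sign * p.weight * p.x * p.y
    s.wyy += sign * p.weight * p.y * p.y
  }
  return {
    push: (p) => update(p, 1),
    remove: (p) => update(p, -1),
    goodnessOfFit (a, b) {
      // SST = sum(w y^2) - sum(w y)^2 / sum(w)
      const sst = s.wyy - (s.wy * s.wy) / s.w
      // SSE = sum(w (y - a x - b)^2), expanded term by term
      const sse = s.wyy - 2 * a * s.wxy - 2 * b * s.wy +
        a * a * s.wxx + 2 * a * b * s.wx + b * b * s.w
      return sst > 0 ? 1 - sse / sst : 0
    }
  }
}
```

In exact arithmetic both functions return the same value. The running-sum version spreads its cost over every `push` and `remove`, which is the CPU benefit described above.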
230
+
231
+
However, these implementations suffered from numerical instability. This showed up in relatively short sessions (a 2500 meter row on a Concept2 RowErg), where the Goodness Of Fit started to drift and the error between the iterative and running-sum results grew from $10^{-15}$ to $10^{-2}$. An error of the latter magnitude disturbs the functioning of OpenRowingMonitor. As the running-sum variant frequently produced a Goodness Of Fit above 1, we consider it very likely that it is faulty. Making the underlying `Series.js` object, which is responsible for maintaining these running sums, much more robust by forcing continuous recalculation of these running sums did not resolve this issue.
232
+
233
+
The current implementation thus relies on the iterative approach, despite the running sum being computationally much more efficient.
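
The root cause of the instability described above is not identified here. One mechanism that commonly affects running-sum formulations is cancellation: the SST is obtained by subtracting two nearly equal large numbers, so rounding errors in the sums can dominate the result. This is offered as an assumption, not as a diagnosis of `Series.js`. The sketch below uses unweighted data and hypothetical function names to show the effect on values with a large offset.

```javascript
// Sketch only: unweighted SST computed in two ways; names are hypothetical.

// Two-pass (iterative) form: subtract the mean first, then square the small deviations.
function sstTwoPass (ys) {
  const mean = ys.reduce((sum, y) => sum + y, 0) / ys.length
  return ys.reduce((sum, y) => sum + (y - mean) ** 2, 0)
}

// Running-sum form: sum(y^2) - sum(y)^2 / n.
// Both terms are large and nearly equal, so their difference loses precision.
function sstRunningSums (ys) {
  let sumY = 0
  let sumYY = 0
  for (const y of ys) {
    sumY += y
    sumYY += y * y
  }
  return sumYY - (sumY * sumY) / ys.length
}

const small = [1, 2, 3]
const offset = small.map((y) => y + 1e9) // same spread, large offset

console.log(sstTwoPass(small), sstRunningSums(small)) // both forms agree on small values
console.log(sstTwoPass(offset), sstRunningSums(offset)) // the running-sum form no longer matches
```

Because the Goodness Of Fit divides by the SST, an error of this kind in the SST can push the result outside its expected range. Recalculating the running sums from scratch does not help in such a case, since the cancellation sits in the final subtraction and not in the accumulation.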
234
+
235
+
@@@@@
236
+
201
237
## References
202
238
203
239
<a id="1">[1]</a> Anu Dudhia, "The Physics of ErgoMeters" <http://eodg.atm.ox.ac.uk/user/dudhia/rowing/physics/ergometer.html>