Commit 41c21bf
[SYSTEMDS-3822] Fix incorrect sampling in top-k cleaning pipelines
This patch fixes a bug in top-k cleaning pipeline enumeration: for
datasets with more than 200K rows, the sampling ratio was ignored and
always set to 0.6. As a result, users who wanted to sample very large
datasets actually ran with more data than expected.

1 parent b96cf25 commit 41c21bf
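To make the effect described in the commit message concrete, here is a minimal Python sketch of the arithmetic. It is not the SystemDS code. The function names, the constants, and the assumption that smaller datasets already honored the requested ratio are hypothetical; only the 200K-row threshold and the 0.6 ratio come from the message.

```python
# Hypothetical illustration only: names and structure are assumptions,
# not taken from the SystemDS source.
LARGE_DATASET_ROWS = 200_000  # threshold named in the commit message
FORCED_RATIO = 0.6            # ratio the buggy path always applied


def sampled_rows_buggy(n_rows: int, requested_ratio: float) -> int:
    """Mimic the reported bug: large inputs ignore the requested ratio."""
    ratio = FORCED_RATIO if n_rows > LARGE_DATASET_ROWS else requested_ratio
    return round(n_rows * ratio)


def sampled_rows_fixed(n_rows: int, requested_ratio: float) -> int:
    """Assumed intended behavior: always honor the requested ratio."""
    return round(n_rows * requested_ratio)


if __name__ == "__main__":
    n_rows, requested_ratio = 1_000_000, 0.1
    print("buggy:", sampled_rows_buggy(n_rows, requested_ratio))
    print("fixed:", sampled_rows_fixed(n_rows, requested_ratio))
```

Under these assumptions, a user asking for a 10% sample of a one-million-row dataset would have processed 600,000 rows instead of 100,000.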
1 file changed, +3 -4 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 62 | 62 | |
| 63 | 63 | |
| 64 | 64 | |
| 65 | | - |
| 66 | 65 | |
| 67 | | - |
| | 66 | + |
| 68 | 67 | |
| 69 | 68 | |
| 70 | 69 | |
| | | |
| 76 | 75 | |
| 77 | 76 | |
| 78 | 77 | |
| 79 | | - |
| | 78 | + |
| 80 | 79 | |
| 81 | 80 | |
| 82 | 81 | |
| | | |
| 271 | 270 | |
| 272 | 271 | |
| 273 | 272 | |
| 274 | | - |
| | 273 | + |