|
1 | 1 | # Changelog |
| 2 | +# 7.062 |
| 3 | + * hamf bugfix in apply-concat. |
| 4 | + |
2 | 5 | # 7.061 |
3 | 6 | * Upgrade to hamf to fix pmap with custom pool issue and initial cut at sparse columns. There is no serialization |
4 | 7 | yet as that requires significant changes to arrow to work for our intended use case. |
5 | | - |
| 8 | + |
6 | 9 | # 7.060 |
7 | | - * Fixes [issue 458](https://github.com/techascent/tech.ml.dataset/issues/458) - replace missing with a value |
| 10 | + * Fixes [issue 458](https://github.com/techascent/tech.ml.dataset/issues/458) - replace missing with a value |
8 | 11 | works correctly when a column contains all missing values. |
9 | | - |
| 12 | + |
10 | 13 | # 7.059 |
11 | 14 | * dtype-next upgrade to fix clone-after-filter issue. |
12 | | - |
| 15 | + |
13 | 16 | # 7.058 |
14 | 17 | * faster single column reduction when you have large columns and many missing -- avoids per-idx binary search of missing set. |
15 | | - |
| 18 | + |
16 | 19 | # 7.057 |
17 | 20 | * Slightly faster arrow compressed writes. |
18 | 21 | * column-cast no longer appends roaring bitmaps to metadata unless requested. |
19 | | - |
| 22 | + |
20 | 23 | # 7.056 |
21 | | - * Arrow support for UUID and bigdecimal types. |
22 | | - |
| 24 | + * Arrow support for UUID and bigdecimal types. |
| 25 | + |
23 | 26 | # 7.055 |
24 | 27 | * Upgrade dtype-next to [version 10.136](https://github.com/cnuernber/dtype-next/blob/master/CHANGELOG.md#10136). |
25 | | - |
| 28 | + |
26 | 29 | # 7.053 |
27 | 30 | * Column parsers are more rigorous in promoting their datatypes after clear op. |
28 | | - |
| 31 | + |
29 | 32 | # 7.052 |
30 | 33 | * Fixing update-values - found untested (on mac) pathway that failed when run in cloud. |
31 | | - |
| 34 | + |
32 | 35 | # 7.051 |
33 | 36 | * Much faster string table clone and much faster arrow write of string tables. |
34 | | - |
| 37 | + |
35 | 38 | # 7.050 |
36 | 39 | * fix bug in stringtable clone. |
37 | | - |
| 40 | + |
38 | 41 | # 7.049 |
39 | 42 | * Optimizations to string table clone, string table create and arrow serialization. |
40 | | - |
| 43 | + |
41 | 44 | # 7.047 |
42 | 45 | * hamf bugfix for update-values. |
43 | | - |
| 46 | + |
44 | 47 | # 7.046 |
45 | 48 | * dataset parsers return something that is not a dataset when the internal datasets have no columns. |
46 | | - |
| 49 | + |
47 | 50 | # 7.045 |
48 | 51 | * Bulk add-constant! method used for adding missing values. |
49 | | - |
| 52 | + |
50 | 53 | # 7.044 |
51 | 54 | * initial support for clearing dataset parsers - resets their row count but does not reset the schema. Use tech.v3.dataset.protocols/ds-clear. |
52 | | - |
53 | | -# 7.043 |
| 55 | + |
| 56 | +# 7.043 |
54 | 57 | * Legacy smile -- 2.6.0 -- support was removed. Support for later smile versions has moved to the [scicloj system](https://github.com/scicloj/scicloj.ml.smile) and operations like PCA are best implemented at this time using neanderthal. |
55 | | - |
| 58 | + |
56 | 59 | # 7.042 |
57 | 60 | * Upgrade hamf to get new api methods - lines and re-matches. |
58 | | - |
| 61 | + |
59 | 62 | # 7.041 |
60 | 63 | * Slightly faster promotional object parser. |
61 | | - |
| 64 | + |
62 | 65 | # 7.040 |
63 | 66 | * Fix for [issue 450](https://github.com/techascent/tech.ml.dataset/issues/450) - emapped columns could reduce as |
64 | | - a different type than declared in the emap declaration. |
| 67 | + a different type than declared in the emap declaration. |
65 | 68 | * Small perf improvements for unique-by. |
66 | | - |
| 69 | + |
67 | 70 | # 7.039 |
68 | 71 | * Fix error in dtype-next/native-buffer/native-buffer->byte-array |
69 | | - |
| 72 | + |
70 | 73 | # 7.038 |
71 | 74 | * Upgrade to hamf 2.020. |
72 | 75 | * Fix for [issue 447](https://github.com/techascent/tech.ml.dataset/issues/447) - filter column by keyword. |
73 | 76 |
|
74 | 77 | # 7.037 |
75 | 78 | * Nippy loading is about 2x faster in the case of large string tables. |
76 | 79 | * Arrow read pathways support :text-as-strings? to mirror :strings-as-text? on the write side so you can save out uncompressed data in the fastest-to-read format. |
77 | | - |
| 80 | + |
78 | 81 | # 7.036 |
79 | 82 | * Major optimization (>9x!) loading of arrow files when large string tables/dictionaries are used. |
80 | | - |
| 83 | + |
81 | 84 | # 7.035 |
82 | 85 | * Latest dtype-next (10.124) - contains upgrades to ham-fisted which allow pmap et al. to accept arbitrary executor services. |
83 | 86 | * Fix for [issue 438](https://github.com/techascent/tech.ml.dataset/issues/438) - keyword dataset names in tribuo. |
84 | 87 | * Fix for [issue 435](https://github.com/techascent/tech.ml.dataset/issues/435) - pd-merge's outer must accept empty datasets. |
85 | 88 | * Fix for issues 432 and 371 - select-row-type operations don't remove `:print-index-range :all` metadata. |
86 | | - |
87 | | - |
88 | | - |
| 89 | + |
| 90 | + |
| 91 | + |
89 | 92 | # 7.034 |
90 | 93 | * Reverted transit encoding of instant back to milliseconds since epoch as js api doesn't support microseconds since epoch. |
91 | 94 |
|
92 | 95 | # 7.033 |
93 | 96 | * [issue-434](https://github.com/techascent/tech.ml.dataset/issues/413) - bad transit encoding - packed instants are microseconds since epoch and have been for a while - not milliseconds since epoch. |
94 | | - |
| 97 | + |
95 | 98 | # 7.031 |
96 | 99 | * [issue-413](https://github.com/techascent/tech.ml.dataset/issues/413) - reduce with packed columns. |
97 | 100 | * [issue-414](https://github.com/techascent/tech.ml.dataset/issues/414) - categorical maps are now integers. |
|