Commit 332dff7

Release 7.062
1 parent 7e10713 commit 332dff7

36 files changed: +71 -68 lines changed

CHANGELOG.md

Lines changed: 34 additions & 31 deletions
@@ -1,97 +1,100 @@
 # Changelog
+# 7.062
+* hamf bugfix in apply-concat.
+
 # 7.061
 * Upgrade to hamf to fix pmap with custom pool issue and initial cut at sparse columns. There is no serialization
 yet as that requires significant changes to arrow to work for our intended use case.
-
+
 # 7.060
-* Fixes [issue 458](https://github.com/techascent/tech.ml.dataset/issues/458) - replace missing with a value
+* Fixes [issue 458](https://github.com/techascent/tech.ml.dataset/issues/458) - replace missing with a value
 works correctly when a column contains all missing values.
-
+
 # 7.059
 * dtype-next upgrade to fix clone-after-filter issue.
-
+
 # 7.058
 * faster single column reduction when you have large columns and many missing -- avoids per-idx binary search of missing set.
-
+
 # 7.057
 * Slightly faster arrow compressed writies.
 * column-cast no longer appends roaring bitmaps to metadata unless requested.
-
+
 # 7.056
-* Arrow support for UUID and bigdecimal types.
-
+* Arrow support for UUID and bigdecimal types.
+
 # 7.055
 * Upgrade dtype-next to [version 10.136](https://github.com/cnuernber/dtype-next/blob/master/CHANGELOG.md#10136).
-
+
 # 7.053
 * Column parsers are more rigorous in promoting their datatypes after clear op.
-
+
 # 7.052
 * Fixing update-values - found untested (on mac) pathway that failed when run in cloud.
-
+
 # 7.051
 * Much faster string table clone and much faster arrow write of string tables.
-
+
 # 7.050
 * fix bug in stringtable clone.
-
+
 # 7.049
 * Optimizations to string table clone, string table create and arrow serialization.
-
+
 # 7.047
 * hamf bugfix for update-values.
-
+
 # 7.046
 * dataset parsers return something that is not a dataset when the internal datasets have no columns.
-
+
 # 7.045
 * Bulk add-constant! method used for adding missing values.
-
+
 # 7.044
 * initial support for clearing dataset parsers - resets their row count but does not reset the schema. Use tech.v3.dataset.protocols/ds-clear.
-
-# 7.043
+
+# 7.043
 * Legacy smile -- 2.6.0 -- support was removed. Support for later smile versions has moved to the [scicloj system](https://github.com/scicloj/scicloj.ml.smile) and operations like PCA are best implemented at this time using neanderthal.
-
+
 # 7.042
 * Upgrade hamf to get new api methods - lines and re-matches.
-
+
 # 7.041
 * Slightly faster promotional object parser.
-
+
 # 7.040
 * Fix for [issue 450](https://github.com/techascent/tech.ml.dataset/issues/450) - emapped columns could reduce as
-a different type than declared in the emap declaration.
+a different type than declared in the emap declaration.
 * Small perf improvements for unique-by.
-
+
 # 7.039
 * Fix error in dtype-next/native-buffer/native-buffer->byte-array
-
+
 # 7.038
 * Upgrade to hamf 2.020.
 * Fix for [issue 447](https://github.com/techascent/tech.ml.dataset/issues/447) - filter column by keyword.
 
 # 7.037
 * Nippy loading is about 2x faster in the case of large string tables.
 * Arrow read pathways support :text-as-strings? to mirror :strings-as-text? on the write side so you can save out uncompressed data in the fastest-to-read format.
-
+
 # 7.036
 * Major optimization (>9x!) loading of arrow files when large string tables/dictionaries are used.
-
+
 # 7.035
 * Latest dtype-next (10.124) - contains upgrades to ham-fisted which allow pmap et al. to accept arbitrary executor services.
 * Fix for [issue 438](https://github.com/techascent/tech.ml.dataset/issues/438) - keyword dataset names in tribuo.
 * Fix for [issue 435](https://github.com/techascent/tech.ml.dataset/issues/435) - pd-merge's outer must accept empty datasets.
 * Fix for issues 432 and 371 - select-row-type operations don't remove `:print-index-range :all` metadata.
-
-
-
+
+
+
 # 7.034
 * Reverted transit encoding of instant back to milliseconds since epoch as js api doesn't support microseconds since epoch.
 
 # 7.033
 * [issue-434](https://github.com/techascent/tech.ml.dataset/issues/413) - bad transit encoding - packed instants are microseconds since epoch and have been for a while - not milliseconds since epoch.
-
+
 # 7.031
 * [issue-413](https://github.com/techascent/tech.ml.dataset/issues/413) - reduce with packed columns.
 * [issue-414](https://github.com/techascent/tech.ml.dataset/issues/414) - categorical maps are now integers.

deps.edn

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 {:paths ["src" "resources" "target/classes"]
 :deps {;;org.clojure/clojure {:mvn/version "1.11.1"}
-cnuernber/dtype-next {:mvn/version "10.142"}
+cnuernber/dtype-next {:mvn/version "10.143"}
 techascent/tech.io {:mvn/version "4.31"
 :exclusions [org.apache.commons/commons-compress]}
 org.apache.datasketches/datasketches-java {:mvn/version "4.2.0"}
@@ -14,7 +14,7 @@
 :exec-fn codox.main/-main
 :exec-args {:group-id "techascent"
 :artifact-id "tech.ml.dataset"
-:version "7.061"
+:version "7.062"
 :name "TMD"
 :description "A Clojure high performance data processing system"
 :metadata {:doc/format :markdown}

docs/000-getting-started.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

docs/100-walkthrough.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

docs/200-quick-reference.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

docs/columns-readers-and-datatypes.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

docs/index.html

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

docs/nippy-serialization-rocks.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

docs/supported-datatypes.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

docs/tech.v3.dataset.categorical.html

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.
