Commit 036d1fc

nicodv, tovbinm, wsuchy, and Jauntbox authored
0.7.0 release (#481)
* Revert "Revert back to Spark 2.3 (#399)" This reverts commit 95a77b1.
* Update to Spark 2.4.3 and XGBoost 0.90
* special double serializer fix
* fix serialization
* fix serialization
* docs
* fixed missng value for test
* meta fix
* Updated DecisionTreeNumericMapBucketizer test to deal with the change made to decision tree pruning in Spark 2.4. If nodes are split, but both child nodes lead to the same prediction then the split is pruned away. This updates the test so this doesn't happen for feature 'b'
* fix params meta test
* FIxed failing xgboost test
* ident
* cleanup
* added dataframe reader and writer extensions
* added const
* cherrypick fixes
* added xgboost params + update models to use public predict method
* blarg
* double ser test
* update mleap and spark testing base
* Update README.md
* type fix
* bump minor version
* Update Spark version in the README
* bump version
* Update build.gradle
* Update pom.xml
* set correct json4s version
* upgrade helloworld deps
* upgrade notebook deps on TMog and Spark
* bump to version 0.7.0 for Spark update
* align helloworld dependencies
* align helloworld dependencies
* get -> getOrElse with exception
* fix helloworld compilation
* style
* WIP release notes
* TMog version bump
* update release notes
* update release notes
* updates to changelog
* updates to changelog
* updates to changelog
* updates to changelog
* updates to changelog
* updates to changelog
* fix changelog
* fix changelog
* keep helloworld on 0.6.1 until release

Co-authored-by: Matthew Tovbin <[email protected]>
Co-authored-by: Matthew Tovbin <[email protected]>
Co-authored-by: Christopher Suchanek <[email protected]>
Co-authored-by: Kevin Moore <[email protected]>
Co-authored-by: Matthew Tovbin <[email protected]>
1 parent e48831a commit 036d1fc

5 files changed: 46 additions & 5 deletions

CHANGELOG.md

Lines changed: 42 additions & 1 deletion
@@ -1,5 +1,46 @@
 # Changelog
 
+## 0.7.0
+
+Bug fixes:
+- Fix flaky `ModelInsight` tests [#407](https://github.com/salesforce/TransmogrifAI/pull/407)
+- Remove logging of tokens of text fields [#420](https://github.com/salesforce/TransmogrifAI/pull/420), [#438](https://github.com/salesforce/TransmogrifAI/pull/438), [#447](https://github.com/salesforce/TransmogrifAI/pull/447), [#474](https://github.com/salesforce/TransmogrifAI/pull/474)
+- Add validation prepare call before model selection when no DAG is passed [#424](https://github.com/salesforce/TransmogrifAI/pull/424), [#429](https://github.com/salesforce/TransmogrifAI/pull/429)
+- Fix `Days.daysBetween` int overflow [#471](https://github.com/salesforce/TransmogrifAI/pull/471)
+
+New features / updates:
+- Downsample the number of training samples to `maxTrainingSample` for regression [#413](https://github.com/salesforce/TransmogrifAI/pull/413) and multi-class classification [#414](https://github.com/salesforce/TransmogrifAI/pull/414)
+- Refactor `InsightLOCOTest` [#412](https://github.com/salesforce/TransmogrifAI/pull/412)
+- Enable more loss types for `OpLinearRegression` [#421](https://github.com/salesforce/TransmogrifAI/pull/421)
+- Add property-based tests for regression model selection [#427](https://github.com/salesforce/TransmogrifAI/pull/427)
+- Add option to calculate LOCO for dates/texts by leaving out their entire vector [#418](https://github.com/salesforce/TransmogrifAI/pull/418)
+- Add Chinese and Korean examples to `TextTokenizerTest` [#442](https://github.com/salesforce/TransmogrifAI/pull/442)
+- Add support for ignoring text that looks like IDs in `SmartTextVectorizer` [#448](https://github.com/salesforce/TransmogrifAI/pull/448), [#455](https://github.com/salesforce/TransmogrifAI/pull/455)
+- Add a unary estimator for detecting names in text fields and transforming to likely gender [#445](https://github.com/salesforce/TransmogrifAI/pull/445)
+- Allow result features to be removed by raw feature filter [#458](https://github.com/salesforce/TransmogrifAI/pull/458)
+- Metadata changes for sensitive feature information [#457](https://github.com/salesforce/TransmogrifAI/pull/457)
+- Add `MinVarianceFilter` which checks that computed features have a minimum variance [#463](https://github.com/salesforce/TransmogrifAI/pull/463), [#465](https://github.com/salesforce/TransmogrifAI/pull/465)
+- Allow `TextStats` length distribution to be token-based and refactor for testability [#464](https://github.com/salesforce/TransmogrifAI/pull/464)
+- Use Spark job grouping to distinguish steps of the machine learning flow [#467](https://github.com/salesforce/TransmogrifAI/pull/467), [#468](https://github.com/salesforce/TransmogrifAI/pull/468), [#470](https://github.com/salesforce/TransmogrifAI/pull/470)
+- Add categorical detection to be coverage based in addition to unique count based [#473](https://github.com/salesforce/TransmogrifAI/pull/473)
+- Remove duplicate features using sanity checker feature to feature correlations [#476](https://github.com/salesforce/TransmogrifAI/pull/476), [#479](https://github.com/salesforce/TransmogrifAI/pull/479)
+- Lift the upper bound on number of hash features [#477](https://github.com/salesforce/TransmogrifAI/pull/477)
+- Enable Html stripping on text-like features [#478](https://github.com/salesforce/TransmogrifAI/pull/478)
+
+Dependency updates ([#402](https://github.com/salesforce/TransmogrifAI/pull/402), [#466](https://github.com/salesforce/TransmogrifAI/pull/466)):
+- Update Apache Spark version to 2.4.5
+- Avro is a built-in data source in Spark 2.4, so no longer using the spark-avro package
+- Avro to 1.8.2
+- XGBoost to 0.90
+- MLeap to 0.14.0
+- json4s to 3.5.3
+- JUnit to 4.12
+- chill to 0.9.3
+- gradle-avro-plugin to 0.16.0
+
+Miscellaneous:
+- Add ROADMAP.md [#394](https://github.com/salesforce/TransmogrifAI/pull/394)
+
 ## 0.6.1
 
 Bug fixes:
@@ -19,7 +60,7 @@ New features / updates:
 - Use compact and compressed model json by default [#375](https://github.com/salesforce/TransmogrifAI/pull/375)
 - Descale feature contribution for Linear Regression & Logistic Regression [#345](https://github.com/salesforce/TransmogrifAI/pull/345)
 
-Dependency updates:
+Dependency updates:
 - Update tika version [#382](https://github.com/salesforce/TransmogrifAI/pull/382)
 
 ## 0.6.0

gradle.properties

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-version=0.7.0-SNAPSHOT
+version=0.7.0
 group=com.salesforce.transmogrifai
 org.gradle.caching=true

helloworld/notebooks/OpHousingPrices.ipynb

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"%classpath add mvn com.salesforce.transmogrifai transmogrifai-core_2.11 0.7.0"
+"%classpath add mvn com.salesforce.transmogrifai transmogrifai-core_2.11 0.6.1"
 ]
 },
 {

helloworld/notebooks/OpIris.ipynb

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"%classpath add mvn com.salesforce.transmogrifai transmogrifai-core_2.11 0.7.0"
+"%classpath add mvn com.salesforce.transmogrifai transmogrifai-core_2.11 0.6.1"
 ]
 },
 {

helloworld/notebooks/OpTitanicSimple.ipynb

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"%classpath add mvn com.salesforce.transmogrifai transmogrifai-core_2.11 0.7.0"
+"%classpath add mvn com.salesforce.transmogrifai transmogrifai-core_2.11 0.6.1"
 ]
 },
 {
