Commit 24d9fba

Author: Rahul Iyer
Documentation: Fix various inconsistencies in documentation
Pivotal Tracker: 58478260

Additional authors:
- Hai Qian <hqian@gopivotal.com>
- Shengwen Yang <syang@gopivotal.com>
- Xixuan Feng <xfeng@gopivotal.com>

Changes:
- Complete the Release notes
- Gppkg version number set to 1.8
- Fix various documentation errors in multiple modules
- Changed incorrect function declaration in margins_mlogregr
1 parent 5480d2f commit 24d9fba

15 files changed, 606 additions and 347 deletions

ReleaseNotes.txt

Lines changed: 64 additions & 0 deletions
@@ -8,6 +8,70 @@ A complete list of changes for each release can be obtained by viewing the git
 commit history located at https://github.com/madlib/madlib/commits/master.
 
 Current list of bugs and issues can be found at http://jira.madlib.net.
+--------------------------------------------------------------------------------
+MADlib v1.4
+
+Release Date: 2013-Nov-25
+
+New Features:
+* Improved interface for Multinomial logistic regression:
+    - Added a new interface that accepts an 'output_table' parameter and
+      stores the model details in the output table instead of returning as a struct
+      data type. The updated function also builds a summary table that includes
+      all parameters and meta-parameters used during model training.
+    - The output table has been reformatted to present the model coefficients
+      and related metrics for each category in a separate row. This replaces the
+      old output format of model stats for all categories combined in a
+      single array.
+* Variance Estimators
+    - Added Robust Variance estimator for Cox PH models (Lin and Wei, 1989).
+      It is useful in calculating variances in a dataset with potentially
+      noisy outliers. Namely, the standard errors are asymptotically normal even
+      if the model is wrong due to outliers.
+    - Added Clustered Variance estimator for Cox PH models. It is used
+      when data contains extra clustering information besides covariates and
+      are asymptotically normal estimates.
+* NULL Handling:
+    - Modified behavior of regression modules to 'omit' rows containing NULL
+      values for any of the dependent and independent variables. The number of
+      rows skipped is provided as part of the output table.
+      This release includes NULL handling for following modules:
+        - Linear, Logistic, and Multinomial logistic regression, as well as
+          Cox Proportional Hazards
+        - Huber-White sandwich estimators for linear, logistic, and multinomial
+          logistic regression as well as Cox Proportional Hazards
+        - Clustered variance estimators for linear, logistic, and multinomial
+          logistic regression as well as Cox Proportional Hazards
+        - Marginal effects for logistic and multinomial logistic regression
+
+Deprecated functions:
+    - Multinomial logistic regression function has been renamed to
+      'mlogregr_train'. Old function ('mlogregr') has been deprecated,
+      and will be removed in the next major version update.
+
+    - For all multinomial regression estimator functions (list given below),
+      changes in the argument list were made to collate all optimizer specific
+      arguments in a single string. An example of the new optimizer parameter is
+      'max_iter=20, optimizer=irls, precision=0.0001'.
+      This is in contrast to the original argument list that contained 3 arguments:
+      'max_iter', 'optimizer', and 'precision'. This change allows adding new
+      optimizer-specific parameters without changing the argument list.
+      Affected functions:
+        - robust_variance_mlogregr
+        - clustered_variance_mlogregr
+        - margins_mlogregr
+
+Bug Fixes:
+    - Fixed an overflow problem in LDA by using INT64 instead of INT32.
+    - Fixed integer to boolean cast bug in clustered variance for logistic
+      regression. After this fix, integer columns are accepted for binary
+      dependent variable using the 'integer to bool' cast rules.
+    - Fixed two bugs in SVD:
+        - The 'example' option for online help has been fixed
+        - Column names for sparse input tables in the 'svd_sparse' and
+          'svd_sparse_native' functions are no longer restricted to 'row_id',
+          'col_id' and 'value'.
+
 --------------------------------------------------------------------------------
 MADlib v1.3
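
Note on the optimizer parameter string described in the release notes above: the example value 'max_iter=20, optimizer=irls, precision=0.0001' is a comma-separated list of name=value pairs. The sketch below shows one way such a string could be split into a dictionary. The function name `parse_optimizer_params` and its error handling are hypothetical and written only for illustration; they are not taken from the MADlib sources, whose actual parsing may differ.

```python
def parse_optimizer_params(params: str) -> dict:
    """Split a 'name=value, name=value' string into a dict of strings.

    Hypothetical helper for illustration only; not MADlib's parser.
    """
    result = {}
    for item in params.split(","):
        item = item.strip()
        if not item:
            # Tolerate empty segments such as a trailing comma.
            continue
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            raise ValueError(f"expected 'name=value', got {item!r}")
        result[key.lower()] = value
    return result


# The example string quoted in the release notes.
parsed = parse_optimizer_params("max_iter=20, optimizer=irls, precision=0.0001")
print(parsed)
# {'max_iter': '20', 'optimizer': 'irls', 'precision': '0.0001'}
```

Keeping the values as strings until they are needed means a new optimizer-specific parameter can be passed without changing the function's argument list, which is the motivation the release notes give for the change.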

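Note on the NULL handling entry in the release notes above: the 'omit' behavior drops any row that has a NULL in the dependent or independent variables and reports how many rows were skipped. The sketch below illustrates that idea on in-memory tuples, with Python's `None` standing in for SQL NULL. The helper name `omit_null_rows` and the sample data are hypothetical; the actual modules operate on database tables, not Python lists.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[float]]


def omit_null_rows(rows: Sequence[Row]) -> Tuple[List[Row], int]:
    """Return the rows without any None value, plus the count of skipped rows.

    Hypothetical helper for illustration only; not MADlib code.
    """
    # 'is not None' keeps legitimate zero values such as 0.0.
    kept = [row for row in rows if all(value is not None for value in row)]
    return kept, len(rows) - len(kept)


# Made-up sample: (dependent variable, independent variable 1, independent variable 2)
rows = [
    (1.0, 2.0, 0.5),
    (None, 1.5, 0.2),  # NULL dependent variable: skipped
    (0.0, None, 0.7),  # NULL independent variable: skipped
    (1.0, 3.0, 0.1),
]
kept, num_rows_skipped = omit_null_rows(rows)
print(len(kept), num_rows_skipped)
# 2 2
```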
deploy/gppkg/CMakeLists.txt

Lines changed: 1 addition & 2 deletions
@@ -2,8 +2,7 @@
 # Packaging for Greenplum's gppkg
 # ------------------------------------------------------------------------------
 
-# set(MADLIB_GPPKG_VERSION "ossv1.4_pv1.7.2_gpdb4.2")
-set(MADLIB_GPPKG_VERSION "1.7.2")
+set(MADLIB_GPPKG_VERSION "1.8")
 set(MADLIB_GPPKG_RELEASE_NUMBER 1)
 set(MADLIB_GPPKG_RPM_SOURCE_DIR
     "${CMAKE_BINARY_DIR}/_CPack_Packages/Linux/RPM/${CPACK_PACKAGE_FILE_NAME}"

src/ports/postgres/modules/regress/clustered_variance.py_in

Lines changed: 64 additions & 64 deletions
Large diffs are not rendered by default.
