Commit 2943d50

Author: AWS
AWS Clean Rooms Service Update: AWS Clean Rooms now supports privacy-enhancing synthetic dataset generation for custom ML training.
1 parent d4c3417 commit 2943d50

2 files changed: +157 -2 lines changed

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+{
+  "type": "feature",
+  "category": "AWS Clean Rooms Service",
+  "contributor": "",
+  "description": "AWS Clean Rooms now supports privacy-enhancing synthetic dataset generation for custom ML training."
+}

services/cleanrooms/src/main/resources/codegen-resources/service-2.json

Lines changed: 151 additions & 2 deletions
@@ -2328,6 +2328,10 @@
         "errorMessageConfiguration":{
           "shape":"ErrorMessageConfiguration",
           "documentation":"<p>The configuration that specifies the level of detail in error messages returned by analyses using this template. When set to <code>DETAILED</code>, error messages include more information to help troubleshoot issues with PySpark jobs. Detailed error messages may expose underlying data, including sensitive information. Recommended for faster troubleshooting in development and testing environments.</p>"
+        },
+        "syntheticDataParameters":{
+          "shape":"SyntheticDataParameters",
+          "documentation":"<p>The parameters used to generate synthetic data for this analysis template.</p>"
         }
       },
       "documentation":"<p>The analysis template.</p>"
@@ -2463,6 +2467,10 @@
         "description":{
           "shape":"ResourceDescription",
           "documentation":"<p>The description of the analysis template.</p>"
+        },
+        "isSyntheticData":{
+          "shape":"Boolean",
+          "documentation":"<p>Indicates if this analysis template summary generated synthetic data.</p>"
         }
       },
       "documentation":"<p>The metadata of the analysis template.</p>"
@@ -2473,7 +2481,7 @@
     },
     "AnalysisTemplateText":{
       "type":"string",
-      "max":90000,
+      "max":500000,
       "min":0,
       "sensitive":true
     },
@@ -3136,6 +3144,10 @@
         "errorMessageConfiguration":{
           "shape":"ErrorMessageConfiguration",
           "documentation":"<p>The configuration that specifies the level of detail in error messages returned by analyses using this template. When set to <code>DETAILED</code>, error messages include more information to help troubleshoot issues with PySpark jobs. Detailed error messages may expose underlying data, including sensitive information. Recommended for faster troubleshooting in development and testing environments.</p>"
+        },
+        "syntheticDataParameters":{
+          "shape":"SyntheticDataParameters",
+          "documentation":"<p>The synthetic data generation parameters configured for this collaboration analysis template.</p>"
         }
       },
       "documentation":"<p>The analysis template within a collaboration.</p>"
@@ -3194,6 +3206,10 @@
         "description":{
           "shape":"ResourceDescription",
           "documentation":"<p>The description of the analysis template.</p>"
+        },
+        "isSyntheticData":{
+          "shape":"Boolean",
+          "documentation":"<p>Indicates if this collaboration analysis template uses synthetic data generation.</p>"
         }
       },
       "documentation":"<p>The metadata of the analysis template within a collaboration.</p>"
@@ -3832,10 +3848,27 @@
       },
       "documentation":"<p>A column within a schema relation, derived from the underlying table.</p>"
     },
+    "ColumnClassificationDetails":{
+      "type":"structure",
+      "required":["columnMapping"],
+      "members":{
+        "columnMapping":{
+          "shape":"ColumnMappingList",
+          "documentation":"<p>A mapping that defines the classification of data columns for synthetic data generation and specifies how each column should be handled during the privacy-preserving data synthesis process.</p>"
+        }
+      },
+      "documentation":"<p>Contains classification information for data columns, including mappings that specify how columns should be handled during synthetic data generation and privacy analysis.</p>"
+    },
     "ColumnList":{
       "type":"list",
       "member":{"shape":"Column"}
     },
+    "ColumnMappingList":{
+      "type":"list",
+      "member":{"shape":"SyntheticDataColumnProperties"},
+      "max":1000,
+      "min":5
+    },
     "ColumnName":{
       "type":"string",
       "max":128,
@@ -4774,6 +4807,10 @@
         "errorMessageConfiguration":{
           "shape":"ErrorMessageConfiguration",
           "documentation":"<p>The configuration that specifies the level of detail in error messages returned by analyses using this template. When set to <code>DETAILED</code>, error messages include more information to help troubleshoot issues with PySpark jobs. Detailed error messages may expose underlying data, including sensitive information. Recommended for faster troubleshooting in development and testing environments.</p>"
+        },
+        "syntheticDataParameters":{
+          "shape":"SyntheticDataParameters",
+          "documentation":"<p>The parameters for generating synthetic data when running the analysis template.</p>"
         }
       }
     },
@@ -7937,10 +7974,49 @@
         "modelInference":{
           "shape":"ModelInferencePaymentConfig",
           "documentation":"<p>The payment responsibilities accepted by the member for model inference.</p>"
+        },
+        "syntheticDataGeneration":{
+          "shape":"SyntheticDataGenerationPaymentConfig",
+          "documentation":"<p>The payment configuration for machine learning synthetic data generation.</p>"
         }
       },
       "documentation":"<p>An object representing the collaboration member's machine learning payment responsibilities set by the collaboration creator.</p>"
     },
+    "MLSyntheticDataParameters":{
+      "type":"structure",
+      "required":[
+        "epsilon",
+        "maxMembershipInferenceAttackScore",
+        "columnClassification"
+      ],
+      "members":{
+        "epsilon":{
+          "shape":"MLSyntheticDataParametersEpsilonDouble",
+          "documentation":"<p>The epsilon value for differential privacy when generating synthetic data. Lower values provide stronger privacy guarantees but may reduce data utility.</p>"
+        },
+        "maxMembershipInferenceAttackScore":{
+          "shape":"MaxMembershipInferenceAttackScore",
+          "documentation":"<p>The maximum acceptable score for membership inference attack vulnerability. Synthetic data generation fails if the score for the resulting data exceeds this threshold.</p>"
+        },
+        "columnClassification":{
+          "shape":"ColumnClassificationDetails",
+          "documentation":"<p>Classification details for data columns that specify how each column should be treated during synthetic data generation.</p>"
+        }
+      },
+      "documentation":"<p>Parameters that control the generation of synthetic data for machine learning, including privacy settings and column classification details.</p>"
+    },
+    "MLSyntheticDataParametersEpsilonDouble":{
+      "type":"double",
+      "box":true,
+      "max":10,
+      "min":0.0001
+    },
+    "MaxMembershipInferenceAttackScore":{
+      "type":"double",
+      "box":true,
+      "max":1,
+      "min":0.5
+    },
     "MaxResults":{
       "type":"integer",
       "box":true,
@@ -7984,7 +8060,6 @@
     "MemberList":{
       "type":"list",
       "member":{"shape":"MemberSpecification"},
-      "max":9,
       "min":0
     },
     "MemberSpecification":{
@@ -8215,6 +8290,10 @@
         "modelInference":{
           "shape":"MembershipModelInferencePaymentConfig",
           "documentation":"<p>The payment responsibilities accepted by the member for model inference.</p>"
+        },
+        "syntheticDataGeneration":{
+          "shape":"MembershipSyntheticDataGenerationPaymentConfig",
+          "documentation":"<p>The payment configuration for synthetic data generation for this machine learning membership.</p>"
         }
       },
       "documentation":"<p>An object representing the collaboration member's machine learning payment responsibilities set by the collaboration creator.</p>"
@@ -8414,6 +8493,17 @@
       "type":"list",
       "member":{"shape":"MembershipSummary"}
     },
+    "MembershipSyntheticDataGenerationPaymentConfig":{
+      "type":"structure",
+      "required":["isResponsible"],
+      "members":{
+        "isResponsible":{
+          "shape":"Boolean",
+          "documentation":"<p>Indicates if this membership is responsible for paying for synthetic data generation.</p>"
+        }
+      },
+      "documentation":"<p>Configuration for payment for synthetic data generation in a membership.</p>"
+    },
     "ModelInferencePaymentConfig":{
       "type":"structure",
       "required":["isResponsible"],
@@ -9054,6 +9144,7 @@
     },
     "ProtectedJobParameters":{
       "type":"structure",
+      "required":["analysisTemplateArn"],
       "members":{
         "analysisTemplateArn":{
           "shape":"AnalysisTemplateArn",
@@ -10396,6 +10487,64 @@
         "mx-central-1"
       ]
     },
+    "SyntheticDataColumnName":{
+      "type":"string",
+      "max":128,
+      "min":0,
+      "pattern":"[a-z0-9_](([a-z0-9_]+-)*([a-z0-9_]+))?"
+    },
+    "SyntheticDataColumnProperties":{
+      "type":"structure",
+      "required":[
+        "columnName",
+        "columnType",
+        "isPredictiveValue"
+      ],
+      "members":{
+        "columnName":{
+          "shape":"SyntheticDataColumnName",
+          "documentation":"<p>The name of the data column as it appears in the dataset.</p>"
+        },
+        "columnType":{
+          "shape":"SyntheticDataColumnType",
+          "documentation":"<p>The data type of the column, which determines how the synthetic data generation algorithm processes and synthesizes values for this column.</p>"
+        },
+        "isPredictiveValue":{
+          "shape":"Boolean",
+          "documentation":"<p>Indicates if this column contains predictive values that should be treated as target variables in machine learning models. This affects how the synthetic data generation preserves statistical relationships.</p>"
+        }
+      },
+      "documentation":"<p>Properties that define how a specific data column should be handled during synthetic data generation, including its name, type, and role in predictive modeling.</p>"
+    },
+    "SyntheticDataColumnType":{
+      "type":"string",
+      "enum":[
+        "CATEGORICAL",
+        "NUMERICAL"
+      ]
+    },
+    "SyntheticDataGenerationPaymentConfig":{
+      "type":"structure",
+      "required":["isResponsible"],
+      "members":{
+        "isResponsible":{
+          "shape":"Boolean",
+          "documentation":"<p>Indicates who is responsible for paying for synthetic data generation.</p>"
+        }
+      },
+      "documentation":"<p>Payment configuration for synthetic data generation.</p>"
+    },
+    "SyntheticDataParameters":{
+      "type":"structure",
+      "members":{
+        "mlSyntheticDataParameters":{
+          "shape":"MLSyntheticDataParameters",
+          "documentation":"<p>The machine learning-specific parameters for synthetic data generation.</p>"
+        }
+      },
+      "documentation":"<p>The parameters that control how synthetic data is generated, including privacy settings, column classifications, and other configuration options that affect the data synthesis process.</p>",
+      "union":true
+    },
     "TableAlias":{
       "type":"string",
       "max":128,

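Note on the new shape constraints: the `MLSyntheticDataParameters` shape added above bounds `epsilon` to [0.0001, 10], bounds `maxMembershipInferenceAttackScore` to [0.5, 1], and requires a `columnMapping` of 5 to 1000 entries. The following is a minimal client-side sketch of those checks in plain Python. It is not part of this commit or of any SDK: the helper names and the `EXAMPLE` payload are hypothetical, and it assumes the `SyntheticDataColumnName` pattern has to match the whole column name.

```python
import re

# Constraints copied from the shapes added in this commit's service-2.json diff.
EPSILON_RANGE = (0.0001, 10.0)               # MLSyntheticDataParametersEpsilonDouble
MIA_SCORE_RANGE = (0.5, 1.0)                 # MaxMembershipInferenceAttackScore
COLUMN_MAPPING_SIZE = (5, 1000)              # ColumnMappingList
COLUMN_NAME_MAX_LENGTH = 128                 # SyntheticDataColumnName
COLUMN_NAME_PATTERN = re.compile(r"[a-z0-9_](([a-z0-9_]+-)*([a-z0-9_]+))?")
COLUMN_TYPES = {"CATEGORICAL", "NUMERICAL"}  # SyntheticDataColumnType


def _in_range(value, bounds):
    """True when value is a real number (not a bool) inside the inclusive bounds."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and bounds[0] <= value <= bounds[1]


def validate_ml_synthetic_data_parameters(params):
    """Hypothetical helper: return a list of violations, empty if none are found."""
    errors = []
    if not _in_range(params.get("epsilon"), EPSILON_RANGE):
        errors.append("epsilon must be a number in [0.0001, 10]")
    if not _in_range(params.get("maxMembershipInferenceAttackScore"), MIA_SCORE_RANGE):
        errors.append("maxMembershipInferenceAttackScore must be a number in [0.5, 1]")

    mapping = (params.get("columnClassification") or {}).get("columnMapping")
    if not isinstance(mapping, list):
        errors.append("columnClassification.columnMapping is required")
        return errors
    if not COLUMN_MAPPING_SIZE[0] <= len(mapping) <= COLUMN_MAPPING_SIZE[1]:
        errors.append("columnMapping must contain between 5 and 1000 entries")

    for index, column in enumerate(mapping):
        name = column.get("columnName")
        # Assumption: the pattern must match the whole name, hence fullmatch.
        if (not isinstance(name, str) or len(name) > COLUMN_NAME_MAX_LENGTH
                or COLUMN_NAME_PATTERN.fullmatch(name) is None):
            errors.append(f"columnMapping[{index}].columnName is invalid")
        if column.get("columnType") not in COLUMN_TYPES:
            errors.append(f"columnMapping[{index}].columnType must be CATEGORICAL or NUMERICAL")
        if not isinstance(column.get("isPredictiveValue"), bool):
            errors.append(f"columnMapping[{index}].isPredictiveValue must be a boolean")
    return errors


# Illustrative payload only; the column names and values are made up.
EXAMPLE = {
    "epsilon": 1.0,
    "maxMembershipInferenceAttackScore": 0.6,
    "columnClassification": {
        "columnMapping": [
            {"columnName": name, "columnType": "NUMERICAL", "isPredictiveValue": name == "label"}
            for name in ("age", "income", "tenure_months", "visits", "label")
        ]
    },
}
```

Because `SyntheticDataParameters` is declared with `"union":true` and has a single member, a payload such as `EXAMPLE` would presumably be wrapped as `{"mlSyntheticDataParameters": EXAMPLE}` when set on an analysis template. The service remains the authority on validation, so a check like this is only useful for failing fast before a request is sent.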