This repository was archived by the owner on Jul 16, 2024. It is now read-only.

Commit fdef021

vgkowski, alexvt-amz, lmouhib, and Automation authored
feat: dataplatform notebooks (#251)
* initial version of emr on eks construct
* added correct dependency for js-yaml
* upgrading projen and changing dependencies specification in .projenrc.js
* refactoring emr on eks
* refactoring emr on eks
* refactor EKS cluster creation
* updated constructs (not finished)
* changes to make original version jsii-compliant
* fix to make cdk deploy compile successfully
* cdk deploys successfully
* stack is in deployable state: cluster autoscaler needs fixing
* cluster autoscaler is fixed
* comments & cleanup
* first draft for managed endpoint based on lambda custom resource, not functional at the moment
* changed client name variable to avoid scope confusion (raised by Lotfi)
* Initial commit of dataplatform notebook
* Commit the dataplatform-notebook constructor and docs
* Removing the integ.dataplatform.ts
* Included more comments on doc
* Updated index.ts and dataplatform-notebook.ts to fix a build issue
* Fixed the dependency on the json in dataplatform-notebook.ts and .projenrc.js, the construct now deploys outside of the project
* Parametrized the EMR Studio Service role, now accepts user provided ARN
* Fixed an issue with addServiceRoleInlinePolicy function which caused it to fail
* fix: managed endpoint now works, todo: sort out generation of ssl certificate
* fix: add timeout, format changes
* added security groups needed for emrstudio
* fix: support multiple managed endpoints
* Update with EKS on EMR construct and update in dataplatform-notebook.ts construct to allow mapping of users and managedEndpoint
* added executionRoleArn as props
* fix: nodegroup taints, added configurationOverride
* Updated the code to manage the TBAC for EMR Studio workspaces
* Added the creation of a custom lambda IAM role for tagging EMR notebooks (Studio Workspaces) and created loggroups for this lambda
* Cleaned the construct code from warnings
* Add changes to support passing execution role to managedEndpoint creation in dataplatform-notebook.ts
* Update the construct to take managedendpoint execution role, implement encryption for cloudwatch and S3 bucket
* Updated the README.md and changed the way studioname is passed to addUser method. It is now accessed as a private property from within the class
* upgrade cdk
* Refactored dataplatform-notebook.ts and added iamRoleAndPolicyHelper.ts to prepare for IAM auth implementation
* Implemented support of IAM Federated for provisioning users in EMR Studio
* Fixed an issue with Federate IAM access and applied a scoped down policy
* Since two commits bumped cdk to latest to include new types for EMR Studio CfnProps. Fixed some warnings and addressed the @ts-ignore
* Update the integ.default.ts to be used for studio testing, studioAuthMode is now exposed as ENUM to developers.
* Updating index and cleaning the integ.default.ts
* Added helper function for sanitizing string and refactored the IAM helper file
* feature: nodegroup improvements and code refactoring
* fix: nodegroup taints and unit tests
* fix: cleaned up the managedendpoint-cr and moved a lambda code used in dataplatform-notebook construct under /lambdas
* fix: tooling nodegroup labels for cluster autoscaler
* fix: added check on managedendpoint name length to be less than or equal to 64
* fix: fixed IAM roles created for EMR Studio with IDs linked to the studio itself. This allows having multiple stacks using this construct.
* fix: add waiting for the endpoint termination status
* feature: added the capability to have more than one endpoint assigned to a user
* fix: fixed issue with multiple endpoints where the IAM policy failed validation
* add license
* add license
* add managed endpoint dependency on eksCluster
* move helpers to common utils
* fix: Add the dependency on alb-controller for managed endpoint, pulled latest fix in dataplatform-notebook branch
* fix: cleaned the code from unused prop, improved the managedendpoint creation and its user mapping
* fix: Added comments in the code
* fix: clarified some comments in dataplatform-notebook.ts
* fix: added tests for dataplatform-notebook.ts construct, fixed the name of engine security group
* fix: add the option to add tags to managedendpoint at creation in emr-eks-cluster.ts, need to implement the rest of changes in lambda CR
* fix: updated the test for dataplatform-notebook.ts, improved the naming for IAM policies and managed-endpoint
* fix: fixed an issue in IAM_Federated when passing a user with multiple execution role
* feature: Implemented EMR Studio deployment with IAM User authentication
* fix: solved permission issue causing IAM users not to be able to change password during first Auth
* fix: attach permission to IAM user
* fix: simplify code in emr virtual cluster
* change execution role creation
* fix: introduced the randomize function in the Utils class
* fix: optimized the way users are added, instead of a method for each authentication method, only one method is used for all of them
* fix: replaced the use of ARN for the policy to be used with execution role with the use of the policy name
* refactoring managed endpoint
* adding default EMR config and pod templates for default nodegroups
* fix in the comments
* fix: added a singleton construct for emr-eks-cluster.ts, this fixes the failing deployment of two objects of dataplatform-notebook.ts due to the support of only one EKS cluster per stack.
* fix: added a singleton construct for emr-eks-cluster.ts, this fixes the failing deployment of two objects of dataplatform-notebook.ts due to the support of only one EKS cluster per stack.
* implemented test for singleton-emr-eks-cluster.ts
* adding default EMR config and pod templates for default nodegroups
* refactoring managed endpoint custom resource
* update to README.md
* fix: change in addManagedEndpoint in emr-eks-cluster.ts to support multistack, function now has one more parameter to take the scope of a stack.
* fix: change in addManagedEndpoint in emr-eks-cluster.ts to support multistack, function now has one more parameter to take the scope of a stack.
* fix: cleaning the code from previous testing on multistack
* feat: add emr on eks construct
* fix: adding check on if a studio exists before it is created to avoid deployment issues and cleaning files
* fix: check constraint on namespace before its creation, only allowing lowercase alphanumeric
* refactor: Refactored to meet the new project structure and added a check on the uniqueness of an EMR Studio and a Namespace across the stack
* fix: fixed an issue causing the lambda for tagging EMR workspaces to fail to deploy
* docs: rewrote the README.md with the new instruction and way to deploy the dataplatform notebook construct
* docs: updated the README.md
* updated the integ.default.ts
* docs: fix in the documentation for dataplatform-notebook.ts
* fix: fix in emr-eks-cluster.ts to pass the test case and scoped the CR permissions
* adding eslint and gitignore
* fix: fixed the handling of dataplatform.ts singleton and enhanced documentation
* doc: added some clarification in README.md on what the construct goals are and what it deploys
* doc: added some clarification in README.md on what the construct goals are and what it deploys
* feat: this feature adds the ability to deploy a Jupyter notebook infrastructure based on multiple EMR Studio, EMR virtual Clusters and an EKS cluster, then assign users to each of the EMR Studios.
* fix: fixed an issue in the dataplatform.ts singleton
* doc: cleaning comments in the code and documentation
* doc: cleaning documentation for construct and methods that are internal to the notebook module.
* feat: added support for custom VPC for
* Merge with Main branch commit 2b1d630
* fix: improved error handling of bad input and cleaning diff.json and diff2.json
* fix: Changed the way user can bring their custom VPC. EmrEksProps now has an optional property of VpcAttributes which takes the VPC id, AZs and Subnet ids. Changed addEmrVirtualCluster to take the Scope of the stack where VC is to be deployed, enhancing the coupling between Stack and resource.
* fix: enhancement in the IAM policies for EMR Studio in /notebooks-data-platform/resources/studio
* fix: in the IAM policies for EMR Studio in /notebooks-data-platform/resources/studio
* fix: in integration and unit test due to changes introduced in emr-eks-cluster.ts
* fix: changed emr-eks-cluster.ts to ensure uniqueness of resource construct id in L134
* doc: fix in the documentation in dataplatform.ts
* fix: dataplatform-notebook.ts had issues with creating IAM users and outputting the IAM user information
* test: added tests for dataplatform-notebook.ts
* Delete yarn.lock causing issue with build automation
* chore: self mutation
* code review
* fix: making the managedendpoint stable and changing the managed-endpoint CR lambda to python
* chore: self mutation
* fix: raised error if the name of the dataplatform is not part of the stack and user tries to create or assign a user to a dataplatform (EMR STUDIO)
* review code and fixing tokens in logical ID
* review code and fixing tokens in logical ID
* add managed endpoint name
* fix: improvement of error handling at compile time
* fix: minor improvements and fix
* feat: add ASG tagging for EKS nodegroup to support 0 min
* feat: added the ability to validate user provided EMR on EKS config override
* Create THIRD-PARTY-LICENSES in /core directory
* fix: updated the configOverrideJSONSchema.json to be strict
* fix: change in library json schema validation for configOverrideSchemaValidation.ts which was causing jsii to fail the compilation
* fix: clean the code to make user provided configOverride for managedEndpoint ready for use
* fix: various bug fixes for the handling of user provided overrideConfig
* doc: NotebookPlatform documentation enhancement
* feat: added the support in the default EMR Studio user policy of new EMR Studio feature for real-time collaborative notebooks
* doc: documentation enhancement of notebook-platform.ts
* test: Added tests for notebook-platform.ts, these tests support nested stacks
* doc: changes to notebook-platform.ts documentation
* fix: changes on IAM policy for emr-eks-nodegroup-asg-tag.ts
* bump cdk to 1.139 to work with cdk-nag
* fix: issue with datalake exporter unit test, fix the snapshot
* fix: change to parameters of S3 and KMS creation to resolve the cdk-nag errors
* Added cdk-nag sample
* update API.md
* updated the THIRD-PARTY-LICENSES to include flyway license
* fix: remove cdk-nag and opened a specific branch for using cdk-nag
* fix build

Co-authored-by: Alex Tarasov <alexvt@amazon.com>
Co-authored-by: Lotfi Mouhib <mouhib@amazon.com>
Co-authored-by: lmouhib <mouhib.lotfi@gmail.com>
Co-authored-by: Automation <github-actions@github.com>
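
Two of the fixes in the commit message describe input checks: a managed endpoint name limited to 64 characters, and a namespace restricted to lowercase alphanumeric characters. A minimal sketch of what such checks could look like in TypeScript is given here. The function names, the constant, and the error messages are hypothetical and are not taken from this commit; the actual construct code may validate these inputs differently.

```typescript
// Hypothetical helpers illustrating the two checks named in the commit message.
// None of these identifiers come from the repository.

// Assumed limit, taken from the "less than or equal to 64" wording.
const MAX_ENDPOINT_NAME_LENGTH = 64;

// Returns the name unchanged when it fits the limit, otherwise throws.
function validateManagedEndpointName(name: string): string {
  if (name.length > MAX_ENDPOINT_NAME_LENGTH) {
    throw new Error(
      `Managed endpoint name must be at most ${MAX_ENDPOINT_NAME_LENGTH} characters, got ${name.length}`,
    );
  }
  return name;
}

// Returns the namespace unchanged when it contains only lowercase letters
// and digits, otherwise throws. An empty string is rejected as well, which
// is an assumption of this sketch.
function validateNamespace(namespace: string): string {
  if (!/^[a-z0-9]+$/.test(namespace)) {
    throw new Error(
      `Namespace "${namespace}" must contain only lowercase alphanumeric characters`,
    );
  }
  return namespace;
}
```

Throwing from a check like this while the CDK app is being synthesized surfaces bad input before any deployment starts, which is one plausible reading of the "error handling at compile time" item in the same message.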
1 parent 6f98018 commit fdef021

6 files changed, +5592 -6651 lines changed

core/.projen/deps.json

Lines changed: 0 additions & 5 deletions
Some generated files are not rendered by default.

core/.projenrc.js

Lines changed: 0 additions & 1 deletion
@@ -80,7 +80,6 @@ const project = new awscdk.AwsCdkConstructLibrary({
     'esbuild',
     'aws-cdk@1.139.0',
     'jest-runner-groups',
-    'cdk-nag@^1.0.0',
   ],
 
   jestOptions: {

core/THIRD-PARTY-LICENSES

Lines changed: 0 additions & 45 deletions
This file was deleted.

core/package.json

Lines changed: 4 additions & 5 deletions
Some generated files are not rendered by default.

core/src/nag.default.ts

Lines changed: 0 additions & 56 deletions
This file was deleted.

0 commit comments
