ML Admin Portfolio Set Up

This portfolio is intended for the Project Admins of the Workload accounts, who will use it to create AWS resources for their teams. These resources can include SageMaker domains, Redshift clusters, and so on.

Prerequisites

Account access

For the CDK deployment you will need the "target account id" and "target region" where you want to deploy the portfolios. We recommend using your ML Shared Services account.

To find the "target account id", click the account information on the top right corner of the console page. The 12-digit number after "Account ID" is the target account id.

Target Account ID

To the left of the account ID, click the region name and copy the region code, which looks like "us-east-1".

Target Account Region
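Before passing these two values to the CDK, it can help to sanity-check them. The sketch below is an illustration only: the function names are hypothetical, and the region pattern is a heuristic based on codes such as "us-east-1", not an official list of regions.

```python
import re


def is_valid_account_id(value: str) -> bool:
    """AWS account IDs are exactly 12 decimal digits."""
    return re.fullmatch(r"[0-9]{12}", value) is not None


def looks_like_region(value: str) -> bool:
    """Heuristic shape check (an assumption, not an official region list).

    Accepts codes such as us-east-1, ap-southeast-2, or us-gov-west-1.
    """
    return re.fullmatch(r"[a-z]{2}(-[a-z]+)+-[0-9]+", value) is not None
```

A value that fails either check was most likely copied with extra characters or from the wrong field in the console.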

Use the following command to find the "target account profile". In this example, the profile is called "default".

aws configure list-profiles

Profile
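If you prefer to inspect profiles from Python, the sketch below parses the text of an AWS CLI config file and returns the profile names it declares. It is a rough approximation of the command above, under the assumption that only the config file matters; the AWS CLI also reads the shared credentials file, which this sketch ignores. The function name is hypothetical.

```python
import configparser


def profiles_from_config(config_text: str) -> list:
    """Return profile names declared in AWS CLI config text.

    In the config file the default profile is the section [default], and
    named profiles are sections of the form [profile <name>]. Other
    sections (for example [sso-session <name>]) are skipped.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    names = []
    for section in parser.sections():
        if section == "default":
            names.append("default")
        elif section.startswith("profile "):
            names.append(section[len("profile "):].strip())
    return names
```

To try it against your own machine, you could pass in the contents of the config file, which is usually located at ~/.aws/config.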

You can also get temporary programmatic access through the IAM Identity Center sign-in portal: find the account where you need AWS CLI access, click the "Command line or programmatic access" link, and follow the instructions.

IAM Identity Center

For more information, read AWS CDK: Bootstrapping.

Required Packages

The AWS Cloud Development Kit (CDK) code in this workshop is written in Python. Below is a list of packages required to deploy the code in this workshop. A Linux OS is preferred so that all CLI commands run as written and path issues are avoided.

Set up the ML Admin portfolio

We will create the repository that will host the CDK code for the ML Admin templates and the AWS CodePipeline pipeline that will convert this code into Service Catalog products to be shared with Sandbox accounts. We will use AWS CodeConnections to connect our repository to AWS services.

Deploy the Pipeline Stack

Step 1: Bootstrap the Infrastructure of the Shared Services account

In this step, we will bootstrap the infrastructure for the SageMaker Projects portfolio in the ML Shared Services account.

Clone the Git repository to a local directory.

git clone https://github.com/aws-samples/data-and-ml-governance-workshop.git

Change directory to the ml-platform-shared-services/module-3/ml-admin-portfolio directory.

cd data-and-ml-governance-workshop/ml-platform-shared-services/module-3/ml-admin-portfolio

Install dependencies in a separate Python environment using your preferred Python package manager.

python3 -m venv env
source env/bin/activate
pip install -r requirements.txt

Bootstrap your deployment target account using the following command:

cdk bootstrap aws://<target account id>/<target region> --profile <target account profile>

Or, if you already have credentials and a default region configured for the account, simply run:

cdk bootstrap
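The two forms of the bootstrap command can be assembled programmatically, which is handy when scripting the setup. This is a minimal sketch: the function name is hypothetical, and it only builds the argument list shown above without running anything.

```python
from typing import List, Optional


def bootstrap_command(
    account_id: Optional[str] = None,
    region: Optional[str] = None,
    profile: Optional[str] = None,
) -> List[str]:
    """Build the `cdk bootstrap` argument list.

    With an account ID and region, the environment is passed explicitly as
    aws://<account>/<region>. Without them, the CDK CLI falls back to the
    credentials and region already configured in the shell.
    """
    cmd = ["cdk", "bootstrap"]
    if account_id and region:
        cmd.append(f"aws://{account_id}/{region}")
    if profile:
        cmd += ["--profile", profile]
    return cmd


# Example usage (not executed here):
#   import subprocess
#   subprocess.run(bootstrap_command("123456789012", "us-east-1", "default"), check=True)
```

Returning a list rather than a single string avoids shell quoting issues when the result is handed to subprocess.run.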

Step 2: Create the Service Catalog portfolio

Now we are going to set up the required resources in our ML Shared Services account. To do so, follow these steps:

Deploy the stack with the Code and the corresponding pipeline.

cdk deploy --all --require-approval never

This may take a few minutes. Once it's finished, you should see the message containing the ARN of the deployed stack.
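The stack ARN printed at the end of the deployment encodes the region, account, and stack name, so it is a quick way to confirm that the deployment landed in the intended target. The sketch below splits a CloudFormation stack ARN into those parts; the function name is hypothetical and the ARN in the usage comment is a made-up example.

```python
def parse_stack_arn(arn: str) -> dict:
    """Split a CloudFormation stack ARN into its components.

    Expected shape:
      arn:<partition>:cloudformation:<region>:<account-id>:stack/<name>/<id>
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "cloudformation":
        raise ValueError(f"not a CloudFormation ARN: {arn!r}")
    resource = parts[5].split("/")
    if len(resource) < 2 or resource[0] != "stack":
        raise ValueError(f"not a stack ARN: {arn!r}")
    return {
        "partition": parts[1],
        "region": parts[3],
        "account_id": parts[4],
        "stack_name": resource[1],
    }


# Example usage with a made-up ARN:
#   parse_stack_arn(
#       "arn:aws:cloudformation:us-east-1:123456789012:stack/MyStack/abcd-1234"
#   )["region"]
```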

Let's check the stack deployed.

First, navigate to the AWS CloudFormation console.

CloudFormation

Then click "Stacks" on the CloudFormation page.

You should see a stack named "SmProjectsServiceCatalogPipeline". This is the stack that created resources such as the CodeConnections connection, the CodePipeline pipeline, S3 buckets, and so on.
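The same check can be done without the console. The sketch below assumes a boto3 CloudFormation client is passed in (the client is a parameter so the function can be exercised without AWS access); the function names are hypothetical, and treating only CREATE_COMPLETE and UPDATE_COMPLETE as "ready" is a simplifying assumption.

```python
def stack_status(cfn_client, stack_name: str) -> str:
    """Return the StackStatus reported by CloudFormation for one stack."""
    response = cfn_client.describe_stacks(StackName=stack_name)
    return response["Stacks"][0]["StackStatus"]


def is_stack_ready(cfn_client, stack_name: str) -> bool:
    """True when the stack finished creating or updating successfully."""
    return stack_status(cfn_client, stack_name) in ("CREATE_COMPLETE", "UPDATE_COMPLETE")


# Example usage (requires boto3 and AWS credentials, not executed here):
#   import boto3
#   is_stack_ready(boto3.client("cloudformation"), "SmProjectsServiceCatalogPipeline")
```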

Let's check out the resources created. Take the CodeConnection connection as an example.

Type "Service Catalog" in the search bar, and then click "Service Catalog" in the dropdown menu. Then select "AWS CodeStar Connections" from the left sidebar.

You can see there's a connection named "codeconnection-service-catalog". If you click the connection, you will notice that it still needs to be connected to GitHub before it can be integrated with our pipelines and used to push code. Click "Update pending connection" to integrate it with your GitHub account.

CodeConnection
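A connection that has not been authorized yet reports the status PENDING. As a rough sketch, the function below lists the names of connections still in that state, assuming a boto3 connections client (for example boto3.client("codestar-connections")) is passed in. The function name is hypothetical, and pagination is ignored for brevity.

```python
def pending_connections(connections_client) -> list:
    """Return the names of connections whose status is still PENDING.

    Only the first page of results is inspected in this sketch.
    """
    response = connections_client.list_connections()
    return [
        connection["ConnectionName"]
        for connection in response.get("Connections", [])
        if connection.get("ConnectionStatus") == "PENDING"
    ]


# Example usage (requires boto3 and AWS credentials, not executed here):
#   import boto3
#   pending_connections(boto3.client("codestar-connections"))
```

Once the GitHub handshake is completed in the console, the connection should no longer appear in this list.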

Once that is done, you need to create empty GitHub repositories to start pushing code to. For example, you can create a repository called "ml-admin-portfolio-repo". Every project you deploy will need a repository created in GitHub beforehand.

We recommend creating a separate folder for the different repositories that will be created in the platform. To do that, leave the cloned repository and create a parallel folder called platform-repositories.

cd ../../.. # (as many .. as directories you have moved in)
mkdir -p platform-repositories

Let's clone the empty repository you created and fill it with the portfolio code.

cd platform-repositories
git clone https://github.com/example-org/ml-admin-service-catalog-repo.git
cd ml-admin-service-catalog-repo
cp -aR ../../ml-platform-shared-services/module-3/ml-admin-portfolio/. .

Let's push the code to the GitHub repository to create the Service Catalog portfolio. Run the commands below.

git add .
git commit -m "Initial commit"
git push -u origin main

Once it is pushed, go back to the GitHub repository we created earlier. It is no longer empty. Pushing code to the repository triggers a CodePipeline run that builds and deploys artifacts to Service Catalog. Click Pipelines -> Pipeline to check it out. You will see a pipeline named "cdk-service-catalog-pipeline". Click the pipeline name to see its steps. For more information, read AWS CodePipeline.

CodePipeline

It takes about 10 minutes for the pipeline to finish running. Once it's finished, let's check out the Service Catalog Portfolios.
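While waiting, the per-stage progress of the pipeline can also be read programmatically. The sketch below assumes a boto3 CodePipeline client is passed in and summarizes the latest status of each stage. The function name is hypothetical, and the "NotStarted" label is a placeholder of this sketch for stages that have no execution yet, not a status returned by AWS.

```python
def stage_statuses(codepipeline_client, pipeline_name: str) -> dict:
    """Map each stage name to the status of its latest execution."""
    state = codepipeline_client.get_pipeline_state(name=pipeline_name)
    return {
        stage["stageName"]: stage.get("latestExecution", {}).get("status", "NotStarted")
        for stage in state["stageStates"]
    }


def pipeline_succeeded(codepipeline_client, pipeline_name: str) -> bool:
    """True when every stage reports Succeeded."""
    statuses = stage_statuses(codepipeline_client, pipeline_name)
    return bool(statuses) and all(s == "Succeeded" for s in statuses.values())


# Example usage (requires boto3 and AWS credentials, not executed here):
#   import boto3
#   stage_statuses(boto3.client("codepipeline"), "cdk-service-catalog-pipeline")
```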

Type "Service Catalog" in the search bar and click on "Service Catalog"

On the Service Catalog page, click "Portfolio" under "Administration". You will see a portfolio named "ML Admins Portfolio".

Service Catalog Portfolio

A portfolio is composed of products. A product is a set of AWS cloud resources that you want to make available for deployment on AWS. Click one of the products, and then click the version name. You will see that a product is mainly a CloudFormation template, which allows you to deploy infrastructure as code. For more information about CloudFormation templates, read AWS CloudFormation.
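The portfolio-to-product relationship can also be inspected from code. The sketch below assumes a boto3 Service Catalog client is passed in; it looks up a portfolio by display name and returns the names of its products. The function name is hypothetical, and only the first page of each response is read.

```python
def products_in_portfolio(sc_client, portfolio_name: str) -> list:
    """Return the product names attached to the named portfolio.

    Returns an empty list when no portfolio has that display name.
    """
    portfolios = sc_client.list_portfolios()["PortfolioDetails"]
    match = next((p for p in portfolios if p["DisplayName"] == portfolio_name), None)
    if match is None:
        return []
    details = sc_client.search_products_as_admin(PortfolioId=match["Id"])
    return [d["ProductViewSummary"]["Name"] for d in details["ProductViewDetails"]]


# Example usage (requires boto3 and AWS credentials, not executed here):
#   import boto3
#   products_in_portfolio(boto3.client("servicecatalog"), "ML Admins Portfolio")
```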

FAQ and Common Errors

Common Errors

  • CDK Version:

Error: This CDK CLI is not compatible with the CDK library used by your application. Please upgrade the CLI to the latest version. (Cloud assembly schema version mismatch: Maximum schema version supported is 34.0.0, but found 35.0.0)

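This error happens when the CDK CLI version and the aws-cdk-lib package version in the virtual environment do not match. In the message above, the CLI is older than the library: it supports cloud assembly schema versions up to 34.0.0, but the library produced 35.0.0.

To check both, run cdk --version for the CDK CLI and pip list for the aws-cdk-lib Python package.

How to solve?: Either upgrade the CDK CLI or install an aws-cdk-lib version that your CLI supports.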
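To tell at a glance which side is behind, the two schema versions can be pulled out of the error text. The sketch below is a small illustration: the function name is hypothetical, and the pattern relies on the wording of the message quoted above, which may differ between CDK releases.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, ...]

_PATTERN = re.compile(
    r"Maximum schema version supported is ([0-9]+(?:\.[0-9]+)*), "
    r"but found ([0-9]+(?:\.[0-9]+)*)"
)


def parse_schema_mismatch(message: str) -> Optional[Tuple[Version, Version]]:
    """Return (supported, found) schema versions, or None if not present."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    supported = tuple(int(part) for part in match.group(1).split("."))
    found = tuple(int(part) for part in match.group(2).split("."))
    return supported, found
```

When the "found" version is greater than the "supported" one, as in the message above, the CLI is the older component.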

  • CodeBuild concurrent runs

Error:

Error calling startBuild: Cannot have more than 1 builds in queue for the account (Service: AWSCodeBuild; Status Code: 400; Error Code: AccountLimitExceededException; Request ID: xxxxx; Proxy: null)

This error happens because the CodeBuild quota for the account is lower than what is required to build the Service Catalog portfolio products concurrently.

How to solve?: Request a quota increase as described in the following article: Requesting a quota increase

Quota Increase