diff --git a/SageMaker/Linear_example.ipynb b/SageMaker/Linear_example.ipynb deleted file mode 100644 index 3a96356..0000000 --- a/SageMaker/Linear_example.ipynb +++ /dev/null @@ -1,371 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Comet.ml: Sagemaker Linear Learner Introduction Integration\n", - "\n", - "The code below is taken directly from Amazon Sagemaker's official [An Introduction to Linear Learner with MNIST](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/linear_learner_mnist/linear_learner_mnist.ipynb) notebook.\n", - "\n", - "The descriptive text has more or less been removed, but the code is identical. \n", - "\n", - "Follow along below to learn how to log Sagemaker training jobs to Comet.ml." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Install the comet_ml_sagemaker python package\n", - "\n", - "Comet's SageMaker configuration is available to Enterprise customers only. If you are interested in learning more about Comet Enterprise, or are in a trial period with Comet.ml and would like to evaluate the SageMaker integration, please email support@comet.ml and credentials can be shared to download the correct packages." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Prerequisites and Preprocessing\n", - "#### Permissions and Environment Variables" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "bucket = \"NAME_YOUR_BUCKET\"\n", - "prefix = \"sagemaker/DEMO-linear-mnist\"\n", - "\n", - "# Define IAM role\n", - "import boto3\n", - "import re\n", - "from sagemaker import get_execution_role\n", - "\n", - "role = get_execution_role()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Data Ingestion" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%%time\n", - "import pickle, gzip, numpy, urllib.request, json\n", - "\n", - "# Load the dataset\n", - "urllib.request.urlretrieve(\n", - " \"http://deeplearning.net/data/mnist/mnist.pkl.gz\", \"mnist.pkl.gz\"\n", - ")\n", - "with gzip.open(\"mnist.pkl.gz\", \"rb\") as f:\n", - " train_set, valid_set, test_set = pickle.load(f, encoding=\"latin1\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Data Inspection" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "import matplotlib.pyplot as plt\n", - "\n", - "plt.rcParams[\"figure.figsize\"] = (2, 10)\n", - "\n", - "\n", - "def show_digit(img, caption=\"\", subplot=None):\n", - " if subplot == None:\n", - " _, (subplot) = plt.subplots(1, 1)\n", - " imgr = img.reshape((28, 28))\n", - " subplot.axis(\"off\")\n", - " subplot.imshow(imgr, cmap=\"gray\")\n", - " plt.title(caption)\n", - "\n", - "\n", - "show_digit(train_set[0][30], \"This is a {}\".format(train_set[1][30]))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Data Conversion" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import io\n", - "import numpy as np\n", - "import sagemaker.amazon.common as smac\n", - "\n", - "vectors = np.array([t.tolist() for t in train_set[0]]).astype(\"float32\")\n", - "labels = np.where(np.array([t.tolist() for t in train_set[1]]) == 0, 1, 0).astype(\n", - " \"float32\"\n", - ")\n", - "\n", - "buf = io.BytesIO()\n", - "smac.write_numpy_to_dense_tensor(buf, vectors, labels)\n", - "buf.seek(0)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Upload Training Data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import boto3\n", - "import os\n", - "\n", - "key = \"recordio-pb-data\"\n", - "boto3.resource(\"s3\").Bucket(bucket).Object(\n", - " os.path.join(prefix, \"train\", key)\n", - ").upload_fileobj(buf)\n", - "s3_train_data = \"s3://{}/{}/train/{}\".format(bucket, prefix, key)\n", - "print(\"uploaded training data location: {}\".format(s3_train_data))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Set up output S3 location for the model artifact that will be output as the result of training with the algorithm" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "output_location = \"s3://{}/{}/output\".format(bucket, prefix)\n", - "print(\"training artifacts will be uploaded to: {}\".format(output_location))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Training the Linear Model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sagemaker.amazon.amazon_estimator import get_image_uri\n", - "\n", - "container = get_image_uri(boto3.Session().region_name, \"linear-learner\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import boto3\n", - "import sagemaker\n", - "\n", - "sess = sagemaker.Session()\n", - "\n", - "linear = sagemaker.estimator.Estimator(\n", - " container,\n", - " role,\n", - " train_instance_count=1,\n", - " train_instance_type=\"ml.c4.xlarge\",\n", - " output_path=output_location,\n", - " sagemaker_session=sess,\n", - ")\n", - "linear.set_hyperparameters(\n", - " feature_dim=784, predictor_type=\"binary_classifier\", mini_batch_size=200\n", - ")\n", - "\n", - "linear.fit({\"train\": s3_train_data})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Logging to Comet.ml" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Define your Comet [REST API](https://www.comet.com/docs/rest-api/getting-started/) and your [workspace](https://www.comet.com/docs/user-interface/#workspaces). See the [configuration documentation](http://docs.comet.ml/python-sdk/advanced/#python-configuration) for info on both specifications." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "COMET_REST_API = \"YOUR_API_KEY\"\n", - "COMET_WORKSPACE = \"YOUR_WORKSPACE\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Import `comet_ml_sagemaker` package." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import comet_ml_sagemaker" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### comet_ml_sagemaker.log_sagemaker_job(estimator/regressor, api_key, workspace, project_name)\n", - "Logs a Sagemaker job based on an estimator/regressor object \n", - "\n", - "* estimator/regressor = Sagemaker estimator/regressor object\n", - "* api_key = your Comet REST API key\n", - "* workspace = your Comet workspace\n", - "* project_name = your Comet project_name" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# .log_sagemaker_job(regressor/estimator object from Sagemaker SDK, Comet Rest API key (optional, can be taken from usual config source), workspace (comet), project (comet))\n", - "# I have used the Sagemaker SDK to train a model. I have the estimator/regressor object. I want to log whatever I just trained\n", - "experiment = comet_ml_sagemaker.log_sagemaker_job(\n", - " linear, api_key=COMET_REST_API, workspace=COMET_WORKSPACE, project_name=\"sagemaker\"\n", - ")\n", - "print(experiment.url)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### comet_ml_sagemaker.log_sagemaker_job_by_name(job_name, api_key, workspace, project_name)\n", - "Logs a specific Sagemaker training job based on the jobname from the Sagemaker SDK.\n", - "\n", - "* job_name = Cloudwatch/Sagemaker training job name\n", - "* api_key = your Comet REST API key\n", - "* workspace = your Comet workspace\n", - "* project_name = your Comet project_name" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# I have the name of a completed training job I want to lob\n", - "# Same as .log_sagemaker_job, except instead of passing the regressor/estimator object, you pass the job name\n", - "SAGEMAKER_TRAINING_JOB_NAME = \"SAGEMAKER_TRAINING_JOB_NAME\"\n", - "experiment = comet_ml_sagemaker.log_sagemaker_job_by_name(\n", - " SAGEMAKER_TRAINING_JOB_NAME,\n", - " api_key=COMET_REST_API,\n", - " workspace=COMET_WORKSPACE,\n", - " project_name=\"sagemaker\",\n", - ")\n", - "print(experiment.url)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### comet_ml_sagemaker.log_last_sagemaker_job(api_key, workspace, project_name)\n", - "Will log the last *started* Sagemaker training job based on the current config.\n", - "\n", - "* api_key = your Comet REST API key\n", - "* workspace = your Comet workspace\n", - "* project_name = your Comet project_name" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Logs the last job for your current Amazon Region / S3\n", - "experiment = comet_ml_sagemaker.log_last_sagemaker_job(\n", - " api_key=COMET_REST_API, workspace=COMET_WORKSPACE, project_name=\"sagemaker\"\n", - ")\n", - "print(experiment.url)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Note on SageMaker configuration\n", - "\n", - "The Comet.ml Sagemaker configuration is using boto to find your training job data, please refer to the [boto documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html) to configure the region and/or credentials if needed." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.9" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/SageMaker/README.md b/SageMaker/README.md deleted file mode 100644 index 2d4ad73..0000000 --- a/SageMaker/README.md +++ /dev/null @@ -1,83 +0,0 @@ -Drawing - -## SageMaker Integration with Comet.ml - -Comet's SageMaker integration is available to Enterprise customers only. If you are interested in learning more about Comet Enterprise, or are in a trial period with Comet.ml and would like to evaluate the SageMaker integration, please email support@comet.ml and credentials can be shared to download the correct packages. - -## Examples Repository - -This repository contains examples of using Comet.ml with SageMaker built-in Algorithms Linear Learner and Random Cut Forests. - - -## Documentation - -Full [documentation](http://www.comet.ml/docs/) and additional training examples are available on our website. - - -## Installation - -Please contact us for installation instructions. - -## Configuration - -The SageMaker integration is following the [Comet.ml Python SDK configuration](http://docs.comet.ml/python-sdk/advanced/#python-configuration) for configuring your Rest API Key, your workspace and project_name for created experiments. It's also following the [Boto configuration](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html) to find your SageMaker training jobs. - -## Logging SageMaker training runs to Comet - -Below find three different ways you can log your SageMaker jobs to Comet: with an existing regressor/estimator object, with a SageMaker Job Name, or with the last SageMaker job. - -*** - -### comet_ml_sagemaker.log_sagemaker_job - -`log_sagemaker_job(sagemaker_object, api_key, workspace, project_name)` - -Logs a Sagemaker job based on an estimator/regressor object - -* **estimator/regressor** = Sagemaker estimator/regressor object -* **api_key** = your Comet REST API key -* **workspace** = your Comet workspace -* **project_name** = your Comet project_name - -*** - -### comet_ml_sagemaker.log_sagemaker_job_by_name - -`log_sagemaker_job_by_name(job_name, api_key, workspace, project_name)` - -Logs a specific Sagemaker training job based on the jobname from the Sagemaker SDK. - -* **job_name** = Cloudwatch/Sagemaker training job name -* **api_key** = your Comet REST API key -* **workspace** = your Comet workspace -* **project_name** = your Comet project_name - -*** - -### comet_ml_sagemaker.log_last_sagemaker_job - -`log_last_sagemaker_job(api_key, workspace, project_name)` - -Will log the last *started* Sagemaker training job based on the current config. - -* **api_key** = your Comet REST API key -* **workspace** = your Comet workspace -* **project_name** = your Comet project_name - -*** - -## Tutorials + Examples -- [Linear Learner](Linear_example.ipynb) -- [Random Cut Forests](random_forest.ipynb) - - -## Support -Have questions? We have answers - -- Try checking our [FAQ Page](https://www.comet.ml/faq) -- Email us at -- For the fastest response, ping us on [Slack](https://join.slack.com/t/cometml/shared_invite/enQtMzM0OTMwNTQ0Mjc5LTM4ZDViODkyYTlmMTVlNWY0NzFjNGQ5Y2Q1Y2EwMjQ5MzQ4YmI2YjhmZTY3YmYxYTYxYTNkYzM4NjgxZmJjMDI) - - -## Feature Spotlight -Check out new product features and updates through our [Release Notes](https://www.notion.so/cometml/Comet-ml-Release-Notes-93d864bcac584360943a73ae9507bcaa). Also checkout our articles on [Medium](https://medium.com/comet-ml). - diff --git a/SageMaker/random_forest.ipynb b/SageMaker/random_forest.ipynb deleted file mode 100644 index c7bd49b..0000000 --- a/SageMaker/random_forest.ipynb +++ /dev/null @@ -1,283 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Comet.ml: Sagemaker Random Cut Forests Introduction Integration\n", - "\n", - "The code below is taken directly from Amazon Sagemaker's official [An Introduction to SageMaker Random Cut Forests](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/random_cut_forest/random_cut_forest.ipynb) notebook.\n", - "\n", - "The descriptive text has more or less been removed, but the code is identical. \n", - "\n", - "Follow along below to learn how to log Sagemaker training jobs to Comet.ml." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Install the comet_ml_sagemaker python package\n", - "\n", - "Comet's SageMaker configuration is available to Enterprise customers only. If you are interested in learning more about Comet Enterprise, or are in a trial period with Comet.ml and would like to evaluate the SageMaker integration, please email support@comet.ml and credentials can be shared to download the correct packages." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Select Amazon S3 Bucket" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import boto3\n", - "import botocore\n", - "import sagemaker\n", - "import sys\n", - "\n", - "\n", - "bucket = \"NAME_YOUR_BUCKET\" # <--- specify a bucket you have access to\n", - "prefix = \"sagemaker/rcf-benchmarks\"\n", - "execution_role = sagemaker.get_execution_role()\n", - "\n", - "\n", - "# check if the bucket exists\n", - "try:\n", - " boto3.Session().client(\"s3\").head_bucket(Bucket=bucket)\n", - "except botocore.exceptions.ParamValidationError as e:\n", - " print(\n", - " \"Hey! You either forgot to specify your S3 bucket\"\n", - " \" or you gave your bucket an invalid name!\"\n", - " )\n", - "except botocore.exceptions.ClientError as e:\n", - " if e.response[\"Error\"][\"Code\"] == \"403\":\n", - " print(\"Hey! You don't have permission to access the bucket, {}.\".format(bucket))\n", - " elif e.response[\"Error\"][\"Code\"] == \"404\":\n", - " print(\"Hey! Your bucket, {}, doesn't exist!\".format(bucket))\n", - " else:\n", - " raise\n", - "else:\n", - " print(\"Training input/output will be stored in: s3://{}/{}\".format(bucket, prefix))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Obtain and Inspect Example Data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%%time\n", - "\n", - "import pandas as pd\n", - "import urllib.request\n", - "\n", - "data_filename = \"nyc_taxi.csv\"\n", - "data_source = \"https://raw.githubusercontent.com/numenta/NAB/master/data/realKnownCause/nyc_taxi.csv\"\n", - "\n", - "urllib.request.urlretrieve(data_source, data_filename)\n", - "taxi_data = pd.read_csv(data_filename, delimiter=\",\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Training\n", - "\n", - "#### Hyperparameters" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sagemaker import RandomCutForest\n", - "\n", - "session = sagemaker.Session()\n", - "\n", - "# specify general training job information\n", - "rcf = RandomCutForest(\n", - " role=execution_role,\n", - " train_instance_count=1,\n", - " train_instance_type=\"ml.m4.xlarge\",\n", - " data_location=\"s3://{}/{}/\".format(bucket, prefix),\n", - " output_path=\"s3://{}/{}/output\".format(bucket, prefix),\n", - " num_samples_per_tree=512,\n", - " num_trees=50,\n", - ")\n", - "\n", - "# automatically upload the training data to S3 and run the training job\n", - "rcf.fit(rcf.record_set(taxi_data.value.as_matrix().reshape(-1, 1)))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Logging to Comet.ml" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Define your Comet [REST API](https://www.comet.com/docs/rest-api/getting-started/) and your [workspace](https://www.comet.com/docs/user-interface/#workspaces). See the [configuration documentation](http://docs.comet.ml/python-sdk/advanced/#python-configuration) for info on both specifications." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "COMET_REST_API = \"YOUR_API_KEY\"\n", - "COMET_WORKSPACE = \"YOUR_WORKSPACE\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Import `comet_ml_sagemaker` package." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import comet_ml_sagemaker" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### comet_ml_sagemaker.log_sagemaker_job(estimator/regressor, api_key, workspace, project_name)\n", - "Logs a Sagemaker job based on an estimator/regressor object \n", - "\n", - "* estimator/regressor = Sagemaker estimator/regressor object\n", - "* api_key = your Comet REST API key\n", - "* workspace = your Comet workspace\n", - "* project_name = your Comet project_name" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# .log_sagemaker_job(regressor/estimator object from Sagemaker SDK, Comet Rest API key (optional, can be taken from usual config source), workspace (comet), project (comet))\n", - "# I have used the Sagemaker SDK to train a model. I have the estimator/regressor object. I want to log whatever I just trained\n", - "experiment = comet_ml_sagemaker.log_sagemaker_job(\n", - " rcf, api_key=COMET_REST_API, workspace=COMET_WORKSPACE, project_name=\"sagemaker\"\n", - ")\n", - "print(experiment.url)\n", - "experiment.add_tags([\"random_forest\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### comet_ml_sagemaker.log_sagemaker_job_by_name(job_name, api_key, workspace, project_name)\n", - "Logs a specific Sagemaker training job based on the jobname from the Sagemaker SDK.\n", - "\n", - "* job_name = Cloudwatch/Sagemaker training job name\n", - "* api_key = your Comet REST API key\n", - "* workspace = your Comet workspace\n", - "* project_name = your Comet project_name" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# I have the name of a completed training job I want to lob\n", - "# Same as .log_sagemaker_job, except instead of passing the regressor/estimator object, you pass the job name\n", - "SAGEMAKER_TRAINING_JOB_NAME = \"SAGEMAKER_TRAINING_JOB_NAME\"\n", - "experiment = comet_ml_sagemaker.log_sagemaker_job_by_name(\n", - " SAGEMAKER_TRAINING_JOB_NAME,\n", - " api_key=COMET_REST_API,\n", - " workspace=COMET_WORKSPACE,\n", - " project_name=\"sagemaker\",\n", - ")\n", - "print(experiment.url)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### comet_ml_sagemaker.log_last_sagemaker_job(api_key, workspace, project_name)\n", - "Will log the last *started* Sagemaker training job based on the current config.\n", - "\n", - "* api_key = your Comet REST API key\n", - "* workspace = your Comet workspace\n", - "* project_name = your Comet project_name" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Logs the last job for your current Amazon Region / S3\n", - "experiment = comet_ml_sagemaker.log_last_sagemaker_job(\n", - " api_key=COMET_REST_API, workspace=COMET_WORKSPACE, project_name=\"sagemaker\"\n", - ")\n", - "print(experiment.url)\n", - "experiment.add_tags([\"random_forest\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Note on SageMaker configuration\n", - "\n", - "The Comet.ml Sagemaker configuration is using boto to find your training job data, please refer to the [boto documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html) to configure the region and/or credentials if needed." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.9" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -}