Skip to content

Commit 4bc6c69

Browse files
authored
Merge branch 'aws:main' into main-enable-telemetry
2 parents bbe5e25 + 63192b5 commit 4bc6c69

14 files changed

+729
-44
lines changed

README.md

Lines changed: 9 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,8 @@ The Amazon SageMaker HyperPod command-line interface (HyperPod CLI) is a tool th
55

66
This documentation serves as a reference for the available HyperPod CLI commands. For a comprehensive user guide, see [Orchestrating SageMaker HyperPod clusters with Amazon EKS](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-hyperpod-eks.html) in the *Amazon SageMaker Developer Guide*.
77

8+
Note: The old `hyperpod` CLI V2 has been moved to the `release_v2` branch. Please refer to the [release_v2 branch](https://github.com/aws/sagemaker-hyperpod-cli/tree/release_v2) for usage.
9+
810
## Table of Contents
911
- [Overview](#overview)
1012
- [Prerequisites](#prerequisites)
@@ -21,8 +23,8 @@ This documentation serves as a reference for the available HyperPod CLI commands
2123
- [Training](#training-)
2224
- [Inference](#inference-)
2325
- [SDK](#sdk-)
24-
- [Training](#training-)
25-
- [Inference](#inference)
26+
- [Training](#training-sdk)
27+
- [Inference](#inference-sdk)
2628

2729

2830
## Overview
@@ -72,27 +74,9 @@ SageMaker HyperPod CLI currently supports start training job with:
7274
1. Verify if the installation succeeded by running the following command.
7375
7476
```
75-
hyperpod --help
77+
hyp --help
7678
```
7779
78-
1. If you have a running HyperPod cluster, you can try to run a training job using the sample configuration file provided at ```/examples/basic-job-example-config.yaml```.
79-
- Get your HyperPod clusters to show their capacities.
80-
```
81-
hyperpod get-clusters
82-
```
83-
- Get your HyperPod clusters to show their capacities and quota allocation info for a team.
84-
```
85-
hyperpod get-clusters -n hyperpod-ns-<team-name>
86-
```
87-
- Connect to one HyperPod cluster and specify a namespace you have access to.
88-
```
89-
hyperpod connect-cluster --cluster-name <cluster-name>
90-
```
91-
- Start a job in your cluster. Change the `instance_type` in the yaml file to be same as the one in your HyperPod cluster. Also change the `namespace` you want to submit a job to, the example uses kubeflow namespace. You need to have installed PyTorch in your cluster.
92-
```
93-
hyperpod start-job --config-file ./examples/basic-job-example-config.yaml
94-
```
95-
9680
## Usage
9781
9882
The HyperPod CLI provides the following commands:
@@ -106,8 +90,8 @@ The HyperPod CLI provides the following commands:
10690
- [Training](#training-)
10791
- [Inference](#inference-)
10892
- [SDK](#sdk-)
109-
- [Training](#training-)
110-
- [Inference](#inference)
93+
- [Training](#training-sdk)
94+
- [Inference](#inference-sdk)
11195
11296
11397
### Getting Cluster information
@@ -267,7 +251,7 @@ hyp delete hyp-jumpstart-endpoint --name endpoint-jumpstart
267251
268252
Along with the CLI, we also have SDKs available that can perform the training and inference functionalities that the CLI performs
269253
270-
### Training
254+
### Training SDK
271255
272256
#### Creating a Training Job
273257
@@ -342,7 +326,7 @@ pytorch_job.create()
342326
343327
344328
345-
### Inference
329+
### Inference SDK
346330
347331
#### Creating a JumpstartModel Endpoint
348332
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Subproject commit ce96b513c3033f815d24469f07e2ef0531aaf8d4

src/sagemaker/hyperpod/cli/commands/inference.py

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -69,9 +69,17 @@ def custom_create(namespace, version, custom_endpoint):
6969
required=True,
7070
help="Required. The body of the request to invoke.",
7171
)
72+
@click.option(
73+
"--content-type",
74+
type=click.STRING,
75+
required=False,
76+
default="application/json",
77+
help="Optional. The content type of the request to invoke. Default set to 'application/json'",
78+
)
7279
def custom_invoke(
7380
endpoint_name: str,
7481
body: str,
82+
content_type: Optional[str]
7583
):
7684
"""
7785
Invoke a model endpoint.
@@ -105,7 +113,7 @@ def custom_invoke(
105113
resp = rt.invoke_endpoint(
106114
EndpointName=endpoint_name,
107115
Body=payload.encode("utf-8"),
108-
ContentType="application/json",
116+
ContentType=content_type,
109117
)
110118
result = resp["Body"].read().decode("utf-8")
111119
click.echo(result)
Lines changed: 140 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,140 @@
1+
"""Integration tests for the HyperPod CLI custom-inference commands (FSx model source).

Exercises the full endpoint lifecycle against a live cluster:
create -> list -> describe -> wait for deployment -> invoke ->
operator logs -> list pods -> delete.  Tests are order-dependent and
share one module-scoped endpoint name, so they must run as a module.
"""
import os
import time

import boto3
import pytest
from click.testing import CliRunner

from sagemaker.hyperpod.cli.commands.inference import (
    custom_create,
    custom_invoke,
    custom_list,
    custom_describe,
    custom_delete,
    custom_get_operator_logs,
    custom_list_pods,
)
from sagemaker.hyperpod.inference.hp_endpoint import HPEndpoint

# --------- Test Configuration ---------
NAMESPACE = "integration"
VERSION = "1.0"
REGION = "us-east-2"
TIMEOUT_MINUTES = 15
POLL_INTERVAL_SECONDS = 30

BETA_FSX = "fs-0454e783bbb7356fc"
PROD_FSX = "fs-03c59e2a7e824a22f"
BETA_TLS = "s3://sagemaker-hyperpod-certificate-beta-us-east-2"
PROD_TLS = "s3://sagemaker-hyperpod-certificate-prod-us-east-2"
# STAGE env var selects between beta and prod fixtures; anything other
# than "BETA" falls through to prod.
stage = os.getenv("STAGE", "BETA").upper()
FSX_LOCATION = BETA_FSX if stage == "BETA" else PROD_FSX
TLS_LOCATION = BETA_TLS if stage == "BETA" else PROD_TLS


@pytest.fixture(scope="module")
def runner():
    """Shared Click CLI test runner."""
    return CliRunner()


@pytest.fixture(scope="module")
def custom_endpoint_name():
    """Module-scoped endpoint name shared by every test below."""
    # NOTE(review): the name is static, so concurrent runs in the same
    # namespace will collide -- consider appending a unique suffix.
    return "custom-cli-integration-fsx"


@pytest.fixture(scope="module")
def sagemaker_client():
    """Boto3 SageMaker client bound to the test region."""
    return boto3.client("sagemaker", region_name=REGION)


# --------- Custom Endpoint Tests ---------

def test_custom_create(runner, custom_endpoint_name):
    """Create a custom endpoint backed by FSx-hosted model weights."""
    result = runner.invoke(custom_create, [
        "--namespace", NAMESPACE,
        "--version", VERSION,
        "--instance-type", "ml.c5.2xlarge",
        "--model-name", "test-model-integration-cli-fsx",
        "--model-source-type", "fsx",
        "--model-location", "hf-eqa",
        "--fsx-file-system-id", FSX_LOCATION,
        "--s3-region", REGION,
        "--image-uri", "763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:2.3.0-transformers4.48.0-cpu-py311-ubuntu22.04",
        "--container-port", "8080",
        "--model-volume-mount-name", "model-weights",
        "--endpoint-name", custom_endpoint_name,
        "--resources-requests", '{"cpu": "3200m", "nvidia.com/gpu": 0, "memory": "12Gi"}',
        "--resources-limits", '{"nvidia.com/gpu": 0}',
        "--tls-certificate-output-s3-uri", TLS_LOCATION,
        "--env", '{ "SAGEMAKER_PROGRAM": "inference.py", "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code", "SAGEMAKER_CONTAINER_LOG_LEVEL": "20", "SAGEMAKER_MODEL_SERVER_TIMEOUT": "3600", "ENDPOINT_SERVER_TIMEOUT": "3600", "MODEL_CACHE_ROOT": "/opt/ml/model", "SAGEMAKER_ENV": "1", "SAGEMAKER_MODEL_SERVER_WORKERS": "1" }'
    ])
    assert result.exit_code == 0, result.output


def test_custom_list(runner, custom_endpoint_name):
    """The newly created endpoint appears in the namespace listing."""
    result = runner.invoke(custom_list, ["--namespace", NAMESPACE])
    assert result.exit_code == 0
    assert custom_endpoint_name in result.output


def test_custom_describe(runner, custom_endpoint_name):
    """Describe the endpoint with full output."""
    result = runner.invoke(custom_describe, [
        "--name", custom_endpoint_name,
        "--namespace", NAMESPACE,
        "--full"
    ])
    assert result.exit_code == 0
    assert custom_endpoint_name in result.output


def test_wait_until_inservice(custom_endpoint_name):
    """Poll the SDK until the custom endpoint reaches CreationCompleted.

    Fails fast when the deployment reports DeploymentFailed and times out
    after TIMEOUT_MINUTES.
    """
    print(f"[INFO] Waiting for endpoint '{custom_endpoint_name}' to reach CreationCompleted...")
    deadline = time.time() + (TIMEOUT_MINUTES * 60)
    poll_count = 0

    while time.time() < deadline:
        poll_count += 1
        print(f"[DEBUG] Poll #{poll_count}: Checking endpoint status...")

        try:
            ep = HPEndpoint.get(name=custom_endpoint_name, namespace=NAMESPACE)
            state = ep.status.endpoints.sagemaker.state
            print(f"[DEBUG] Current state: {state}")
            if state == "CreationCompleted":
                print("[INFO] Endpoint is in CreationCompleted state.")
                return

            deployment_state = ep.status.deploymentStatus.deploymentObjectOverallState
            if deployment_state == "DeploymentFailed":
                pytest.fail("Endpoint deployment failed.")

        except Exception as e:
            # Transient API/attribute errors while the custom resource is
            # still materializing are tolerated; keep polling until the
            # deadline.  (pytest.fail raises a BaseException subclass, so
            # it is not swallowed here.)
            print(f"[ERROR] Exception during polling: {e}")

        time.sleep(POLL_INTERVAL_SECONDS)

    pytest.fail("[ERROR] Timed out waiting for endpoint to reach CreationCompleted")


def test_custom_invoke(runner, custom_endpoint_name):
    """Invoke the deployed endpoint with an explicit content type."""
    result = runner.invoke(custom_invoke, [
        "--endpoint-name", custom_endpoint_name,
        "--body", '{"question" :"what is the name of the planet?", "context":"mars"}',
        "--content-type", "application/list-text"
    ])
    assert result.exit_code == 0
    assert "error" not in result.output.lower()


def test_custom_get_operator_logs(runner):
    """Operator logs for the last hour are retrievable."""
    result = runner.invoke(custom_get_operator_logs, ["--since-hours", "1"])
    assert result.exit_code == 0


def test_custom_list_pods(runner):
    """Pods in the test namespace are listable."""
    result = runner.invoke(custom_list_pods, ["--namespace", NAMESPACE])
    assert result.exit_code == 0


def test_custom_delete(runner, custom_endpoint_name):
    """Tear down the endpoint created by this module."""
    result = runner.invoke(custom_delete, [
        "--name", custom_endpoint_name,
        "--namespace", NAMESPACE
    ])
    assert result.exit_code == 0
Lines changed: 140 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,140 @@
1+
"""Integration tests for the HyperPod CLI custom-inference commands (S3 model source).

Exercises the full endpoint lifecycle against a live cluster:
create -> list -> describe -> wait for deployment -> invoke ->
operator logs -> list pods -> delete.  Tests are order-dependent and
share one module-scoped endpoint name, so they must run as a module.
"""
import os
import time

import boto3
import pytest
from click.testing import CliRunner

from sagemaker.hyperpod.cli.commands.inference import (
    custom_create,
    custom_invoke,
    custom_list,
    custom_describe,
    custom_delete,
    custom_get_operator_logs,
    custom_list_pods,
)
from sagemaker.hyperpod.inference.hp_endpoint import HPEndpoint

# --------- Test Configuration ---------
NAMESPACE = "integration"
VERSION = "1.0"
REGION = "us-east-2"
TIMEOUT_MINUTES = 15
POLL_INTERVAL_SECONDS = 30

BETA_BUCKET = "sagemaker-hyperpod-beta-integ-test-model-bucket-n"
PROD_BUCKET = "sagemaker-hyperpod-prod-integ-test-model-bucket"
BETA_TLS = "s3://sagemaker-hyperpod-certificate-beta-us-east-2"
PROD_TLS = "s3://sagemaker-hyperpod-certificate-prod-us-east-2"
# STAGE env var selects between beta and prod fixtures; anything other
# than "BETA" falls through to prod.
stage = os.getenv("STAGE", "BETA").upper()
BUCKET_LOCATION = BETA_BUCKET if stage == "BETA" else PROD_BUCKET
TLS_LOCATION = BETA_TLS if stage == "BETA" else PROD_TLS


@pytest.fixture(scope="module")
def runner():
    """Shared Click CLI test runner."""
    return CliRunner()


@pytest.fixture(scope="module")
def custom_endpoint_name():
    """Module-scoped endpoint name shared by every test below."""
    # NOTE(review): the name is static, so concurrent runs in the same
    # namespace will collide -- consider appending a unique suffix.
    return "custom-cli-integration-s3"


@pytest.fixture(scope="module")
def sagemaker_client():
    """Boto3 SageMaker client bound to the test region."""
    return boto3.client("sagemaker", region_name=REGION)


# --------- Custom Endpoint Tests ---------

def test_custom_create(runner, custom_endpoint_name):
    """Create a custom endpoint backed by S3-hosted model weights."""
    result = runner.invoke(custom_create, [
        "--namespace", NAMESPACE,
        "--version", VERSION,
        "--instance-type", "ml.c5.2xlarge",
        "--model-name", "test-model-integration-cli-s3",
        "--model-source-type", "s3",
        "--model-location", "hf-eqa",
        "--s3-bucket-name", BUCKET_LOCATION,
        "--s3-region", REGION,
        "--image-uri", "763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:2.3.0-transformers4.48.0-cpu-py311-ubuntu22.04",
        "--container-port", "8080",
        "--model-volume-mount-name", "model-weights",
        "--endpoint-name", custom_endpoint_name,
        "--resources-requests", '{"cpu": "3200m", "nvidia.com/gpu": 0, "memory": "12Gi"}',
        "--resources-limits", '{"nvidia.com/gpu": 0}',
        "--tls-certificate-output-s3-uri", TLS_LOCATION,
        "--env", '{ "SAGEMAKER_PROGRAM": "inference.py", "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code", "SAGEMAKER_CONTAINER_LOG_LEVEL": "20", "SAGEMAKER_MODEL_SERVER_TIMEOUT": "3600", "ENDPOINT_SERVER_TIMEOUT": "3600", "MODEL_CACHE_ROOT": "/opt/ml/model", "SAGEMAKER_ENV": "1", "SAGEMAKER_MODEL_SERVER_WORKERS": "1" }'
    ])
    assert result.exit_code == 0, result.output


def test_custom_list(runner, custom_endpoint_name):
    """The newly created endpoint appears in the namespace listing."""
    result = runner.invoke(custom_list, ["--namespace", NAMESPACE])
    assert result.exit_code == 0
    assert custom_endpoint_name in result.output


def test_custom_describe(runner, custom_endpoint_name):
    """Describe the endpoint with full output."""
    result = runner.invoke(custom_describe, [
        "--name", custom_endpoint_name,
        "--namespace", NAMESPACE,
        "--full"
    ])
    assert result.exit_code == 0
    assert custom_endpoint_name in result.output


def test_wait_until_inservice(custom_endpoint_name):
    """Poll the SDK until the custom endpoint reaches CreationCompleted.

    Fails fast when the deployment reports DeploymentFailed and times out
    after TIMEOUT_MINUTES.
    """
    print(f"[INFO] Waiting for endpoint '{custom_endpoint_name}' to reach CreationCompleted...")
    deadline = time.time() + (TIMEOUT_MINUTES * 60)
    poll_count = 0

    while time.time() < deadline:
        poll_count += 1
        print(f"[DEBUG] Poll #{poll_count}: Checking endpoint status...")

        try:
            ep = HPEndpoint.get(name=custom_endpoint_name, namespace=NAMESPACE)
            state = ep.status.endpoints.sagemaker.state
            print(f"[DEBUG] Current state: {state}")
            if state == "CreationCompleted":
                print("[INFO] Endpoint is in CreationCompleted state.")
                return

            deployment_state = ep.status.deploymentStatus.deploymentObjectOverallState
            if deployment_state == "DeploymentFailed":
                pytest.fail("Endpoint deployment failed.")

        except Exception as e:
            # Transient API/attribute errors while the custom resource is
            # still materializing are tolerated; keep polling until the
            # deadline.  (pytest.fail raises a BaseException subclass, so
            # it is not swallowed here.)
            print(f"[ERROR] Exception during polling: {e}")

        time.sleep(POLL_INTERVAL_SECONDS)

    pytest.fail("[ERROR] Timed out waiting for endpoint to reach CreationCompleted")


def test_custom_invoke(runner, custom_endpoint_name):
    """Invoke the deployed endpoint with an explicit content type."""
    result = runner.invoke(custom_invoke, [
        "--endpoint-name", custom_endpoint_name,
        "--body", '{"question" :"what is the name of the planet?", "context":"mars"}',
        "--content-type", "application/list-text"
    ])
    assert result.exit_code == 0
    assert "error" not in result.output.lower()


def test_custom_get_operator_logs(runner):
    """Operator logs for the last hour are retrievable."""
    result = runner.invoke(custom_get_operator_logs, ["--since-hours", "1"])
    assert result.exit_code == 0


def test_custom_list_pods(runner):
    """Pods in the test namespace are listable."""
    result = runner.invoke(custom_list_pods, ["--namespace", NAMESPACE])
    assert result.exit_code == 0


def test_custom_delete(runner, custom_endpoint_name):
    """Tear down the endpoint created by this module."""
    result = runner.invoke(custom_delete, [
        "--name", custom_endpoint_name,
        "--namespace", NAMESPACE
    ])
    assert result.exit_code == 0

test/integration_tests/conftest.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ def test_job_name():
2121
@pytest.fixture(scope="class")
2222
def image_uri():
2323
"""Return a standard PyTorch image URI for testing."""
24-
return "763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:2.2.0-cpu-py310-ubuntu20.04-sagemaker"
24+
return "448049793756.dkr.ecr.us-west-2.amazonaws.com/ptjob:mnist"
2525

2626
@pytest.fixture(scope="class")
2727
def cluster_name():

test/integration_tests/cli/test_cli_custom_inference.py renamed to test/integration_tests/inference/cli/test_cli_custom_inference.py

File renamed without changes.

test/integration_tests/cli/test_cli_jumpstart_inference.py renamed to test/integration_tests/inference/cli/test_cli_jumpstart_inference.py

File renamed without changes.

test/integration_tests/sdk/test_sdk_custom_inferece.py renamed to test/integration_tests/inference/sdk/test_sdk_custom_inferece.py

File renamed without changes.

test/integration_tests/sdk/test_sdk_jumpstart_inference.py renamed to test/integration_tests/inference/sdk/test_sdk_jumpstart_inference.py

File renamed without changes.

0 commit comments

Comments
 (0)