docs/source/examples/asymmetric_e5_model/README.md

# Multilingual E5 Small Model - SageMaker & OpenSearch Integration

Deploy the `intfloat/multilingual-e5-small` model to Amazon SageMaker and connect it to OpenSearch for semantic search.

## Project Structure

```
asymmetric_e5_model/
├── sagemaker_deployment/ # SageMaker model deployment
│ ├── deploy_cli.sh # Deploy to SageMaker
│ ├── validate_cli.sh # Validate SageMaker endpoint
│ ├── model-config.json # Model configuration
│ ├── inference.py # Custom inference code
│ └── README.md
├── opensearch_connector/ # OpenSearch integration
│ ├── setup_connector.sh # Setup connector (auto-detects local/managed)
│ ├── validate_connector.sh # Validate connector
│ └── README.md
└── README.md # This file
```

## Quick Start

### 1. Deploy to SageMaker
```bash
cd sagemaker_deployment
./deploy_cli.sh
./validate_cli.sh <endpoint-name>
```
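
If you want to exercise the endpoint directly (outside of `validate_cli.sh`), you can send it a request with the AWS CLI. This is a sketch; the payload shape matches what the OpenSearch connector sends, and `--cli-binary-format raw-in-base64-out` is only needed with AWS CLI v2:

```bash
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name <endpoint-name> \
  --content-type application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{"texts": ["What is machine learning?"], "content_type": "query"}' \
  output.json
cat output.json
```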

### 2. Set Up OpenSearch Connector
```bash
cd opensearch_connector
./setup_connector.sh <opensearch-endpoint> <sagemaker-endpoint-name>
./validate_connector.sh <opensearch-endpoint> <model-id>
```
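
Once the connector reports a model ID, embeddings can be generated directly through the ML Commons predict API (this mirrors what `validate_connector.sh` does); use `"content_type": "passage"` when embedding documents:

```bash
curl -s -X POST "<opensearch-endpoint>/_plugins/_ml/models/<model-id>/_predict" \
  -H "Content-Type: application/json" \
  -d '{
    "parameters": {
      "texts": ["What is machine learning?"],
      "content_type": "query"
    }
  }'
```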

## Prerequisites

- AWS CLI configured with appropriate permissions
- SageMaker execution role with necessary permissions
- OpenSearch cluster with ML Commons plugin enabled
- `jq` installed for JSON parsing

## Cost Considerations

- ml.m5.large: ~$0.115/hour (used by `deploy_cli.sh`)
- ml.t2.medium: ~$0.056/hour (lower-cost alternative for light testing)
- Use auto-scaling for production workloads (a sketch is shown below)
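
A minimal auto-scaling setup for the `primary` variant created by `deploy_cli.sh` could look like this (target tracking on invocations per instance; the capacity and target values are illustrative):

```bash
ENDPOINT_NAME=<endpoint-name>

aws application-autoscaling register-scalable-target \
  --service-namespace sagemaker \
  --resource-id endpoint/$ENDPOINT_NAME/variant/primary \
  --scalable-dimension sagemaker:variant:DesiredInstanceCount \
  --min-capacity 1 \
  --max-capacity 2

aws application-autoscaling put-scaling-policy \
  --service-namespace sagemaker \
  --resource-id endpoint/$ENDPOINT_NAME/variant/primary \
  --scalable-dimension sagemaker:variant:DesiredInstanceCount \
  --policy-name e5-invocations-scaling \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 100.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
    }
  }'
```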

docs/source/examples/asymmetric_e5_model/opensearch_connector/README.md

# OpenSearch Remote Connector

Connect a deployed SageMaker endpoint to OpenSearch for ML inference.

## Files

- `setup_connector.sh` - Setup connector (auto-detects local vs managed OpenSearch)
- `validate_connector.sh` - Validate connector functionality

## Prerequisites

- Deployed SageMaker endpoint
- OpenSearch cluster with the ML Commons plugin enabled (see the settings check below)
- `jq` installed for JSON parsing
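
Depending on your cluster's defaults, you may also need to confirm the plugin is present and allow the SageMaker runtime URL as a trusted connector endpoint. A sketch (these are standard ML Commons cluster settings; adjust the values and regex to your environment):

```bash
# Confirm the ML Commons plugin is installed
curl -s "<opensearch-endpoint>/_cat/plugins" | grep -i ml

# Allow connector calls to the SageMaker runtime; on single-node local clusters,
# also let ML tasks run on non-ML nodes
curl -s -X PUT "<opensearch-endpoint>/_cluster/settings" \
  -H "Content-Type: application/json" \
  -d '{
    "persistent": {
      "plugins.ml_commons.only_run_on_ml_node": false,
      "plugins.ml_commons.trusted_connector_endpoints_regex": [
        "^https://runtime\\.sagemaker\\..*\\.amazonaws\\.com/.*$"
      ]
    }
  }'
```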

## Usage

The setup script automatically detects whether you're using local or managed OpenSearch and picks the matching credentials (a sketch of this detection follows the usage commands below):

- **Local OpenSearch** (localhost/127.0.0.1): Uses AWS access key credentials
- **Managed OpenSearch** (AWS domain): Uses IAM role credentials

```bash
chmod +x setup_connector.sh validate_connector.sh
./setup_connector.sh <opensearch-endpoint> <sagemaker-endpoint-name>
./validate_connector.sh <opensearch-endpoint> <model-id>
```
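
For reference, the detection could be implemented roughly like this (a sketch only; the variable name is illustrative and the shipped script's logic may differ):

```bash
case "$OPENSEARCH_ENDPOINT" in
  *localhost*|*127.0.0.1*)
    # Local cluster: sign connector requests with access-key credentials
    CREDENTIAL_MODE="access_keys"
    ;;
  *)
    # Managed Amazon OpenSearch Service domain: use an IAM role instead
    CREDENTIAL_MODE="iam_role"
    ;;
esac
```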

## Examples

### Local OpenSearch
```bash
./setup_connector.sh http://localhost:9200 multilingual-e5-endpoint-1761349656
./validate_connector.sh http://localhost:9200 hMW9GJoBeER1e719aVX6
```

### Managed OpenSearch
```bash
./setup_connector.sh https://search-domain.us-east-1.es.amazonaws.com multilingual-e5-endpoint-1761349656
./validate_connector.sh https://search-domain.us-east-1.es.amazonaws.com abc123def456
```

docs/source/examples/asymmetric_e5_model/opensearch_connector/setup_connector.sh
#!/bin/bash

if [ $# -ne 2 ]; then
echo "Usage: $0 <opensearch-endpoint> <sagemaker-endpoint-name>"
echo "Example: $0 http://localhost:9200 multilingual-e5-endpoint-1761349656"
exit 1
fi

OPENSEARCH_ENDPOINT=$1
SAGEMAKER_ENDPOINT=$2
REGION="us-east-1"

echo "Setting up asymmetric E5 remote model connector with post-processing..."

# Get AWS credentials from the local AWS CLI configuration
AWS_ACCESS_KEY=$(aws configure get aws_access_key_id)
AWS_SECRET_KEY=$(aws configure get aws_secret_access_key)
AWS_SESSION_TOKEN=$(aws configure get aws_session_token)

if [ -z "$AWS_ACCESS_KEY" ] || [ -z "$AWS_SECRET_KEY" ]; then
echo "Could not read AWS credentials via 'aws configure get'; aborting."
exit 1
fi

# Create connector (the SageMaker endpoint already returns a flattened response, so no post_process_function is needed)
CONNECTOR_RESPONSE=$(curl -s -X POST "${OPENSEARCH_ENDPOINT}/_plugins/_ml/connectors/_create" \
-H "Content-Type: application/json" \
-d "{
\"name\": \"sagemaker-e5-asymmetric-connector\",
\"description\": \"Connector for multilingual-e5-small asymmetric model with flattened response\",
\"version\": \"1\",
\"protocol\": \"aws_sigv4\",
\"parameters\": {
\"region\": \"${REGION}\",
\"service_name\": \"sagemaker\"
},
\"credential\": {
\"access_key\": \"${AWS_ACCESS_KEY}\",
\"secret_key\": \"${AWS_SECRET_KEY}\",
\"session_token\": \"${AWS_SESSION_TOKEN}\"
},
\"actions\": [
{
\"action_type\": \"predict\",
\"method\": \"POST\",
\"url\": \"https://runtime.sagemaker.${REGION}.amazonaws.com/endpoints/${SAGEMAKER_ENDPOINT}/invocations\",
\"headers\": {
\"content-type\": \"application/json\"
},
\"request_body\": \"{ \\\"texts\\\": \${parameters.texts}, \\\"content_type\\\": \\\"\${parameters.content_type}\\\" }\"
}
]
}")

CONNECTOR_ID=$(echo "$CONNECTOR_RESPONSE" | jq -r '.connector_id')

if [ "$CONNECTOR_ID" = "null" ] || [ -z "$CONNECTOR_ID" ]; then
echo "Failed to create connector:"
echo "$CONNECTOR_RESPONSE"
exit 1
fi

echo "✓ Connector created with post-processing: $CONNECTOR_ID"

# Register model with asymmetric identifiers
MODEL_RESPONSE=$(curl -s -X POST "${OPENSEARCH_ENDPOINT}/_plugins/_ml/models/_register" \
-H "Content-Type: application/json" \
-d "{
\"name\": \"e5-asymmetric-remote\",
\"function_name\": \"remote\",
\"connector_id\": \"${CONNECTOR_ID}\",
\"model_config\": {
\"model_type\": \"text_embedding\",
\"embedding_dimension\": 384,
\"framework_type\": \"SENTENCE_TRANSFORMERS\",
\"additional_config\": {
\"space_type\": \"l2\",
\"is_asymmetric\": true,
\"model_family\": \"e5\",
\"query_prefix\": \"query: \",
\"passage_prefix\": \"passage: \"
}
}
}")

TASK_ID=$(echo "$MODEL_RESPONSE" | jq -r '.task_id')

# Wait for the registration task to complete, then look up the model ID
sleep 10
MODEL_ID=$(curl -s -X GET "${OPENSEARCH_ENDPOINT}/_plugins/_ml/tasks/$TASK_ID" | jq -r '.model_id')

if [ "$MODEL_ID" = "null" ] || [ -z "$MODEL_ID" ]; then
echo "Model registration did not complete. Response:"
echo "$MODEL_RESPONSE"
echo "Check the task manually: GET ${OPENSEARCH_ENDPOINT}/_plugins/_ml/tasks/$TASK_ID"
exit 1
fi

# Deploy model
curl -s -X POST "${OPENSEARCH_ENDPOINT}/_plugins/_ml/models/$MODEL_ID/_deploy" > /dev/null
sleep 15

echo "✓ Model deployed: $MODEL_ID"
echo ""
echo "Run validation: ./validate_connector.sh $OPENSEARCH_ENDPOINT $MODEL_ID"

docs/source/examples/asymmetric_e5_model/opensearch_connector/validate_connector.sh
#!/bin/bash

if [ $# -ne 2 ]; then
echo "Usage: $0 <opensearch-endpoint> <model-id>"
echo "Example: $0 http://localhost:9200 abc123"
exit 1
fi

OPENSEARCH_ENDPOINT=$1
MODEL_ID=$2

echo "Validating asymmetric remote model with OpenSearch ML Commons format..."

# Check model config
echo "Model configuration:"
curl -s -X GET "${OPENSEARCH_ENDPOINT}/_plugins/_ml/models/$MODEL_ID" | jq '.model_config.additional_config'

# Test query embedding
echo -e "\nTesting query embedding..."
QUERY_RESPONSE=$(curl -s -X POST "${OPENSEARCH_ENDPOINT}/_plugins/_ml/models/$MODEL_ID/_predict" \
-H "Content-Type: application/json" \
-d '{
"parameters": {
"texts": ["What is machine learning?"],
"content_type": "query"
}
}')

# With the simplified (flattened) format, the embedding is wrapped in an array, so read response[0]
QUERY_DIM=$(echo "$QUERY_RESPONSE" | jq -r '.inference_results[0].output[0].dataAsMap.response[0] | length' 2>/dev/null)

# Test passage embedding
echo "Testing passage embedding..."
PASSAGE_RESPONSE=$(curl -s -X POST "${OPENSEARCH_ENDPOINT}/_plugins/_ml/models/$MODEL_ID/_predict" \
-H "Content-Type: application/json" \
-d '{
"parameters": {
"texts": ["Machine learning is a subset of artificial intelligence."],
"content_type": "passage"
}
}')

PASSAGE_DIM=$(echo "$PASSAGE_RESPONSE" | jq -r '.inference_results[0].output[0].dataAsMap.response[0] | length' 2>/dev/null)

# Validation results
if [ "$QUERY_DIM" != "null" ] && [ "$PASSAGE_DIM" != "null" ] && [ "$QUERY_DIM" -gt 0 ] && [ "$PASSAGE_DIM" -gt 0 ]; then
echo -e "\n✓ Validation successful with flattened response!"
echo "✓ Query embedding dimension: $QUERY_DIM"
echo "✓ Passage embedding dimension: $PASSAGE_DIM"
echo "✓ Post-processing function working correctly (no processing needed)"
echo "✓ Asymmetric remote model ready for neural-search"
else
echo -e "\n✗ Validation failed"
echo "Query response: $QUERY_RESPONSE"
echo "Passage response: $PASSAGE_RESPONSE"
exit 1
fi

docs/source/examples/asymmetric_e5_model/sagemaker_deployment/README.md

# SageMaker Deployment

Deploy the `intfloat/multilingual-e5-small` model to Amazon SageMaker.

## Files

- `deploy_cli.sh` - Deploy model to SageMaker endpoint
- `validate_cli.sh` - Validate deployed endpoint
- `model-config.json` - Model configuration template
- `inference.py` - Custom inference code (for future use)

## Usage

### Deploy Model
```bash
chmod +x deploy_cli.sh
./deploy_cli.sh
```

### Validate Deployment
```bash
chmod +x validate_cli.sh
./validate_cli.sh <endpoint-name>
```

## Example
```bash
./deploy_cli.sh
./validate_cli.sh multilingual-e5-endpoint-1761349656
```

## Cleanup
```bash
aws sagemaker delete-endpoint --endpoint-name <endpoint-name>
aws sagemaker delete-endpoint-config --endpoint-config-name <config-name>
aws sagemaker delete-model --model-name <model-name>
```
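
The endpoint config and model names include a deployment timestamp; if you didn't note them, they can be looked up from the endpoint before deleting (assumes the AWS CLI is configured for the same region):

```bash
CONFIG_NAME=$(aws sagemaker describe-endpoint \
  --endpoint-name <endpoint-name> \
  --query EndpointConfigName --output text)
MODEL_NAME=$(aws sagemaker describe-endpoint-config \
  --endpoint-config-name $CONFIG_NAME \
  --query 'ProductionVariants[0].ModelName' --output text)
echo "Endpoint config: $CONFIG_NAME, model: $MODEL_NAME"
```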

docs/source/examples/asymmetric_e5_model/sagemaker_deployment/deploy_cli.sh
#!/bin/bash

# Set variables
TIMESTAMP=$(date +%s)
MODEL_NAME="multilingual-e5-small-$TIMESTAMP"
ENDPOINT_CONFIG_NAME="multilingual-e5-config-$TIMESTAMP"
ENDPOINT_NAME="multilingual-e5-endpoint-$TIMESTAMP"
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/Admin"  # adjust to your SageMaker execution role
REGION="us-east-1"
BUCKET="sagemaker-$REGION-$ACCOUNT_ID"

# Create code package
echo "Creating code package..."
tar -czf model.tar.gz inference.py

# Upload to S3
echo "Uploading code to S3..."
aws s3 cp model.tar.gz s3://$BUCKET/$MODEL_NAME/model.tar.gz

# Create temporary model config
sed "s/MODEL_TIMESTAMP/$TIMESTAMP/g; s|ROLE_ARN_PLACEHOLDER|$ROLE_ARN|g; s/ACCOUNT_ID/$ACCOUNT_ID/g" model-config.json > temp-model-config.json

# Create model
aws sagemaker create-model \
--region $REGION \
--cli-input-json file://temp-model-config.json

if [ $? -ne 0 ]; then
echo "Failed to create model"
rm -f temp-model-config.json model.tar.gz
exit 1
fi

# Create endpoint configuration
aws sagemaker create-endpoint-config \
--region $REGION \
--endpoint-config-name $ENDPOINT_CONFIG_NAME \
--production-variants VariantName=primary,ModelName=$MODEL_NAME,InitialInstanceCount=1,InstanceType=ml.m5.large,InitialVariantWeight=1

if [ $? -ne 0 ]; then
echo "Failed to create endpoint config"
rm -f temp-model-config.json model.tar.gz
exit 1
fi

# Create endpoint
aws sagemaker create-endpoint \
--region $REGION \
--endpoint-name $ENDPOINT_NAME \
--endpoint-config-name $ENDPOINT_CONFIG_NAME

if [ $? -ne 0 ]; then
echo "Failed to create endpoint"
rm -f temp-model-config.json model.tar.gz
exit 1
fi

echo "Deployment initiated:"
echo "Model: $MODEL_NAME"
echo "Endpoint Config: $ENDPOINT_CONFIG_NAME"
echo "Endpoint: $ENDPOINT_NAME"

# Wait for endpoint to be in service
echo "Waiting for endpoint to be ready..."
aws sagemaker wait endpoint-in-service --region $REGION --endpoint-name $ENDPOINT_NAME

if [ $? -eq 0 ]; then
echo "Endpoint $ENDPOINT_NAME is ready!"
echo "You can now validate with: ./validate_cli.sh $ENDPOINT_NAME"
else
echo "Endpoint deployment failed or timed out"
fi

# Cleanup
rm -f temp-model-config.json model.tar.gz