Trinity-Mini-DrugProt-Think
RLVR (GRPO) + LoRA post-training on Arcee Trinity Mini for DrugProt relation classification.

📝 Report   |   AWS deployment guide   |   Hugging Face Model

This repo contains two distinct tracks:

  1. Experiments: RL training configs and artifacts for DrugProt.
  2. Serving: Deploy a Hugging Face base model + LoRA adapter to a SageMaker real-time endpoint (SageMaker SDK v3).

Repo layout

  • experiments/configs/rl/: RL experiment configs (*.toml).
  • index.html: blog post (static HTML + Chart.js), served from the repo root.
  • data/: exported metrics CSVs used by the blog post charts.
  • experiments/reports/: supplementary writeups (deployment guide, etc.).
  • serving/lora_inference/: SageMaker v3 InferenceSpec + container requirements.
  • serving/scripts/: deploy/test/delete endpoint scripts.

Experiments

Prerequisites

  1. Prime CLI installed (check with prime --version), e.g. via uv tool install prime.
  2. Logged in (prime login).
  3. W&B API key available.

Secrets

Configs use env_files = ["secrets.env"], so put the secrets file next to the configs:

  • experiments/configs/rl/secrets.env (gitignored), containing:

WANDB_API_KEY=your_key_here

Run a baseline

  • Baseline config: experiments/configs/rl/w1_alpha16_baseline.toml

prime rl run experiments/configs/rl/w1_alpha16_baseline.toml

Monitor a run

prime rl progress <run_id>
prime rl logs <run_id> -f
prime rl metrics <run_id>
prime rl distributions <run_id> --type rewards
prime rl rollouts <run_id> --step <step> -n 50

Serving (SageMaker real-time endpoint)

Deploy a Hugging Face base model with a LoRA adapter to a SageMaker real-time endpoint using SageMaker SDK v3 (ModelBuilder + InferenceSpec).

What’s implemented

  • serving/lora_inference/spec.py: loads base model + LoRA adapter and runs generation.
  • serving/lora_inference/requirements.txt: container-time inference dependencies.
  • serving/scripts/deploy_lora_endpoint.py: deploy/update endpoint.
  • serving/scripts/test_lora_endpoint.py: invoke endpoint.
  • serving/scripts/delete_lora_endpoint.py: delete endpoint.

Prerequisites

  • AWS credentials configured locally (~/.aws/credentials or env vars).
  • SageMaker execution role ARN with model/endpoint permissions.
  • Optional (private Hugging Face repos): HF_TOKEN env var.

Install local dependencies

uv sync

Deploy from a Hugging Face adapter repo

export SAGEMAKER_ROLE_ARN="arn:aws:iam::<account-id>:role/<sagemaker-role>"
export HF_TOKEN="hf_xxx"   # optional for private repos

uv run python serving/scripts/deploy_lora_endpoint.py \
  --endpoint-name trinity-mini-drugprot-think \
  --adapter-id lokahq/Trinity-Mini-DrugProt-Think \
  --role-arn "$SAGEMAKER_ROLE_ARN" \
  --instance-type ml.g5.2xlarge

Deploy from a local adapter directory

uv run python serving/scripts/deploy_lora_endpoint.py \
  --endpoint-name trinity-mini-drugprot-think-local \
  --adapter-id ./adapter \
  --role-arn "$SAGEMAKER_ROLE_ARN" \
  --instance-type ml.g5.2xlarge

If ./adapter/adapter_config.json exists, the server will resolve the base model from the adapter metadata. You can always override with --base-model-id <hf-model-id>.

Update an existing endpoint

uv run python serving/scripts/deploy_lora_endpoint.py \
  --endpoint-name trinity-mini-drugprot-think \
  --adapter-id lokahq/Trinity-Mini-DrugProt-Think \
  --role-arn "$SAGEMAKER_ROLE_ARN" \
  --update-endpoint

Test invocation

uv run python serving/scripts/test_lora_endpoint.py \
  --endpoint-name trinity-mini-drugprot-think \
  --prompt "Give me one practical use-case of LoRA adapters in production." \
  --max-new-tokens 120 \
  --temperature 0.7 \
  --top-p 0.95

Payload contract

Request JSON:

{
  "inputs": "string prompt",
  "max_new_tokens": 256,
  "temperature": 0.7,
  "top_p": 0.95,
  "do_sample": true
}

Response JSON:

{
  "generated_text": "...",
  "full_text": "...",
  "model_id": "<hf-base-model-id>",
  "adapter_id": "<adapter path or repo id>",
  "model_name": "Trinity-Mini-DrugProt-Think"
}
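The contract above can also be exercised directly from Python. A hedged sketch, assuming the standard boto3 sagemaker-runtime client; build_payload and parse_response are illustrative helpers, not part of the repo's scripts:

```python
import json


def build_payload(prompt: str, max_new_tokens: int = 256,
                  temperature: float = 0.7, top_p: float = 0.95,
                  do_sample: bool = True) -> bytes:
    """Serialize a request body matching the endpoint's payload contract."""
    return json.dumps({
        "inputs": prompt,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "do_sample": do_sample,
    }).encode("utf-8")


def parse_response(body: bytes) -> str:
    """Extract the completion text from a response body."""
    return json.loads(body)["generated_text"]


# With boto3 this would be wired up roughly as (not run here):
#   client = boto3.client("sagemaker-runtime")
#   resp = client.invoke_endpoint(
#       EndpointName="trinity-mini-drugprot-think",
#       ContentType="application/json",
#       Body=build_payload("string prompt"),
#   )
#   text = parse_response(resp["Body"].read())
```

Any field omitted from the request falls back to the server-side default shown in the contract above.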

Acknowledgements