|
1 | | -# `trustyai-ragas` <br> Ragas as an Out-of-Tree Llama Stack Provider |
| 1 | +<p align="center"> |
| 2 | + <img src="https://raw.githubusercontent.com/trustyai-explainability/llama-stack-provider-ragas/main/docs/_static/provider-logo.png" alt="Llama Stack Provider" height="120"> |
| 3 | +</p> |
| 4 | + |
| 5 | +# Ragas as an External Provider for Llama Stack |
| 6 | + |
| 7 | +[![PyPI version](https://img.shields.io/pypi/v/llama-stack-provider-ragas)](https://pypi.org/project/llama-stack-provider-ragas/)
2 | 8 |
|
3 | | -⚠️ Warning! This project is in early stages of development! |
4 | 9 |
|
5 | 10 | ## About |
6 | 11 | This repository implements [Ragas](https://github.com/explodinggradients/ragas) as an out-of-tree [Llama Stack](https://github.com/meta-llama/llama-stack) evaluation provider. |
@@ -34,24 +39,58 @@ There are two versions of the provider: |
34 | 39 | ```bash |
35 | 40 | uv pip install -e ".[dev]" |
36 | 41 | ``` |
37 | | -- Run the Llama Stack server with the distribution configs. The distribution is a simple LS distribution that uses Ollama for inference and embeddings, and includes both the inline and remote Ragas providers. Counting the number of `run`s in this command is left as an exercise for the reader: |
38 | | - ```bash |
39 | | - dotenv run uv run llama stack run distribution/run.yaml |
40 | | - ``` |
| 42 | +- The sample LS distributions (one for the inline provider, one for the remote provider) are simple LS distributions that use Ollama for inference and embeddings. See the provider-specific sections below for setup and run commands.
| 43 | +
|
| 44 | +### Remote provider (default) |
| 45 | +
|
| 46 | +Create a `.env` file with the following: |
| 47 | +```bash |
| 48 | +# Required for both inline and remote |
| 49 | +EMBEDDING_MODEL=all-MiniLM-L6-v2 |
| 50 | +
|
| 51 | +# Required for remote provider |
| 52 | +KUBEFLOW_LLAMA_STACK_URL=<your-llama-stack-url> |
| 53 | +KUBEFLOW_PIPELINES_ENDPOINT=<your-kfp-endpoint> |
| 54 | +KUBEFLOW_NAMESPACE=<your-namespace> |
| 55 | +KUBEFLOW_BASE_IMAGE=quay.io/diegosquayorg/my-ragas-provider-image:latest |
| 56 | +KUBEFLOW_PIPELINES_TOKEN=<your-pipelines-token> |
| 57 | +KUBEFLOW_RESULTS_S3_PREFIX=s3://my-bucket/ragas-results |
| 58 | +KUBEFLOW_S3_CREDENTIALS_SECRET_NAME=<secret-name> |
| 59 | +``` |
| 60 | +
|
| 61 | +Where: |
| 62 | +- `KUBEFLOW_LLAMA_STACK_URL`: The URL of the llama stack server that the remote provider will use to run the evaluation (LLM generations and embeddings, etc.). If you are running Llama Stack locally, you can use [ngrok](https://ngrok.com/) to expose it to the remote provider. |
| 63 | +- `KUBEFLOW_PIPELINES_ENDPOINT`: You can get this via `kubectl get routes -A | grep -i pipeline` on your Kubernetes cluster. |
| 64 | +- `KUBEFLOW_NAMESPACE`: The name of the data science project where the Kubeflow Pipelines server is running. |
| 65 | +- `KUBEFLOW_PIPELINES_TOKEN`: Kubeflow Pipelines token with access to submit pipelines. If not provided, the token will be read from the local kubeconfig file. |
| 66 | +- `KUBEFLOW_BASE_IMAGE`: The image used to run the Ragas evaluation in the remote provider. See `Containerfile` for details. There is a public version of this image at `quay.io/diegosquayorg/my-ragas-provider-image:latest`. |
| 67 | +- `KUBEFLOW_RESULTS_S3_PREFIX`: S3 location (bucket and prefix folder) where evaluation results will be stored, e.g., `s3://my-bucket/ragas-results`. |
| 68 | +- `KUBEFLOW_S3_CREDENTIALS_SECRET_NAME`: Name of the Kubernetes secret containing AWS credentials with write access to the S3 bucket. Create with: |
| 69 | + ```bash |
| 70 | + oc create secret generic <secret-name> \ |
| 71 | + --from-literal=AWS_ACCESS_KEY_ID=your-access-key \ |
| 72 | + --from-literal=AWS_SECRET_ACCESS_KEY=your-secret-key \ |
| 73 | + --from-literal=AWS_DEFAULT_REGION=us-east-1 |
| 74 | + ``` |
| 75 | +
|
| 76 | +Run the server: |
| 77 | +```bash |
| 78 | +dotenv run uv run llama stack run distribution/run-remote.yaml |
| 79 | +``` |
| 80 | +
|
| 81 | +### Inline provider (specify `.inline` in the module name)
41 | 82 |
|
42 | | -### Inline provider |
| 83 | +Create a `.env` file with the required environment variable: |
| 84 | +```bash |
| 85 | +EMBEDDING_MODEL=all-MiniLM-L6-v2 |
| 86 | +``` |
43 | 87 |
|
44 | | -### Remote provider |
45 | | -- Create a `.env` file with the following: |
46 | | - - `LLAMA_STACK_URL` |
47 | | - - This is the url of the llama stack server that the remote provider will use to run the evaluation (LLM generations and embeddings, etc.). If you are running Llama Stack locally, you can use [ngrok](https://ngrok.com/) to expose it to the remote provider. |
48 | | - - `KUBEFLOW_PIPELINES_ENDPOINT` |
49 | | - - You can get this via `kubectl get routes -A | grep -i pipeline` on your Kubernetes cluster. |
50 | | - - `KUBEFLOW_NAMESPACE` |
51 | | - - This is the name of the data science project where the Kubeflow Pipelines server is running. |
52 | | - - `KUBEFLOW_BASE_IMAGE` |
53 | | - - This is the image used to run the Ragas evaluation in the remote provider. See `Containerfile` for details. There is a public version of this image at `quay.io/diegosquayorg/my-ragas-provider-image:latest`. |
| 88 | +Run the server: |
| 89 | +```bash |
| 90 | +dotenv run uv run llama stack run distribution/run-inline.yaml |
| 91 | +``` |
54 | 92 |
|
| 93 | +Note that `run-inline.yaml` sets the provider module to `llama_stack_provider_ragas.inline`, which selects the inline provider.
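Whichever provider you choose, once the server is up you can sanity-check that the Ragas provider was loaded. A minimal sketch, assuming the server listens on the default port `8321` (adjust the URL to match your distribution config):

```python
# Minimal smoke test: list registered providers and look for the Ragas eval provider.
# Assumes the server started by `llama stack run` is reachable at localhost:8321.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

for provider in client.providers.list():
    print(provider.api, provider.provider_id, provider.provider_type)
```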
55 | 94 |
|
56 | 95 | ## Usage |
57 | 96 | See the demos in the `demos` directory. |
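For a rough idea of what an evaluation run looks like against either provider, here is a sketch using the `llama_stack_client` eval APIs. The benchmark ID, dataset ID, metric name, and model below are placeholders, and the exact client method signatures depend on your llama-stack version; the notebooks in `demos` are the authoritative reference.

```python
# Sketch only: IDs, metric names, and the model are hypothetical placeholders.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Register a benchmark backed by the Ragas eval provider.
client.benchmarks.register(
    benchmark_id="ragas::demo",
    dataset_id="my-rag-dataset",  # a dataset you have registered beforehand
    scoring_functions=["answer_relevancy"],
)

# Run the evaluation. The inline provider executes it in-process; the remote
# provider submits a Kubeflow pipeline and writes results to the configured S3 prefix.
job = client.eval.run_eval(
    benchmark_id="ragas::demo",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "llama3.2:3b",  # any model served by your stack
            "sampling_params": {"strategy": {"type": "greedy"}, "max_tokens": 512},
        },
    },
)

# Poll for completion and inspect the result.
status = client.eval.jobs.status(job_id=job.job_id, benchmark_id="ragas::demo")
print(status)
```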