87 changes: 56 additions & 31 deletions README.md
# modelplane - an AI evaluator development platform

## ⚠️ Content warning

These data come with the following warning:
>Consider carefully whether you need to view the prompts and responses, limit exposure to what's necessary, take regular breaks, and stop if you feel uncomfortable.
>For more information on the risks, see [this literature review](https://www.zevohealth.com/wp-content/uploads/2024/07/lit_review_IN-1.pdf) on vicarious trauma.

## Quickstart

You must have a Docker engine installed on your system. The provided
docker-compose.yaml defines the following services for running locally:

* mlflow tracking server + postgres
* jupyter

First, clone this repo:
```bash
git clone https://github.com/mlcommons/modelplane.git
cd modelplane
```

If you plan to share notebooks, also clone
[modelplane-flights](https://github.com/mlcommons/modelplane-flights). Both `modelplane`
and `modelplane-flights` should sit side by side in the same parent directory.

Finally, set up secrets for accessing SUTs, as needed in
`modelplane/flightpaths/config/secrets.toml`. See [modelbench](https://github.com/mlcommons/modelbench) for more details.
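What goes in `secrets.toml` depends on which SUTs you enable; the section and key names below are illustrative placeholders, not the authoritative schema (see the modelbench docs for the exact names your SUTs require):

```toml
# flightpaths/config/secrets.toml -- illustrative sketch only.
# Section and key names depend on the SUTs you use; see modelbench docs.
[together]
api_key = "your-together-api-key"
```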


### Running jupyter locally against the MLCommons MLflow server

1. Ensure you have access to the MLCommons MLflow tracking
   and artifact server. If not, email
   [[email protected]](mailto:[email protected])
   for access.
1. Modify `.env.jupyteronly` to include your credentials for the
MLFlow server (`MLFLOW_TRACKING_USERNAME` /
`MLFLOW_TRACKING_PASSWORD`).
* Alternatively, put the credentials in `~/.mlflow/credentials` as described [here](https://mlflow.org/docs/latest/ml/auth/#credentials-file).
1. To access `modelbench-private` code (assuming you have
   access), you must also set `USE_MODELBENCH_PRIVATE=true` in `.env.jupyteronly`. This forwards your SSH agent to the container,
   allowing it to clone the private repository when building the image.
1. Start jupyter with `./start_jupyter.sh`. (You can add the
`-d` flag to start in the background.)
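As a sketch of the credentials-file alternative mentioned above (values are placeholders; per MLflow's auth docs, this INI-style file is read from `~/.mlflow/credentials`):

```ini
# ~/.mlflow/credentials -- placeholder values
[mlflow]
mlflow_tracking_username = your-username
mlflow_tracking_password = your-password
```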

### Running jupyter and MLflow locally

1. Adjust the `.env` file as needed. The committed `.env` /
   `docker-compose.yaml` will bring up MLflow, Postgres, and Jupyter, with MLflow configured to use local disk for artifact storage.
1. Start services with `./start_services.sh`. (You can add the
`-d` flag to start in the background.)

* If you are using the CLI only, and not Jupyter, you must pass the `--no-jupyter` option:
  `./start_services.sh -d --no-jupyter`

## Getting started in JupyterLab

1. Visit the [Jupyter Server](http://localhost:8888/lab?token=changeme). The
token is configured in the .env file. You shouldn't need to enter it
more than once (until the server is restarted). You can get started with
the template notebook or create a new one.
1. Runs can be monitored in MLflow wherever you have it set up; with the
   default local setup, that's http://localhost:8080.
1. You should see the `flights` directory, which maps to the
   `modelplane-flights` repository. Create a user directory
   for yourself (`flights/users/{username}`) and either
   copy an existing flightpath there or create a notebook from
   scratch.
* You can manage branches and commits for
`modelplane-flights` directly from jupyter.
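From a terminal (in JupyterLab or on the host, inside the `modelplane` checkout), creating your user directory can look like this; `alice` is a placeholder username:

```shell
# Create a personal workspace under flights/users ("alice" is a placeholder).
mkdir -p flights/users/alice
ls flights/users
```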

## CLI

You can also interact with modelplane via CLI. Run `poetry run modelplane --help`
for more details.

*Important:* You must set the `MLFLOW_TRACKING_URI` environment variable.
For example, if you've brought up MLflow using the fully local docker compose process above,
you could run:
```
MLFLOW_TRACKING_URI=http://localhost:8080 poetry run modelplane get-sut-responses --sut_id {sut_id} --prompts tests/data/prompts.csv --experiment expname
```
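Rather than prefixing each command, you can export the variable once per shell session; the URI below assumes the fully local docker compose setup:

```shell
# Export once; later modelplane invocations in this shell inherit it.
export MLFLOW_TRACKING_URI=http://localhost:8080
echo "$MLFLOW_TRACKING_URI"
# then e.g.: poetry run modelplane --help
```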
2 changes: 2 additions & 0 deletions docker-compose.yaml
```yaml
    ports:
      - "8888:8888"
    volumes:
      - ./flightpaths:/app/flightpaths
      # Volume not needed if not using modelplane-flights for sharing notebooks
      - ../modelplane-flights:/app/flightpaths/flights
      # Volume not needed if using cloud storage for artifacts
      - ./mlruns:/mlruns
      # Below needed for dvc (via git) support (backed by GCP)
```
69 changes: 68 additions & 1 deletion poetry.lock


1 change: 1 addition & 0 deletions pyproject.toml
```toml
jsonlines = "^4"
numpy = "^2"
matplotlib = "^3"
jupyter = "^1"
jupyterlab-git = "*"
scikit-learn = "^1.5.0"
pandas = "^2.2.2"
# plugins (would like to figure out a better way to manage these)
```