-
Notifications
You must be signed in to change notification settings - Fork 16
Adding artifacts upload through S3/Minio #71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
pebeto
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the initiative to make this library support remote artifact logging in S3.
I can't accept these changes until you implement a valid test suite. You can use BrokenRecord.jl to reproduce HTML requests.
|
What about a test that creates an experiment and logs a text file and a figure.
Another question: Should the s3 logging be a package extension? There are multiple ways to use S3, for example through minio (which I have setup and can test), or vanilla AWS S3. For testing, it may be useful to define a reproducible test environment with mlflow and minio (S3) in a docker container. |
|
The described tests are okay. Is it possible to avoid adding a new package dependency by calling the API directly? Mostly to avoid pulling more from registries. It's okay to setup a MiniO instance in the CI pipeline, but I don't consider it practical for local testing. Let's see recording the request response and mocking it using BrokenRecord.jl. If that's not possible, we can think about adding one more development dependency. |
|
Cool, I'll set up the tests. Do you mean not adding I looked into BrokenRecord.jl, but I think it's a bit cumbersome. Doing it this way requires to manually handle traffic logging and keep track of the files. It's also locked in to my particular setup. A more comfortable way would be to define the extend the .devcontainer/compose.yaml. I'm trying out the setup below right now, it seems to be fine for local use. Minio doesn't add too much overhead. |
|
Ok, I got the unit tests passing with the docker file above. But with this setup, using
|
|
Oh, the unit tests probably fail because the docker image is not using |
Yes, because those operations are just compatible REST API calls, right? The docker-compose way to setup the development environment is great. Thanks to set it up!
The Authentication API is documented here, and to make it work you need to configure the instance as the CI pipeline does |
.github/workflows/CI.yml
Outdated
| python3 /opt/hostedtoolcache/Python/3.12.3/x64/bin/mlflow server --app-name basic-auth --host 0.0.0.0 --port 5000 & | ||
| sleep 5 | ||
| - name: Start services | ||
| run: docker-compose -f docker compose.test.yaml up -d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The custom mlflow setup is related to the way lts and release CI pipelines treat the instance. Instead of changing it, you could setup MiniO using minio-action.
| - uses: julia-actions/julia-buildpkg@v1 | ||
| - uses: julia-actions/julia-runtest@v1 | ||
| env: | ||
| JULIA_NUM_THREADS: '2' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change required? Also the URI must not be modified.
test/assets/julia.png
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about generating an image instead of saving one to the repository? You can add Plots.jl to the test suite, create a plot and save it as a figure under execution.
test/runtests.jl
Outdated
| `docker-compose .devcontainers/compose.yaml up`. | ||
| Then set the environment variables | ||
| MLFLOW_TRACKING_URI="http://localhost:5050/api" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case we setup the development environment as required, this message won't be necessarily. I mean, it's okay to warn the user to run the tests with the docker setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Thanks for the review. I had some unit tests commented out because I didn't have mlflow[auth] set up in docker. It's set up now and all unit tests pass. I'll take a closer look at the minio-action you suggested above. Also the changes like the threads are just me not knowing what is going on yet, I'll revert them. Regarding the plots.jl test-suite: For figures (either Plots.jl or Makie.jl) a package extension would be cool. That way neither plotting package is a direct dependency for |
|
Ok, all commits on this should be included in this PR (used 'git pull origin fetch'). Am I missing anything? |
This PR adds support for uploading artifacts to an S3 bucket associated with a run.
Example usage:
The PR also
createrunto add the current time ifstart_timeis missing.data["output"]when creating a run.I do not know how to test
logartifactexcept that "it works on my machine". I've been using it a bit to log pngs and seems to do fine.