Refactor input handling to log inputs as artifacts for the current run. #71

superdosh · 2025-08-22T14:51:01Z

Implements first two bullets of #68

github-actions · 2025-08-22T14:51:09Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

bkorycki · 2025-08-26T23:17:20Z

src/modelplane/utils/input.py

-        pass
+    def log_artifact(self, current_run_id: str):
+        """Log the dataset to MLflow as an artifact for the given `current_run_id`."""
+        mlflow.log_artifact(str(self.local_path()), run_id=current_run_id)


Why only log the local path? Can we also log the URL if it's a DVC input, or the source run if the input is an artifact of a previous run?

@bkorycki str(self.local_path()) just gives MLflow the location of the file. The whole file is logged to MLflow, so it's visible and browsable in that interface.

One thing we're losing is the details of the file's origin. I could keep that as tags on the run, maybe? Like an input tag with a nice string that represents the input?

Ah, got it. Yes, I think it would be nice to keep track of the provenance of these data files. Maybe log_input makes the most sense for this?

mlflow.log_input is just kind of annoying to deal with. (We had to have all that weird metadata stuff.) I feel like we could be much more human-friendly with tags?

I think we could log all the input params that we use to load the file. That might be simplest. Let me mock it up in a commit!
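For readers following along, the idea of logging the load parameters as tags could look roughly like the sketch below. This is not the code from the PR: `DVCInputParams`, `input_tags`, `log_input_origin`, and the `input.` tag prefix are hypothetical names made up for illustration. Only `mlflow.active_run` and `mlflow.set_tags` are real MLflow calls.

```python
from dataclasses import asdict, dataclass


@dataclass
class DVCInputParams:
    """Hypothetical parameters used to load a DVC-tracked input (not from the PR)."""

    path: str
    repo: str
    rev: str = "main"


def input_tags(params, prefix: str = "input") -> dict[str, str]:
    """Flatten the load parameters into human-readable tag key/value pairs."""
    # Record which kind of input this was, then one tag per load parameter.
    tags = {f"{prefix}.type": type(params).__name__}
    for key, value in asdict(params).items():
        tags[f"{prefix}.{key}"] = str(value)
    return tags


def log_input_origin(params) -> None:
    """Attach the input's origin to the active MLflow run as tags."""
    import mlflow  # imported lazily so input_tags() works without MLflow installed

    # Assumption: like the final version of this PR, require an active run
    # rather than accepting a run ID.
    if mlflow.active_run() is None:
        raise RuntimeError("An active MLflow run is required to log input tags.")
    mlflow.set_tags(input_tags(params))
```

Calling `log_input_origin(DVCInputParams(path="data/prompts.csv", repo="https://example.com/repo.git"))` inside a `with mlflow.start_run():` block would then show one tag per parameter on the run page, which keeps provenance readable without the dataset metadata that `mlflow.log_input` needs.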

Updated; results in something like this:

Have a look @bkorycki

Looks great! Thanks :)

Refactor input handling to log inputs as artifacts for the current run.

65b99a6

Clean up input tests.

682bc31

superdosh marked this pull request as ready for review August 22, 2025 19:43

superdosh requested a review from a team as a code owner August 22, 2025 19:43

superdosh requested a review from bkorycki August 22, 2025 19:44

bkorycki reviewed Aug 26, 2025

superdosh added 2 commits August 27, 2025 13:15

Log input origin as tags.

176157b

Remove current_run_id parameter from build_and_log_input calls and enforce MLflow active run requirement

a6d75a8

superdosh requested a review from bkorycki August 27, 2025 17:45

bkorycki approved these changes Aug 27, 2025

superdosh merged commit dc5400f into main Aug 27, 2025
3 checks passed

superdosh deleted the data-artifact-mgmt branch August 27, 2025 19:05

github-actions bot locked and limited conversation to collaborators Aug 27, 2025
