The gist of experiments is to compare them. We need a diff facility to compare inputs and results across runs.
We don't have to limit this comparison to defined outputs in the pipeline. Any file changed between two runs can be diffed.
There can be three types of diffs:
Unstructured diffs: This is for binary files that we don't recognize. Only the content digest is reported.
Structured diffs: For a file format that we can parse, we can report the individual differences across runs. JSON, YAML, or any other result format that we can parse can be reported as a structured diff.
Text diffs: This is for the source code files that may have led to changes in other files.
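The three diff types could be sketched roughly as below. This is only an illustration, not Xvc's implementation: the function names, the `kind` labels, the `MISSING` marker, the choice of SHA-256 as the digest, and the use of JSON as the only structured format are all assumptions made for the example.

```python
import difflib
import hashlib
import json

MISSING = "<missing>"  # hypothetical marker for a key present on one side only


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into {"a.b.0": leaf} pairs."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


def diff_files(kind, old: bytes, new: bytes):
    """Return a JSON-serializable diff for one file, depending on its kind."""
    if kind == "structured":
        # Report only the keys whose values differ between the two runs.
        a, b = flatten(json.loads(old)), flatten(json.loads(new))
        return {
            key: [a.get(key, MISSING), b.get(key, MISSING)]
            for key in sorted(a.keys() | b.keys())
            if a.get(key, MISSING) != b.get(key, MISSING)
        }
    if kind == "text":
        # Line-based diff in unified format, similar to what Git prints.
        return list(
            difflib.unified_diff(
                old.decode().splitlines(),
                new.decode().splitlines(),
                "old",
                "new",
                lineterm="",
            )
        )
    # Unstructured: only the content digests are reported.
    return {
        "old": hashlib.sha256(old).hexdigest(),
        "new": hashlib.sha256(new).hexdigest(),
    }
```

Every branch returns plain dicts, lists, and strings, so the result can be dumped to JSON directly.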
The workflow is as follows:
User has a bunch of files, source, params, data, model, etc.
User modifies some of these manually, e.g., by updating the source code.
User modifies some of these with the xvc exp run --input-param command.
User runs a command (or pipeline) on the files.
Xvc clones/rechecks/copies files from the original to a .xvc/exp/KEYWORD-RANDOMSTRING-TIMESTAMP directory.
Xvc links the original cache.
Xvc creates a .xvc-exp directory to store experiment specific data.
Xvc modifies the files with the given modification option.
--input-param params.yaml params.my-param 123,124,135 creates 3 experiments, each changing params.yaml::params.my-param to a given value.
Xvc runs the given command (or pipeline) in the directory.
Xvc stores the updated artifacts in the common cache, symlinking the results.
User asks for results diffed from the original.
Xvc compares each of the directories for the changed files.
Xvc shows digest strings for unstructured files.
Xvc shows changed values for structured files.
Xvc shows diffs for text files, similar to Git.
All results must be reported in JSON. Tables may be built from this JSON.
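As a rough sketch of the comparison step, the snippet below walks two directories, finds the files whose content differs, and emits the result as JSON. The report schema (`path`, `old`, `new`), the function names, and the use of SHA-256 are assumptions for illustration; nothing here is fixed by the proposal.

```python
import hashlib
import json
from pathlib import Path


def digests(root: Path) -> dict:
    """Map each file path (relative to root) to a SHA-256 content digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(original: Path, experiment: Path) -> str:
    """Return a JSON list of files that were added, removed, or modified."""
    old, new = digests(original), digests(experiment)
    report = [
        # A digest of None (null in JSON) means the file is absent on that side.
        {"path": path, "old": old.get(path), "new": new.get(path)}
        for path in sorted(old.keys() | new.keys())
        if old.get(path) != new.get(path)
    ]
    return json.dumps(report, indent=2)
```

A table view can then be built by loading this JSON and printing one row per entry.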
The second facility xvc exp provides is to modify structured files quickly for each experiment.
xvc exp run --input-param file.yaml dict.key value1,value2,value3 will parse file.yaml, update dict.key with value1 and run an experiment, update with value2 and run another, update with value3 and run another.
xvc exp run --input-param file.json dict.key '0;5;100' (read as start;step;end) will run experiments with 0,5,10,15,20,...,100 (inclusive).
Files to be modified are JSON, YAML 1.2 and TOML files. (Anything serde can read/write is possible in theory.)
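The value specification and the key update could look roughly like this. It assumes the semicolon form means start;step;end with an inclusive end, as the examples suggest, and that keys are addressed with dots; `expand_values` and `set_key` are hypothetical names. `Decimal` is used so that steps such as 0.1 do not accumulate floating-point error.

```python
from decimal import Decimal


def expand_values(spec: str) -> list:
    """Expand 'a,b,c' into a list, or 'start;step;end' into an inclusive range."""
    if ";" in spec:
        start, step, end = (Decimal(part) for part in spec.split(";"))
        if step <= 0:
            raise ValueError("step must be positive")
        count = int((end - start) / step) + 1
        return [str(start + i * step) for i in range(count)]
    return spec.split(",")


def set_key(data: dict, dotted_key: str, value) -> None:
    """Set data['a']['b'] = value for dotted_key 'a.b' (parents must exist)."""
    *parents, leaf = dotted_key.split(".")
    node = data
    for name in parents:
        node = node[name]
    node[leaf] = value
```

Each expanded value would produce one experiment: parse the file, call `set_key` on the parsed data, and write it back before running.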
We can extend this functionality to regex. --input-regex file.txt 'my_var = (.*)' '0;0.1;1' replaces the text matched by the first capture group ($1) with each of the values.
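One way to read "updates $1" is to replace only the text captured by group 1 and leave the rest of the match untouched. The sketch below does that for the first match in the file; the function name and the first-match-only behavior are assumptions.

```python
import re


def replace_group(text: str, pattern: str, value: str) -> str:
    """Replace capture group 1 of the first match of pattern with value."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern {pattern!r} not found")
    start, end = match.span(1)
    return text[:start] + value + text[end:]
```

For example, `replace_group("my_var = 3\n", r"my_var = (.*)", "0.1")` keeps the `my_var = ` prefix and swaps only the captured value.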
We can also use --command-template for this. xvc exp run --command-template 'python train.py ${{EXP_VALUE}}' '0;0.2;10' will run python train.py with parameters 0, 0.2, 0.4, ... in different experiments.
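Template substitution can be as simple as a literal string replacement of the `${{NAME}}` placeholders. The helper name below is hypothetical, and no shell quoting of the substituted values is attempted in this sketch.

```python
def render_command(template: str, values: dict) -> str:
    """Replace each ${{NAME}} placeholder with the corresponding value."""
    for name, value in values.items():
        template = template.replace("${{" + name + "}}", str(value))
    return template
```

Running this once per value yields one command line per experiment.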
If there is more than one --input-param, --input-regex, or --command-template parameter, we build all combinations of their values (the Cartesian product). xvc exp run --input-param file.yaml dict.key 1,2,3 --input-param another.yaml another.key 5,6,7 will run 3 × 3 = 9 experiments.
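Building one experiment per combination of values is a Cartesian product, which `itertools.product` provides directly. The dictionary layout used to name each parameter here is an assumption for the example.

```python
from itertools import product


def build_experiments(params: dict) -> list:
    """Return one {parameter: value} dict per combination of values."""
    names = list(params)
    return [
        dict(zip(names, combo))
        for combo in product(*(params[name] for name in names))
    ]


experiments = build_experiments(
    {
        "file.yaml::dict.key": ["1", "2", "3"],
        "another.yaml::another.key": ["5", "6", "7"],
    }
)
```

With two parameters of three values each, `experiments` holds nine entries, matching the count in the example above.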
There may be three subcommands for xvc exp run.
xvc exp run pipeline --name: (xvcerp) Runs a pipeline command with the given parameters. (xvc pipeline run --name)
xvc exp run command 'cmd': Runs a generic command as an experiment.
xvc exp run template 'cmd ${{EXP_VALUE_1}} ${{EXP_VALUE_2}}' 1,2,3 4,5,6: Runs a command by substituting values into the command string.
--input-param and --input-regex options are available to all three of these. Maybe instead of --input-param, it's better to use --update-param and --update-regex. Maybe we can merge these, but I don't like to have corner cases.
--keyword will set the KEYWORD portion of experiment names. By default, this is exp. The user may want to set it to a searchable name.
The updated params and the commands that were run are stored in the .xvc-exp directory. It may contain the exact script that was run.
This discussion was converted from issue #184 on January 24, 2023 06:57.