Using tags to mark results for retention #115

@loj

Origin: DataLad matrix chat; Nov 2, 2023

OP asks for suggestions on how to handle marking results for retention with tags:

How do you generally go about marking results for retention in DataLad? My analysis produces results that are larger than the input data, so I would like to find a compromise: keep intermediate results for some versions, but not for all of them. For a single dataset, this can be solved by using git tag to keep the annexed content of a certain commit from being listed by git annex unused.

However, how do you do this for subdatasets? My analysis has another analysis as a submodule and relies on its data as input to the computations. So for each tag on the superdataset, I would need to create a tag in the subdataset and push these tags to their respective remotes on the archive disk.
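The per-tag step described here (find the subdataset commit that the superdataset records at a given tag, tag that commit in the subdataset, push the tag) could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing DataLad feature: the helper names `git`, `pinned_subdataset_commit`, and `tag_commands` are hypothetical, the subdataset is assumed to be registered as an ordinary git submodule, and the remote name is whatever the archive remote happens to be called.

```python
import subprocess


def git(repo, *args):
    """Run a git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def pinned_subdataset_commit(super_path, tag, sub_relpath):
    """Return the subdataset commit recorded in the superdataset at `tag`.

    `<rev>:<path>` resolves a tree entry; for a submodule path that entry
    is the gitlink, i.e. the commit the superdataset pins.
    """
    return git(super_path, "rev-parse", f"{tag}:{sub_relpath}")


def tag_commands(sub_path, tag_name, commit, remote):
    """Build (without running) the commands that tag `commit` and push the tag.

    Hypothetical helper: returning argv lists lets the caller inspect the
    plan before executing it with subprocess.run.
    """
    base = ["git", "-C", str(sub_path)]
    return [
        base + ["tag", tag_name, commit],
        base + ["push", remote, f"refs/tags/{tag_name}"],
    ]
```

Building the commands separately from running them is a deliberate choice in this sketch: a dry run can be printed and reviewed before any tag is created or pushed.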

So far this can be solved as follows: if I create a tag project-meeting21 in the superdataset, I can automatically create a tag in the subdataset called needed/(datalad-id-superdataset)/project-meeting21. Now I also want to delete needed/ tags in the subdataset when the corresponding tag in the superdataset is gone. This can lead to problems if I have the datasets in multiple places and delete tags in only one of them. How do I decide when to delete the needed/ tags, and how do I make sure that a deleted tag is not added back from another instance of the dataset?
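The naming scheme and the create/delete bookkeeping described above might look like the sketch below for a single clone. The function names `needed_tag` and `plan_sync` and the example ID are hypothetical; the superdataset ID is assumed to be the value of `datalad.dataset.id` in `.datalad/config`. The sketch only compares tag lists within one clone, so it does not address the open question of tags reappearing from other instances of the dataset.

```python
NEEDED_PREFIX = "needed"


def needed_tag(super_id, tag):
    """Name of the subdataset tag that mirrors superdataset tag `tag`."""
    return f"{NEEDED_PREFIX}/{super_id}/{tag}"


def plan_sync(super_id, super_tags, sub_tags):
    """Return (to_create, to_delete) lists of subdataset tag names.

    Only tags under needed/<super_id>/ are considered for deletion, so
    tags mirrored from other superdatasets and ordinary tags are left alone.
    """
    wanted = {needed_tag(super_id, t) for t in super_tags}
    own_prefix = f"{NEEDED_PREFIX}/{super_id}/"
    existing = {t for t in sub_tags if t.startswith(own_prefix)}
    return sorted(wanted - existing), sorted(existing - wanted)


# Example with a made-up superdataset ID:
to_create, to_delete = plan_sync(
    "1234",
    ["project-meeting21", "v2"],
    ["needed/1234/project-meeting21", "needed/1234/old", "needed/9999/other"],
)
```

In this example `to_create` contains the mirror tag for `v2` and `to_delete` contains `needed/1234/old`, while the tag belonging to superdataset `9999` is untouched.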

Is there a partial or complete solution to this yet, or should I come up with one on my own?

TODO (not necessarily to be performed in this order)

  • Inform OP/Add reference to this issue at origin
  • Clarifying Qs asked or not needed
  • Nature of the issue is understood
  • Inform OP about resolution

Labels

  • support-tracker: Track a support event that occurred elsewhere
  • via-datalad-channel: report origin is a datalad-specific channel (chat/email/office hour)
