add info on preventing data loss #112

ebryerwork · 2025-02-20T18:08:04Z

ebryerwork
Feb 20, 2025
Collaborator

What should be added to the docs to make it clearer to users what they should make secondary backups of, and how and where to make them?

I made a pull request to add something, per Wolf's suggestion in Slack, but I'm not sure what I wrote is the best way to put things. Here's the Slack discussion of it. At the end, Carrie suggested opening this discussion.

Erik Bryer
11:44 AM
The documentation mentions /storage/ has snapshot backups. I feel there should be a mention in the documentation that users should have a plan for making their own separate backups off-ICDS of critical files, such as those needed to replicate their research (e.g., certain configuration files).

Wolf
11:49 AM
@Erik Bryer excellent suggestion. Please would you be so kind and add some language to that effect when you have a moment?

Chad Bahrmann
11:57 AM
https://security.psu.edu/education-training/data-backup/
11:57
or potential other backup strategies for any given unit

Erik Bryer
12:42 PM
That's awesome, @chad. That's exactly the kind of feedback I was looking for. How does one encourage users to apply such principles without unduly arousing fear? I don't feel 100% equipped at present to draft language on it, but I thought the principle was important to raise. Even so, I did what Wolf suggested and made a pull request with a first pass. I raised the issue because I think users can hack on things like code and configuration files for months, and then the compute runs take days. So I figure it's best to have the little bit of data (representing a great deal of $) that would be needed to replicate the output stored in more than one location.

Carrie Brown
1:01 PM
When I was at Nebraska I wrote this page: https://hcc.unl.edu/docs/handling_data/data_storage/preventing_file_loss/
I would fully support similar content being placed in our documentation.
1:01
If you don't feel comfortable writing it yourself or submitting a pull request, definitely capture this in discussion or as an issue so we can add comments and compile additional information like Chad added. This will make it much easier for us to develop this document when we move forward with doing so.

nucci6 · 2025-02-20T21:07:34Z

nucci6
Feb 20, 2025
Collaborator

Another possible option here is an OSN allocation via ACCESS-CI. I have a small OSN allocation that I have been using to test off-site backups for some of my project work. I was exclusively using rclone for the UI, but I find rclone a bit clunky. Doug Dobson set up a Globus connector for me, so now I use the Globus UI and tools to access it. OSN via ACCESS-CI does not support Globus guest collections, so I don't know of an easy way to do any automation (yet) with Globus, but it can be done with rclone sync.
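For anyone who wants to try the rclone route, here is a minimal sketch of what the automation could look like from Python. It only wraps the `rclone sync` command with its `--dry-run` flag. The function names are made up for illustration, and the source path and remote name in the usage note are hypothetical placeholders, not real ICDS or OSN values. It assumes rclone is installed and a remote has already been configured.

```python
import shlex
import subprocess


def build_rclone_sync_command(source, destination, dry_run=True):
    """Return the argument list for an `rclone sync` call.

    `source` and `destination` are passed through unchanged, so the
    destination is expected to be an already-configured rclone remote
    in the usual `remote:path` form.
    """
    command = ["rclone", "sync", source, destination]
    if dry_run:
        # --dry-run makes rclone report what it would do without
        # copying or deleting anything.
        command.append("--dry-run")
    return command


def run_backup(source, destination, dry_run=True):
    """Run the sync; raises CalledProcessError if rclone exits non-zero."""
    command = build_rclone_sync_command(source, destination, dry_run)
    print("Running:", shlex.join(command))
    subprocess.run(command, check=True)
```

A call might look like `run_backup("/path/to/project-configs", "my-osn-remote:my-bucket/project-configs")`, with both arguments replaced by real values. Since `rclone sync` makes the destination match the source, including deleting files at the destination that are no longer in the source, the sketch defaults to a dry run. Passing `dry_run=False` performs the actual transfer, and the script could then be run from cron or a similar scheduler.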
