Set up Globus account for tape backup of LOKI logs  #1648

@msdisme

Description

@computate @schwesig here is the documentation for using NESE Tape: https://nese.readthedocs.io/en/latest/user-docs.html#nese-tape. There is also Globus documentation at BU: https://www.bu.edu/tech/support/research/computing-resources/globus/

I took a pass at summarizing, though far more info (and links to Globus) is in the docs above:

  1. Access is via Globus only. There is no SSH, NFS, S3, or rsync access. You must use the Globus web UI, Globus CLI, or Globus SDK, and you must know the Globus Collection name assigned to your NESE Tape allocation.

  2. Data is written to a disk-based staging area first, not directly to tape. Files are uploaded to staging via Globus and then migrated to tape automatically based on lifecycle policies.

  3. The staging area has a limited quota (typically ~10 TB or ~2% of tape capacity, larger for bigger allocations). Backup workflows must account for staging space and avoid overruns during large ingest bursts.

  4. Recommended file sizes are roughly between 1 GiB and 1 TiB. Many small files are strongly discouraged for tape efficiency reasons.

  5. There is a file-count quota based on an average of ~100 MB per file across the total allocation. Generating large numbers of small files will hit this limit quickly.

  6. Lifecycle behavior is automatic and policy-driven. Newly written files migrate to tape after a short delay, older or less-accessed files may be stubbed, and accessing stubbed files triggers a tape recall with non-trivial latency.

  7. Files smaller than ~100 MB tend to remain on disk as well as being copied to tape, which can distort expectations if you assume everything leaves staging quickly.

  8. Deleting files via Globus does not immediately reclaim tape space. Tape reclamation is a manual administrative process and must be coordinated with NESE.

  9. Data is encrypted in transit via Globus, but it is not encrypted at rest on tape. If encryption at rest is required, files must be encrypted before upload.

  10. Automation can be built using the Globus CLI or SDK. Backup designs should assume non-instant restore times and explicitly handle tape recall behavior during recovery workflows.
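
To make points 3 to 5 concrete, here is a rough Python sketch of how a backup job might bundle many small log files into archive-sized groups and sanity-check them against the staging and file-count limits. It is only an illustration: the function and class names (`plan_bundles`, `Plan`) are made up for this example, the 1 GiB target and the ~100 MB-per-file average are taken from the summary above, and the allocation and staging sizes are placeholders that would need to come from the actual NESE Tape allocation.

```python
from dataclasses import dataclass

GIB = 1024**3   # recommended minimum file size on tape (point 4)
MB = 1000**2    # unit used for the file-count quota estimate (point 5)
TB = 1000**4


@dataclass
class Plan:
    bundles: list           # each bundle is a list of (name, size_bytes) pairs
    max_files_allowed: int  # estimated file-count quota for the allocation
    fits_in_staging: bool   # True if the whole batch fits in the staging quota


def plan_bundles(files, allocation_bytes, staging_bytes, target_bundle_bytes=GIB):
    """Group (name, size_bytes) pairs into bundles of at least target_bundle_bytes.

    Hypothetical helper: the quota formula below is an assumption based on the
    "~100 MB per file on average" rule of thumb, not an official NESE formula.
    """
    files = list(files)
    bundles, current, current_size = [], [], 0
    for name, size in files:
        current.append((name, size))
        current_size += size
        if current_size >= target_bundle_bytes:
            bundles.append(current)
            current, current_size = [], 0
    if current:
        # The final bundle may end up smaller than the target size.
        bundles.append(current)

    max_files = allocation_bytes // (100 * MB)
    total_bytes = sum(size for _, size in files)
    return Plan(bundles, max_files, total_bytes <= staging_bytes)


if __name__ == "__main__":
    # Five 400 MiB log chunks, a placeholder 100 TB allocation, 10 TB staging.
    logs = [(f"loki-chunk-{i}.log", 400 * 1024**2) for i in range(5)]
    plan = plan_bundles(logs, allocation_bytes=100 * TB, staging_bytes=10 * TB)
    for i, bundle in enumerate(plan.bundles):
        size_gib = sum(size for _, size in bundle) / GIB
        print(f"bundle {i}: {len(bundle)} files, {size_gib:.2f} GiB")
    print("file-count quota (estimate):", plan.max_files_allowed)
    print("fits in staging:", plan.fits_in_staging)
```

Each bundle would then be written out as a single archive (for example a tar file, encrypted first if encryption at rest is needed per point 9) before being handed to Globus, so that the number of objects landing on tape stays well under the file-count quota.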

I'll set up a meeting with someone from IT for the week of the 27th or the first week of February so we can discuss how best to move forward before I set up the Globus account.

I will do the setup and add @aabaris, @computate, @schwesig, @naved001, and @joachimweyl as admins/owners.

@computate, @schwesig I am going to leave the details of the actual "back up LOKI to tape drive" story to you, if that is OK.
