Design
- long-running jobs: report progress to the user, resume where the job left off after an interruption (checkpoint/restart), and provide a common method to halt a job
- invoke standard Linux tools where possible, e.g., grep
- parallel techniques: master/worker, distributed queue, distributed task graph
- define common file formats for input / output between tools
- POSIX I/O wrappers to retry on non-fatal errors (e.g., EINTR)
- component to manipulate paths (e.g., basename, dirname, transform /a/b/../c// into /a/c)
- abstraction for file metadata (stat data) to access fields and transfer them between processes
- API to read / write file metadata structures to files
- API to filter and sort file metadata structures
- parallel directory walk
- parallel pipe from one tool to another
- list
- find
- copy
- rsync
- remove
- tar/zip
- grep
- compare
- Lustre
- Panasas
- GPFS
- NFS
- PLFS
- SCR
- ADIOS
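The path-manipulation item in the design list (transform /a/b/../c// into /a/c) can be sketched as a purely lexical routine. This is only an illustration under stated assumptions: `path_normalize` is a hypothetical name, not an existing libbayer function, and the routine never consults the file system, so symbolic links are not resolved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* path_normalize is a hypothetical helper name, not an existing libbayer call.
 * It collapses repeated '/', drops "." components, and resolves ".." against
 * the preceding component. The work is lexical only: symlinks are ignored.
 * Returns a malloc'd string the caller must free, or NULL if malloc fails. */
char *path_normalize(const char *path)
{
    assert(path != NULL);

    size_t len = strlen(path);
    int absolute = (len > 0 && path[0] == '/');

    /* The result is never longer than the input, except that an empty
     * relative result becomes "."; len + 2 covers both cases plus the NUL. */
    char *out = malloc(len + 2);
    if (out == NULL) {
        return NULL;
    }

    size_t n = 0;
    if (absolute) {
        out[n++] = '/';
    }
    size_t base = n; /* never trim the output below the leading '/' */

    const char *p = path;
    while (*p != '\0') {
        while (*p == '/') p++;               /* skip runs of slashes */
        const char *comp = p;
        while (*p != '\0' && *p != '/') p++;
        size_t clen = (size_t)(p - comp);

        if (clen == 0 || (clen == 1 && comp[0] == '.')) {
            continue;                        /* end of input, or "." */
        }

        if (clen == 2 && comp[0] == '.' && comp[1] == '.') {
            /* locate the start of the last component already emitted */
            size_t s = n;
            while (s > base && out[s - 1] != '/') s--;
            int last_is_dotdot =
                (n - s == 2 && out[s] == '.' && out[s + 1] == '.');

            if (n > base && !last_is_dotdot) {
                n = (s > base) ? s - 1 : s;  /* pop component and its '/' */
                continue;
            }
            if (absolute) {
                continue;                    /* "/.." stays at the root */
            }
            /* relative path with nothing to pop: keep the ".." */
        }

        if (n > base) out[n++] = '/';
        memcpy(out + n, comp, clen);
        n += clen;
    }

    if (n == 0) out[n++] = '.';              /* empty relative path */
    out[n] = '\0';
    return out;
}
```

Keeping the routine lexical means it can run on any process without touching the file system, which matters when many ranks normalize paths at once. A tool that needs symlink-aware resolution would have to call something like realpath() instead.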
While reading through the tar code today to see how it handles xattrs, I came across an answer to the sub-second timestamp question. tar uses functions like get_stat_atime(), defined in stat-time.h, to fetch a timestamp from a stat structure: http://www.gnu.org/software/gnulib/coverage/gllib/stat-time.h.gcov.frameset.html It then uses utimensat() to set the timestamps.
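A minimal sketch of the same idea without gnulib, assuming a Linux/POSIX.1-2008 system where struct stat exposes the st_atim and st_mtim fields directly (gnulib's stat-time.h exists to hide platform differences in those field names). The helper name `copy_timestamps` is hypothetical and is not taken from tar or from our tools.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>     /* AT_FDCWD */
#include <sys/stat.h>  /* stat, utimensat */
#include <time.h>      /* struct timespec */

/* copy_timestamps is a hypothetical helper name.
 * Copies the access and modification times of src onto dst with
 * nanosecond resolution. Returns 0 on success, -1 on error (errno set). */
int copy_timestamps(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    struct stat st;
    if (stat(src, &st) != 0) {
        return -1;
    }

    /* times[0] is the access time, times[1] the modification time */
    struct timespec times[2];
    times[0] = st.st_atim;
    times[1] = st.st_mtim;

    /* AT_FDCWD resolves a relative dst against the current directory;
     * flags = 0 follows symlinks, matching the stat() call above */
    return utimensat(AT_FDCWD, dst, times, 0);
}
```

Whether the full nanosecond value survives depends on the destination file system, so a copy tool should not assume the timestamps read back will match exactly on every target.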
The github.com/hpc URL is a GitHub "organization", which is a grouping of related projects, one of which is bayer. We created the bayer project just for this collaboration. Most of the projects under the hpc organization are open, so that anyone can access them, but we created bayer to be private until we release it. The github.com/hpc/bayer URL is the main page for the bayer project.
The dcp code lives outside of bayer as its own github/hpc project, because dcp existed before we started the bayer effort. That's the same story with libcircle, dtcmp, and lwgrp -- all of those are components that we're using within bayer but they existed before we started our collaboration.
The libbayer library is where we can share common code between tools, e.g., "reliable" POSIX IO calls, memory allocation routines, certain canned uses of libcircle, and the like. Whenever there is a routine that more than one tool can use, let's keep that routine in libbayer.
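As an illustration of what a "reliable" POSIX IO call could look like, here is a minimal sketch of a read wrapper that retries on EINTR and keeps reading after short reads. The name `reliable_read` is hypothetical and this is not the actual libbayer routine. Treating only EINTR as retryable is an assumption, and a real wrapper might also retry other transient errors or cap the number of attempts.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* reliable_read is a hypothetical name, not the libbayer API.
 * Reads up to count bytes from fd, retrying when a signal interrupts
 * the call and continuing after short reads. Returns the number of
 * bytes read (less than count only at end of file), or -1 on a
 * non-retryable error with errno set by read(). */
ssize_t reliable_read(int fd, void *buf, size_t count)
{
    assert(buf != NULL || count == 0);

    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);
        if (n > 0) {
            total += (size_t)n;   /* short read: keep going */
        } else if (n == 0) {
            break;                /* end of file */
        } else if (errno == EINTR) {
            continue;             /* interrupted by a signal: retry */
        } else {
            return -1;            /* fatal error */
        }
    }

    return (ssize_t)total;
}
```

Putting the retry loop in one shared routine means each tool gets the same behavior on interrupted or partial reads instead of reimplementing the loop.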
So the whole picture looks something like this:
lwgrp: light-weight group library
- implements collectives using light-weight representations of MPI communicators
dtcmp: datatype comparison library
- implements parallel sort algorithms
- uses: lwgrp
libcircle: load balancing library
dcp: original parallel copy tool
- uses: libcircle
- now includes a "bayer" branch that uses libbayer
bayer: parallel file system tools
- libbayer: common library available to all tools
- uses: libcircle, dtcmp, lwgrp
- tools (so far):
  - dwalk - parallel list
  - drm - parallel remove
  - dtar - parallel tar
- buildme scripts: commands to build libbayer and the tools (including dcp)