radar-hdfs-restructure version 0.5.0

@blootsvoets blootsvoets released this 26 Jul 14:54
04583dc

Use a plugin architecture to specify:

  • path layout: binning (how many hours of data go into one file) and organisation (for example project/user/topic/time.csv, topic/project/user/time.csv or project/user/topic.csv).
  • file format: currently csv or json.
  • compression method: currently gzip or none.
  • storage driver: currently only local, but drivers for minio or s3 could be added.

This makes the module much more extensible for other needs or projects.
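
As an illustration of what a path layout plugin could look like, here is a minimal sketch. It is not the project's actual API: the `PathLayout` interface, the `HourlyBins` helper, both layout classes and the `yyyyMMdd_HH00` file name pattern are hypothetical names and choices made for this example. It shows how the bin size in hours and the directory organisation can be varied independently.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical plugin interface: maps record identifiers and a timestamp
// to a relative output path. Not taken from the project's source.
interface PathLayout {
    String relativePath(String project, String user, String topic,
                        Instant time, String extension);
}

// Hypothetical helper that rounds a timestamp down to the start of its bin.
final class HourlyBins {
    // Assumed file name pattern, e.g. 20180726_1400 (UTC).
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMdd_HH'00'").withZone(ZoneOffset.UTC);

    private HourlyBins() {}

    static String binStart(Instant time, int hoursPerBin) {
        long binSeconds = 3600L * hoursPerBin;
        // floorDiv keeps the rounding direction correct for times before 1970.
        long start = Math.floorDiv(time.getEpochSecond(), binSeconds) * binSeconds;
        return FORMAT.format(Instant.ofEpochSecond(start));
    }
}

// Organisation: project/user/topic/time.ext
final class ProjectUserTopicLayout implements PathLayout {
    private final int hoursPerBin;

    ProjectUserTopicLayout(int hoursPerBin) {
        this.hoursPerBin = hoursPerBin;
    }

    @Override
    public String relativePath(String project, String user, String topic,
                               Instant time, String extension) {
        String file = HourlyBins.binStart(time, hoursPerBin) + "." + extension;
        return String.join("/", project, user, topic, file);
    }
}

// Organisation: topic/project/user/time.ext
final class TopicProjectUserLayout implements PathLayout {
    private final int hoursPerBin;

    TopicProjectUserLayout(int hoursPerBin) {
        this.hoursPerBin = hoursPerBin;
    }

    @Override
    public String relativePath(String project, String user, String topic,
                               Instant time, String extension) {
        String file = HourlyBins.binStart(time, hoursPerBin) + "." + extension;
        return String.join("/", topic, project, user, file);
    }
}
```

With an interface of this kind, the rest of the application only depends on `PathLayout`, so switching organisation or bin size would be a matter of selecting a different implementation in the configuration.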

Other updates:

  • threaded task model
  • deduplication now preserves record order and no longer uses an additional temporary file
  • files are now moved atomically from the staging directory if possible
  • bins and offsets are written from a separate thread, using a single Accountant class
  • settings and factories are propagated through the application with the FileStoreFactory.
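
Two of the items above, order-preserving deduplication and the atomic move out of staging, can be sketched with standard JDK calls. This is a minimal sketch and not the project's implementation: the `StagingSketch` class and its method names are hypothetical, the deduplication is assumed to happen in memory and to keep the first occurrence of each line, and the fallback to a regular move is one possible reading of "if possible".

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class StagingSketch {
    private StagingSketch() {}

    /**
     * Removes duplicate lines in memory. LinkedHashSet keeps insertion order,
     * so the first occurrence of each line stays in its original position
     * (keeping the first occurrence is an assumption of this sketch).
     */
    static List<String> deduplicate(List<String> lines) {
        return new ArrayList<>(new LinkedHashSet<>(lines));
    }

    /**
     * Moves a finished file out of the staging directory. The move is atomic
     * when the file system supports it; otherwise it falls back to a regular
     * move that replaces an existing target.
     */
    static void moveFromStaging(Path staged, Path target) throws IOException {
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        try {
            Files.move(staged, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // For example when staging and target are on different file systems.
            Files.move(staged, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

An atomic move means that readers of the target directory see either no file or the complete file, never a partially written one, which is the usual reason for writing to a staging directory first.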