Open
Description
To make the code easier to use, the following documentation would be very helpful:
- An up-to-date and detailed description of the current cta-ml format (all tables, their purposes, the datatypes of each column, etc.). This is the most critical.
- The version of ctapipe it is written for (as this is the key dependency and the most likely interface to change). This should be done after #15 ("Update to use refactored ctapipe event routines").
- Instructions for extending/modifying the code to handle new data formats (this will depend strongly on whether/when the rewrite in #25, "Refactoring to separate event/data loop from dumping to file", occurs).
- Guidance on performance and mass processing. Since this code is likely to be used to process large amounts of data, it would help to explain how best to run it at scale (multiple jobs or multithreading). Performance benchmarks, if available, would also give an idea of how long processing takes per event.
- Benchmarks on the effects of compression/chunking/indexing on write/read/search speed.
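To make the last item more concrete, here is a rough sketch of the kind of measurement I have in mind. It is not tied to the actual cta-ml files: `benchmark_compression` is a made-up helper, the payload is synthetic, and stdlib `zlib` only stands in for whatever compression the output files end up using. Chunking and indexing would need a similar loop against the real file format.

```python
import time
import zlib


def benchmark_compression(payload, levels=(0, 1, 6, 9), repeats=3):
    """Time zlib compress/decompress of `payload` at several levels.

    Hypothetical helper, for illustration only. Returns one dict per
    level with the compressed size in bytes and the best (minimum)
    write and read times in seconds over `repeats` runs.
    """
    results = []
    for level in levels:
        write_times, read_times = [], []
        for _ in range(repeats):
            start = time.perf_counter()
            compressed = zlib.compress(payload, level)
            write_times.append(time.perf_counter() - start)

            start = time.perf_counter()
            restored = zlib.decompress(compressed)
            read_times.append(time.perf_counter() - start)

            # The round trip must be lossless, otherwise timings are moot.
            assert restored == payload
        results.append({
            "level": level,
            "size": len(compressed),
            "write_s": min(write_times),
            "read_s": min(read_times),
        })
    return results


# Synthetic stand-in for event data (assumption: real images will
# compress very differently, so only the method carries over).
payload = bytes(range(256)) * 4096  # 1 MiB

for row in benchmark_compression(payload):
    print(
        f"level={row['level']} size={row['size']} B "
        f"write={row['write_s']:.4f} s read={row['read_s']:.4f} s"
    )
```

Reporting the minimum over a few repeats keeps the numbers less sensitive to other load on the machine. The same table (setting, file size, write time, read time) would be what I'd hope to see in the docs for the real format.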
@nietootein @aribrill Anything else you can think of?