|
| 1 | +--- |
| 2 | +title: Using ROOT in the field of genome sequencing |
| 3 | +layout: gsoc_proposal |
| 4 | +project: ROOT |
| 5 | +year: 2025 |
| 6 | +difficulty: medium |
| 7 | +duration: 350 |
| 8 | +mentor_avail: June-November |
| 9 | +organization: |
| 10 | + - CERN |
| 11 | + - CompRes |
| 12 | +--- |
| 13 | + |
| 14 | +## Description |
| 15 | + |
| 16 | +[ROOT](https://root.cern/) is a framework for data processing, born at CERN, |
| 17 | +at the heart of the research on high-energy physics. Every day, thousands of |
| 18 | +physicists use ROOT applications to analyze their data or to perform |
| 19 | +simulations. The ROOT software framework is foundational for the HEP ecosystem, |
| 20 | +providing capabilities such as I/O, a C++ interpreter, a GUI, and math |
| 21 | +libraries. It uses object-oriented concepts and build-time modules to layer |
| 22 | +between components. We believe additional layering formalisms will benefit ROOT |
| 23 | +and its users. |
| 24 | + |
| 25 | +ROOT has scientific uses beyond the field of high-energy physics. Several |
| 26 | +studies have shown promising applications of the ROOT I/O system in the field |
| 27 | +of genome sequencing. This project aims to extend the capabilities developed |
| 28 | +in [GeneROOT](https://github.com/GeneROOT) and to better understand the |
| 29 | +requirements of the field. |
| 30 | + |
| 31 | + |
| 32 | +## Expected results |
| 33 | +* Reproduce the results of previous comparisons using the current ROOT master |
| 34 | +* Investigate the latest compression strategies used by [Samtools](https://www.htslib.org/) for conversions to BAM and compare them with RAM (ROOT Alignment Maps) |
| 35 | +* Explore ROOT's [RNTuple](https://root.cern/doc/v622/md_tree_ntuple_v7_doc_README.html) format as a more efficient way to store RAM data, in place of the previously used `TTree` |
| 36 | +* Investigate different ROOT file splitting techniques |
| 37 | +* Produce a comparison report |
| 38 | + |
| 39 | + |
| 40 | +## Requirements |
| 41 | +* C++ and Python programming |
| 42 | +* Familiarity with Git |
| 43 | +* Knowledge of ROOT and/or the BAM file format is a plus |
| 44 | + |
| 45 | + |
| 46 | +## Mentors |
| 47 | +* [Martin Vasilev](mailto:[email protected]) |
| 48 | +* [Jonas Rembser](mailto:[email protected]) |
| 49 | +* [Fons Rademakers](mailto:[email protected]) |
| 50 | + |
| 51 | + |
| 52 | +## Links |
| 53 | +* [Latest Presentation on GeneROOT](https://indico.cern.ch/event/655464/) |
| 54 | +* [ROOT](https://root.cern/) |
| 55 | +* [GeneROOT](https://github.com/GeneROOT) |