Commit d7b6bd5

Create proposal_ATLAS_lossy_compression.md
1 parent 1c0a0db commit d7b6bd5

1 file changed

+42
-0
lines changed

@@ -0,0 +1,42 @@
---
title: Precision Recovery in Lossy-Compressed Floating Point Data for High Energy Physics
layout: gsoc_proposal
project: ATLAS
year: 2025
organization:
- ANL
- CERN
difficulty: medium
duration: 350
mentor_avail: July-September
---

## Description

[ATLAS](http://atlas.cern) is one of the particle physics experiments at the [Large Hadron Collider](http://home.web.cern.ch/topics/large-hadron-collider) (LHC) at [CERN](http://home.cern/). The planned upgrade of the LHC (the so-called High Luminosity phase) will allow even more detailed exploration of the fundamental particles and forces of nature, and the recorded data rate is expected to be up to ten times greater than today. One way to address this storage challenge is data compression. The traditional approach relies on lossless compression algorithms such as zstd and zlib. To reduce the storage footprint further, lossy compression methods are being investigated. One solution used in High Energy Physics is to reduce floating-point precision, since the stored precision may be higher than the detector resolution. However, when reading the data back, physicists may be interested in restoring the precision of the floating-point numbers. This is impossible in the strict sense, because removing bits is irreversible. Nevertheless, given that the data volume is high and that some variables are correlated and follow specific distributions, one may consider a machine learning approach to estimate the precision lost in lossy-compressed floating-point data.
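
To make the idea of reduced floating-point precision concrete, here is a minimal sketch that zeroes the low mantissa bits of an IEEE 754 single-precision value and checks how that affects zlib output size. It is an illustration only: the helper names `truncate_mantissa` and `compressed_size`, the choice of 7 kept mantissa bits, and the Gaussian toy sample are all assumptions made for this example, and the code is not the Athena `FloatCompressor`, which may treat rounding and special values differently.

```python
import random
import struct
import zlib


def truncate_mantissa(x: float, keep_bits: int) -> float:
    """Return x as a float32 with only the top `keep_bits` mantissa bits kept.

    The dropped bits are set to zero (truncation toward zero, no rounding).
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23 for float32")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    drop = 23 - keep_bits
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    (y,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return y


def compressed_size(values: list[float]) -> int:
    """Size in bytes of the values packed as float32 and compressed with zlib."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(zlib.compress(raw, 6))


rng = random.Random(0)  # fixed seed so the toy sample is reproducible
sample = [rng.gauss(50.0, 10.0) for _ in range(10_000)]
truncated = [truncate_mantissa(v, 7) for v in sample]

print(truncate_mantissa(3.14159, 7))  # 3.140625
print(compressed_size(sample), compressed_size(truncated))
```

The zeroed trailing bits make the byte stream more regular, which is why a lossless compressor applied afterwards tends to produce smaller output for the truncated sample than for the original one.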

## Task ideas

* Perform lossy compression of a data sample from the ATLAS experiment
* Investigate ML techniques for data recovery, prediction and upscaling
* Integrate the chosen technique into a HEP workflow
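
The recovery step in the tasks above could be prototyped along the following lines. This is a toy sketch under strong assumptions, not a proposed solution: an ordinary least-squares line stands in for the ML model, the correlated variable `x` is assumed to be stored at full precision, the data are synthetic and positive, and all function names are hypothetical. It relies on one fact about truncation toward zero: a positive original value must lie in the interval starting at its truncated value and spanning one step of the reduced-precision grid, so any prediction can be clamped to that interval.

```python
import math
import random
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def truncate_mantissa(x: float, keep_bits: int) -> float:
    """Zero all but the top `keep_bits` mantissa bits of x stored as float32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    mask = (0xFFFFFFFF << (23 - keep_bits)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def recover(y_trunc: float, x: float, model, keep_bits: int) -> float:
    """Predict y from x, then clamp to the interval the truncated value allows.

    Valid for positive y_trunc only: truncation toward zero means the original
    value lay in [y_trunc, y_trunc + step).
    """
    slope, intercept = model
    _, exponent = math.frexp(y_trunc)  # y_trunc = m * 2**exponent, 0.5 <= m < 1
    step = math.ldexp(1.0, exponent - 1 - keep_bits)
    return min(max(slope * x + intercept, y_trunc), y_trunc + step)


def mean_abs_error(estimates, truth):
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


KEEP_BITS = 7  # assumed number of mantissa bits that survive compression
rng = random.Random(1)


def make_pairs(n):
    """Synthetic correlated pair: y is roughly 2 * x with small noise."""
    xs = [rng.uniform(1.0, 2.0) for _ in range(n)]
    ys = [to_float32(2.0 * x + rng.gauss(0.0, 1e-4)) for x in xs]
    return xs, ys


train_x, train_y = make_pairs(2_000)  # full-precision reference sample
test_x, test_y = make_pairs(2_000)    # values we pretend were stored lossily
test_trunc = [truncate_mantissa(y, KEEP_BITS) for y in test_y]

model = fit_line(train_x, train_y)
restored = [recover(t, x, model, KEEP_BITS) for t, x in zip(test_trunc, test_x)]

print("truncated:", mean_abs_error(test_trunc, test_y))
print("restored: ", mean_abs_error(restored, test_y))
```

The clamp guarantees that a restored value is never inconsistent with the stored bits, whatever model produces the prediction; how much precision is actually regained depends entirely on how strongly the variables are correlated in real data.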

## Expected results

* Implementation of an ML-based procedure to restore the precision of lossy-compressed floating-point numbers in ATLAS data
* Evaluation of the method's performance (decompression accuracy) and its applicability in HEP workflows
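
The proposal does not define how decompression accuracy is to be measured. As one possible starting point, the sketch below compares original and restored values with a few common error measures. The function name `accuracy_report`, the dictionary keys, and the choice of metrics are assumptions for illustration, not an agreed evaluation protocol.

```python
import math


def accuracy_report(original, restored):
    """Compare restored values with the originals using simple error measures."""
    if len(original) != len(restored) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_err = [abs(r - o) for o, r in zip(original, restored)]
    # Relative error is undefined for an original value of exactly zero.
    rel_err = [e / abs(o) for e, o in zip(abs_err, original) if o != 0.0]
    max_rel = max(rel_err, default=0.0)
    return {
        "max_abs_error": max(abs_err),
        "rmse": math.sqrt(sum(e * e for e in abs_err) / len(abs_err)),
        "max_rel_error": max_rel,
        # -log2 of the worst relative error: roughly how many leading
        # binary digits are still correct in the worst case.
        "worst_case_bits": math.inf if max_rel == 0.0 else -math.log2(max_rel),
    }


report = accuracy_report([1.0, 2.0, 4.0], [1.0, 2.0, 3.5])
print(report["max_abs_error"])    # 0.5
print(report["max_rel_error"])    # 0.125
print(report["worst_case_bits"])  # 3.0
```

Running the same report on truncated values and on restored values gives a direct before-and-after comparison for a given number of kept mantissa bits.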

## Requirements

* C++, Python, Machine Learning

## Mentors

* **[Maciej Szymański](mailto:[email protected])**
* [Peter Van Gemmeren](mailto:[email protected])

## Links

* [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)
* [Implementation of FloatCompressor in Athena](https://gitlab.cern.ch/atlas/athena/-/blob/main/Control/CxxUtils/Root/FloatCompressor.cxx)
