Commit 62b3c9d

michmx and vvolkl authored
HSF Conditions DB Project Proposal for GSoC 2025 (#1680)
* Intelligent Log Analysis for the HSF CondDB
* Adding a summary for the project_HSFCondDB.md
* Fixing order in mentors.md
* Adding links to the nopayload project
* Adding links to the nopayload project
* Recovering old BNL logo as it is used in pages from previous years
* Recovering description from past years for BNL.

---------

Co-authored-by: Valentin Volkl <[email protected]>
1 parent c4e9834 commit 62b3c9d

5 files changed: +100 -0 lines changed

_gsocorgs/2025/bnl.md

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
---
title: "Brookhaven National Laboratory"
author: "Michel Hernandez Villanueva"
layout: default
organization: BNL
logo: BNL-logo.png
description: |
  Brookhaven National Laboratory (BNL) is a multipurpose research laboratory located in Upton, New York.
  It is operated by Brookhaven Science Associates for the U.S. Department of Energy.
  It hosts the Relativistic Heavy Ion Collider, the future Electron-Ion Collider and the National Synchrotron Light Source II.
  BNL scientists are part of major HEP experiments, such as ATLAS, Belle II, and DUNE.
---

{% include gsoc_proposal.ext %}
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
---
project: HSFCondDB
title: HSF Conditions Database
layout: default
description: |
  In high-energy physics (HEP), conditions databases play a critical role in managing non-event data.
  This includes calibration constants, alignment parameters, and detector conditions, which evolve over time.
  These databases ensure that analysis software can access the correct calibration and alignment data corresponding to
  the detector’s state at any given time, enabling accurate physics measurements.

  The [HEP Software Foundation](https://hepsoftwarefoundation.org/) (HSF) proposes a Conditions Database reference
  for HEP and Nuclear Physics experiments around the world. Several experts have converged on a common design for
  conditions data access management [arXiv:1901.05429](https://arxiv.org/abs/1901.05429).
  The [nopayloaddb](https://github.com/BNLNPPS/nopayloaddb) is an implementation of this reference. It has been
  successfully operating within the [sPHENIX](https://www.sphenix.bnl.gov/) experiment for nearly two years
  and is currently being adopted by [Belle II](https://www.belle2.org/). Additionally, other collaborations, including
  [ePIC](https://www.bnl.gov/eic/epic.php) and the [Einstein Telescope](https://www.et-gw.eu/), have expressed interest
  in evaluating its suitability for their needs.
summary: |
  The [Nopayloaddb](https://github.com/BNLNPPS/nopayloaddb) is an implementation of the HSF
  Conditions Database reference, an experiment-agnostic design for conditions data access management.
---

{% include gsoc_project.ext %}
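
The description above notes that analysis software must retrieve the conditions that were valid for the detector's state at a given time. A minimal sketch of such an interval-of-validity lookup follows, assuming each payload is valid from its start time until the next payload begins. The `ConditionsTag` class, its methods, and the payload file names are hypothetical and chosen only for illustration; they are not the nopayloaddb API.

```python
import bisect


class ConditionsTag:
    """Hypothetical in-memory store mapping validity start times to payloads."""

    def __init__(self):
        self._starts = []    # sorted start times (e.g. run numbers or timestamps)
        self._payloads = []  # payload reference valid from the matching start time

    def insert(self, start, payload):
        # Keep both lists sorted by start time.
        idx = bisect.bisect_left(self._starts, start)
        if idx < len(self._starts) and self._starts[idx] == start:
            self._payloads[idx] = payload  # same start time: replace the payload
        else:
            self._starts.insert(idx, start)
            self._payloads.insert(idx, payload)

    def lookup(self, when):
        # Find the last payload whose start time is <= when.
        idx = bisect.bisect_right(self._starts, when) - 1
        if idx < 0:
            raise LookupError(f"no payload valid at {when}")
        return self._payloads[idx]


tag = ConditionsTag()
tag.insert(100, "alignment_v1.root")
tag.insert(250, "alignment_v2.root")
print(tag.lookup(180))  # alignment_v1.root
print(tag.lookup(250))  # alignment_v2.root
```

Keeping the start times sorted lets each lookup use binary search rather than a scan over all payloads.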
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
---
title: Intelligent Log Analysis for the HSF Conditions Database
layout: gsoc_proposal
project: HSFCondDB
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
  - BNL
---

## Description

The [nopayloaddb](https://github.com/BNLNPPS/nopayloaddb) project is an implementation of the HSF Conditions Database
reference. It provides a RESTful API for managing payloads, global tags, payload types, and associated data.

Our current system, composed of Nginx, Django, and a database ([Helm chart](https://github.com/BNLNPPS/nopayloaddb-charts)),
lacks a centralized logging solution, which makes it difficult to monitor and troubleshoot issues effectively.
This project will address this deficiency by implementing a centralized logging system that aggregates logs from multiple
components and by developing a machine learning model to perform intelligent log analysis. The model will identify unusual
log entries indicative of software bugs, database bottlenecks, or other performance issues, allowing us to address
problems before they escalate. Additionally, by analyzing system metrics, the model will provide insights for tuning
parameters during periods of increased request rates.
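
One common way to make logs from several components easy to aggregate is to emit them as one JSON object per line. The sketch below uses only Python's standard `logging` module. The `JsonFormatter` class, the `component` field, and the logger name are assumptions made for illustration and are not part of the existing nopayloaddb code.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that renders each record as a single JSON line."""

    def __init__(self, component):
        super().__init__()
        self.component = component  # e.g. "django"; label chosen for illustration

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "component": self.component,
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


# Write to an in-memory stream here; a deployment would more typically log to
# stdout so that a cluster-level collector can pick the lines up.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter(component="django"))

logger = logging.getLogger("conddb.example")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.warning("slow query: %d ms", 1250)
print(stream.getvalue().strip())
```

Giving every component the same field names means a downstream model can consume the records without per-source parsing.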

## Steps

1. Set up a centralized logging system
2. Collect and structure logs from Nginx, Django, and the database
3. Develop an ML model for log grouping and anomaly detection
4. Implement a Kubernetes-based database with replication
5. Train an ML model to optimize Kubernetes parameters dynamically
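
Steps 2 and 3 can be illustrated with a small, non-ML baseline: parse Nginx access-log lines into fields, reduce each request to a template by masking its variable parts, and flag templates that occur rarely. The regular expression assumes Nginx's default `combined` log format. The sample lines, the endpoint paths, the masking rules, and the rarity threshold are all invented for illustration; a learned model would take the place of the frequency count.

```python
import re
from collections import Counter

# Matches Nginx's default "combined" access-log format.
ACCESS_RE = re.compile(
    r'(?P<addr>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = ACCESS_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["bytes"] = int(record["bytes"])
    return record


def template(record):
    """Mask numbers and query strings so similar requests share one template."""
    path = record["path"].split("?", 1)[0]
    path = re.sub(r"\d+", "<N>", path)
    return f'{record["method"]} {path} {record["status"]}'


def rare_templates(records, max_fraction=0.05):
    """Return the set of templates seen in at most max_fraction of the records."""
    counts = Counter(template(r) for r in records)
    total = sum(counts.values())
    return {t for t, n in counts.items() if n / total <= max_fraction}


# Invented sample data: 30 ordinary requests and one server error.
lines = [
    f'10.0.0.1 - - [01/Jun/2025:12:00:{i:02d} +0000] '
    f'"GET /api/payloads/{i}?tag=demo HTTP/1.1" 200 512 "-" "curl/8.0"'
    for i in range(30)
]
lines.append(
    '10.0.0.2 - - [01/Jun/2025:12:01:00 +0000] '
    '"POST /api/payloads/7 HTTP/1.1" 500 87 "-" "curl/8.0"'
)

records = [r for r in map(parse_line, lines) if r is not None]
print(rare_templates(records))  # {'POST /api/payloads/<N> 500'}
```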


## Expected Results

* A centralized logging system for improved monitoring and troubleshooting
* ML-powered anomaly detection
* ML-driven dynamic configuration for optimal performance
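
As a point of comparison for the ML-driven results, a simple statistical baseline can already flag periods of increased request rate. The sketch below marks samples that exceed the mean of the preceding window by more than a chosen number of standard deviations. The function name, window size, threshold, and request-rate numbers are illustrative assumptions, and nothing here adjusts Kubernetes parameters.

```python
from statistics import mean, pstdev


def spike_indices(rates, window=5, threshold=3.0):
    """Return indices whose value is more than `threshold` standard deviations
    above the mean of the previous `window` samples."""
    spikes = []
    for i in range(window, len(rates)):
        history = rates[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any increase counts as a spike.
            if rates[i] > mu:
                spikes.append(i)
        elif (rates[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


# Invented requests-per-second samples with a burst at index 8.
rates = [100, 104, 98, 101, 103, 99, 102, 100, 450, 105]
print(spike_indices(rates))  # [8]
```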

## Requirements

* Python and a basic understanding of ML frameworks
* Kubernetes: basic understanding of k8s, Helm, Operators, and OpenShift
* Django and Nginx: basic understanding of web frameworks and logging
* Databases: PostgreSQL and database replication


## Mentors

- **Ruslan Mashinistov [[email protected]](mailto:[email protected]) BNL**
- John S. De Stefano Jr. [[email protected]](mailto:[email protected]) BNL
- Michel Hernandez Villanueva [[email protected]](mailto:[email protected]) BNL


## Links

* Django REST API: https://github.com/BNLNPPS/nopayloaddb
* Automated deployment with Helm chart: https://github.com/BNLNPPS/nopayloaddb-charts

gsoc/2025/mentors.md

Lines changed: 3 additions & 0 deletions
@@ -19,15 +19,18 @@ layout: plain
 * David Lange [[email protected]](mailto:[email protected]) CompRes
 * Serguei Linev [[email protected]](mailto:[email protected]) GSI
 * Johan Mabille [[email protected]](mailto:[email protected]) QuantStack
+* Ruslan Mashinistov [[email protected]](mailto:[email protected]) BNL
 * Peter McKeown [[email protected]](mailto:[email protected]) CERN
 * Felice Pantaleo [[email protected]](mailto:[email protected]) CERN
 * Giacomo Parolini [[email protected]](mailto:[email protected]) CERN
 * Alexander Penev [[email protected]](mailto:[email protected]) CompRes/University of Plovdiv, BG
 * Mayank Sharma [[email protected]](mailto:[email protected]) UMich
 * Simon Spannagel [[email protected]](mailto:[email protected]) DESY
+* John De Stefano [[email protected]](mailto:[email protected]) BNL
 * Graeme Stewart [[email protected]](mailto:[email protected]) CERN
 * Maciej Szymański [[email protected]](mailto:[email protected]) ANL
 * Peter Van Gemmeren [[email protected]](mailto:[email protected]) ANL
 * Martin Vasilev [[email protected]](mailto:[email protected]) University of Plovdiv, BG
 * Vassil Vassilev [[email protected]](mailto:[email protected]) CompRes
+* Michel Hernandez Villanueva [[email protected]](mailto:[email protected]) BNL
 * Valentin Volkl [[email protected]](mailto:[email protected]) CERN

images/BNL-logo.png

-3.77 KB