HSF Conditions DB Project Proposal for GSoC 2025 #1680
Merged
9 commits
- 20e2ffa Intelligent Log Analysis for the HSF CondDB (michmx)
- 487ec7e Merge branch 'main' into gsoc-hsfconddb-2025 (vvolkl)
- 28676d2 Adding a summary for the project_HSFCondDB.md (michmx)
- c222e35 Merge branch 'gsoc-hsfconddb-2025' of github.com:HSF/hsf.github.io in… (michmx)
- e0e2e43 Fixing order in mentors.md (michmx)
- 1e15b14 Adding links to the nopayload project (michmx)
- 64254a9 Adding links to the nopayload project (michmx)
- cc892bf Recovering old BNL logo as it is used in pages from previous years (michmx)
- f8ed760 Recovering description from past years for BNL. (michmx)
@@ -0,0 +1,14 @@
---
title: "Brookhaven National Laboratory"
author: "Michel Hernandez Villanueva"
layout: default
organization: BNL
logo: BNL-logo.png
description: |
  Brookhaven National Laboratory (BNL) is a multipurpose research laboratory located in Upton, New York.
  It is operated by Brookhaven Science Associates for the U.S. Department of Energy.
  It hosts the Relativistic Heavy Ion Collider, the future Electron-Ion Collider and the National Synchrotron Light Source II.
  BNL scientists are part of major HEP experiments, such as ATLAS, Belle II, and DUNE.
---

{% include gsoc_proposal.ext %}
@@ -0,0 +1,24 @@
---
project: HSFCondDB
title: HSF Conditions Database
layout: default
description: |
  In high-energy physics (HEP), conditions databases play a critical role in managing non-event data.
  This includes calibration constants, alignment parameters, and detector conditions, which evolve over time.
  These databases ensure that analysis software can access the correct calibration and alignment data corresponding to
  the detector’s state at any given time, enabling accurate physics measurements.
  The [HEP Software Foundation](https://hepsoftwarefoundation.org/) (HSF) proposes a Conditions Database reference
  for HEP and Nuclear Physics experiments around the world. Several experts have converged on a common design for
  conditions data access management [arXiv:1901.05429](https://arxiv.org/abs/1901.05429).
  The [nopayloaddb](https://github.com/BNLNPPS/nopayloaddb) is an implementation of this reference. It has been
  successfully operating within the [sPHENIX](https://www.sphenix.bnl.gov/) experiment for nearly two years
  and is currently being adopted by [Belle II](https://www.belle2.org/). Additionally, other collaborations, including
  [ePIC](https://www.bnl.gov/eic/epic.php) and the [Einstein Telescope](https://www.et-gw.eu/), have expressed interest
  in evaluating its suitability for their needs.
summary: |
  The [Nopayloaddb](https://github.com/BNLNPPS/nopayloaddb) is an implementation of the HSF
  Conditions Database reference, an experiment-agnostic design for conditions data access management.
---

{% include gsoc_project.ext %}
@@ -0,0 +1,59 @@
---
title: Intelligent Log Analysis for the HSF Conditions Database
layout: gsoc_proposal
project: HSFCondDB
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
- BNL
---

## Description

The [nopayloaddb](https://github.com/BNLNPPS/nopayloaddb) project is an implementation of the HSF Conditions Database
reference. It provides a RESTful API for managing payloads, global tags, payload types, and associated data.

Our current system, composed of Nginx, Django, and a database ([link to Helm chart](https://github.com/BNLNPPS/nopayloaddb-charts)),
lacks a centralized logging solution, which makes it difficult to monitor and troubleshoot issues effectively.
This project will address this deficiency by implementing a centralized logging system that aggregates logs from multiple
components, and by developing a machine learning model to perform intelligent log analysis. The model will identify unusual
log entries indicative of software bugs, database bottlenecks, or other performance issues, allowing us to address
problems before they escalate. Additionally, by analyzing system metrics, the model will provide insights for optimally
adjusting parameters during periods of increased request rates.

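As an illustration of what centrally collectable logs could look like, the following is a minimal sketch that makes a Python service emit one JSON object per log line using only the standard `logging` module. The `JsonFormatter` class, the `build_logging_config` helper, the `component` field, and the `cdb.api` logger name are hypothetical and are not taken from nopayloaddb. Django's `LOGGING` setting uses this same `dictConfig` dictionary format, so a dictionary of this shape could in principle be assigned to it.

```python
import io
import json
import logging
import logging.config


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (one per line)."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "component": "django",  # hypothetical label used by the aggregator
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logging_config(stream="ext://sys.stdout"):
    """Return a dictConfig-style dictionary that logs JSON lines to `stream`."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {"json": {"()": JsonFormatter}},
        "handlers": {
            "stream": {
                "class": "logging.StreamHandler",
                "formatter": "json",
                "stream": stream,
            }
        },
        "root": {"handlers": ["stream"], "level": "INFO"},
    }


if __name__ == "__main__":
    # Write to an in-memory buffer so the emitted line can be inspected.
    buffer = io.StringIO()
    logging.config.dictConfig(build_logging_config(buffer))
    logging.getLogger("cdb.api").warning("slow query: %d ms", 1200)
    print(buffer.getvalue().strip())
```

Writing JSON lines to standard output is one common choice in Kubernetes deployments, because a log collector can then pick up container output without parsing free-form text; whether it suits this project is for the mentors and the student to decide.
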
## Steps

1. Set up a centralized logging system
2. Collect and structure logs from Nginx, Django, and the database
3. Develop an ML model for log grouping and anomaly detection
4. Implement a Kubernetes-based database with replication
5. Train an ML model to optimize Kubernetes parameters dynamically

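Steps 2 and 3 can be pictured with a small sketch. It assumes Nginx writes its default `combined` access log format, parses each line into a structured record, and groups free-text messages by masking digit runs so that similar messages share a template. The function names, the sample request path, and the "rare template" rule are hypothetical stand-ins for illustration, not the model the project is expected to deliver.

```python
import re
from collections import Counter

# Nginx "combined" format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
NGINX_COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

_NUMBER = re.compile(r"\d+")


def parse_nginx_line(line):
    """Parse one access log line into a dict, or return None if it does not match."""
    match = NGINX_COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["body_bytes_sent"] = int(record["body_bytes_sent"])
    return record


def to_template(text):
    """Mask digit runs so that messages differing only in numbers group together."""
    return _NUMBER.sub("<N>", text)


def rare_templates(messages, max_count=1):
    """Return templates seen at most `max_count` times, as candidate anomalies."""
    counts = Counter(to_template(message) for message in messages)
    return sorted(t for t, count in counts.items() if count <= max_count)


if __name__ == "__main__":
    # The request path below is made up for the example.
    line = (
        '10.0.0.5 - - [12/Mar/2025:10:15:32 +0000] '
        '"GET /api/example/?id=42 HTTP/1.1" 200 512 "-" "curl/8.5.0"'
    )
    print(parse_nginx_line(line))
    print(rare_templates(
        ["query took 12 ms", "query took 340 ms", "connection reset by peer"]
    ))
```

Masking variable tokens is a deliberately simple form of log grouping; an actual solution might use a dedicated log-template miner or learned embeddings instead.
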

## Expected Results

* A centralized logging system for improved monitoring and troubleshooting
* ML-powered anomaly detection
* ML-driven dynamic configuration for optimal performance

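To make the anomaly detection result concrete, a statistical baseline is often useful as a reference point before any model is trained. The sketch below flags time bins whose request count deviates strongly from a trailing window. It is not an ML model and is not part of the proposal; the function name, window size, and threshold are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def zscore_anomalies(counts, window=5, threshold=3.0):
    """Return indices whose count lies more than `threshold` standard
    deviations from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all is treated as a deviation.
            if counts[i] != mu:
                flagged.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Requests per minute, with one burst at index 6.
    requests_per_minute = [100, 102, 98, 101, 99, 100, 500, 101]
    print(zscore_anomalies(requests_per_minute))
```

A learned model would be expected to outperform this baseline on seasonal or multi-metric patterns; comparing against it is one way to check that the added complexity pays off.
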
## Requirements

* Python and a basic understanding of ML frameworks
* Basic understanding of Kubernetes (k8s), Helm, Operators, and OpenShift
* Django and Nginx: basic understanding of web frameworks and logging
* Database knowledge: PostgreSQL and database replication

## Mentors

- **Ruslan Mashinistov [[email protected]](mailto:[email protected]) BNL**
- John S. De Stefano Jr. [[email protected]](mailto:[email protected]) BNL
- Michel Hernandez Villanueva [[email protected]](mailto:[email protected]) BNL

## Links

* Django REST API: https://github.com/BNLNPPS/nopayloaddb
* Automated deployment with Helm chart: https://github.com/BNLNPPS/nopayloaddb-charts
@@ -20,15 +20,18 @@ layout: plain
* David Lange [[email protected]](mailto:[email protected]) CompRes
* Serguei Linev [[email protected]](mailto:[email protected]) GSI
* Johan Mabille [[email protected]](mailto:[email protected]) QuantStack
* Ruslan Mashinistov [[email protected]](mailto:[email protected]) BNL
* Peter McKeown [[email protected]](mailto:[email protected]) CERN
* Felice Pantaleo [[email protected]](mailto:[email protected]) CERN
* Giacomo Parolini [[email protected]](mailto:[email protected]) CERN
* Alexander Penev [[email protected]](mailto:[email protected]) CompRes/University of Plovdiv, BG
* Mayank Sharma [[email protected]](mailto:[email protected]) UMich
* Simon Spannagel [[email protected]](mailto:[email protected]) DESY
* John De Stefano [[email protected]](mailto:[email protected]) BNL
* Graeme Stewart [[email protected]](mailto:[email protected]) CERN
* Maciej Szymański [[email protected]](mailto:[email protected]) ANL
* Peter Van Gemmeren [[email protected]](mailto:[email protected]) ANL
* Martin Vasilev [[email protected]](mailto:[email protected]) University of Plovdiv, BG
* Vassil Vassilev [[email protected]](mailto:[email protected]) CompRes
* Michel Hernandez Villanueva [[email protected]](mailto:[email protected]) BNL
* Valentin Volkl [[email protected]](mailto:[email protected]) CERN