Commit 872efbc

Add atlasopenmagic DID finder
1 parent e9f9074 commit 872efbc

16 files changed, +1298 -0 lines changed

.github/workflows/deploy-config.json

Lines changed: 5 additions & 0 deletions
@@ -14,6 +14,11 @@
     "image_name": "servicex-did-finder-cernopendata",
     "test_required": true
   },
+  {
+    "dir_name": "did_finder_atlasopenmagic",
+    "image_name": "servicex-did-finder-atlasopenmagic",
+    "test_required": true
+  },
   {
     "dir_name": "did_finder_rucio",
     "image_name": "servicex-did-finder",
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+[run]
+branch = True
+source = src/
+
+[report]
+exclude_lines =
+    if self.debug:
+    pragma: no cover
+    raise NotImplementedError
+    if __name__ == .__main__.:
+ignore_errors = True
+omit =
+    tests/*

did_finder_atlasopenmagic/.flake8

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+[flake8]
+max-line-length=99
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# Auto detect text files and perform LF normalization
+* text=auto
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+.pytest_cache/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+.python-version
+
+# celery beat schedule file
+celerybeat-schedule
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+.vscode/settings.json
+
+.idea/
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+FROM python:3.14
+
+LABEL maintainer="Peter Onyisi <[email protected]>"
+
+USER root
+
+# Create app directory
+WORKDIR /opt/servicex
+
+# Create celery user. Assign them to group zero since that is the group OpenShift will run the container as
+RUN useradd -g 0 -ms /bin/bash celery
+
+ENV POETRY_VERSION=2.3
+
+RUN pip install poetry==$POETRY_VERSION
+
+COPY pyproject.toml pyproject.toml
+COPY poetry.lock poetry.lock
+
+# Bring over the main scripts
+COPY . .
+
+ENV XDG_CONFIG_HOME=/opt/servicex
+RUN poetry config virtualenvs.in-project true
+RUN poetry install --no-root --no-interaction --no-ansi
+
+# Change ownership of the app directory to celery user. Also set the group to zero since
+# that is the group OpenShift will run the container as
+RUN chown -R celery:0 /opt/servicex
+RUN chmod -R g=u /opt/servicex
+
+# Switch to celery user
+USER celery
+
+# Make sure python isn't buffered
+ENV PYTHONUNBUFFERED=1
+
+ENV PYTHONPATH=/opt/servicex/src
+ENV BROKER_URL="amqp://guest:guest@localhost:5672//"
+
+ENTRYPOINT [ "scripts/start_celery_worker.sh"]
Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
# ServiceX_DID_finder_ATLASOpenMagic
2+
Access datasets for ServiceX with `atlasopenmagic`.
3+
4+
## Finding datasets
5+
6+
The ATLAS experiment hosts releases of [ATLAS Open Data](https://opendata.atlas.cern/). The proton-proton simulations and data are ROOT files, which come in two flavors: [education](https://opendata.atlas.cern/docs/category/data-for-education) and [research](https://opendata.atlas.cern/docs/category/data-for-research). The former are flat ROOT files with a limited set of branches, and the latter are in the PHYSLITE format which is also readable via `uproot` although with a more complicated schema (the `coffea` PHYSLITE schema is recommended for analysis). There are additional datasets for heavy ion collisions (also ROOT) and pure event generator output (in HEPMC format).
7+
8+
The [`atlasopenmagic`](https://github.com/atlas-outreach-data-tools/atlasopenmagic) package is provided as an interface to this data. To look up the files for a dataset, you must provide the _release_ for the dataset, as well as the _dataset ID_ (this is either a numeric string, corresponding to the ATLAS Monte Carlo simulation sample ID, or the string "data" for data). Optionally, where supported by the release, a _skim_ can be specifed. The `atlasopenmagic` DID finder accepts a single string encoding all these, which must be of the form `<release>/<dataset_id>` or `<release>/<dataset_id>/<skim>` where the appropriate replacements are made. The assumption is that the user will use `atlasopenmagic` directly in their code to handle the metadata aspects of using the ATLAS Open Data (e.g.\ sample cross sections, initial number of events, etc.) while using ServiceX to apply event selection and column reduction to the files.
9+
10+
Check the `helm/servicex/templates/did-finder-atlasopenmagic/deployment.yaml` file for an example of how to deploy this DID finder.
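
The DID string format described in the README (`<release>/<dataset_id>` or `<release>/<dataset_id>/<skim>`) can be illustrated with a small parser. This is only a sketch using the standard library, not the code added in this commit: the names `OpenMagicDID` and `parse_did` are hypothetical, the validation rules are inferred from the README wording alone, and the release, dataset ID, and skim values in the usage lines are illustrative placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OpenMagicDID:
    """Hypothetical container for the parts of an atlasopenmagic DID."""
    release: str
    dataset_id: str
    skim: Optional[str] = None


def parse_did(did: str) -> OpenMagicDID:
    """Split '<release>/<dataset_id>[/<skim>]' into its components.

    Assumption: the dataset ID is either the literal string 'data' or a
    string of digits, as the README describes.
    """
    parts = did.strip().split("/")
    # Exactly two or three non-empty, slash-separated fields are accepted.
    if len(parts) not in (2, 3) or any(not part for part in parts):
        raise ValueError(
            f"Expected '<release>/<dataset_id>' or "
            f"'<release>/<dataset_id>/<skim>', got {did!r}"
        )
    release, dataset_id = parts[0], parts[1]
    if dataset_id != "data" and not dataset_id.isdigit():
        raise ValueError(
            f"Dataset ID must be numeric or 'data', got {dataset_id!r}"
        )
    skim = parts[2] if len(parts) == 3 else None
    return OpenMagicDID(release=release, dataset_id=dataset_id, skim=skim)


# Illustrative values only; valid releases and skims depend on atlasopenmagic.
print(parse_did("some-release/data"))
print(parse_did("some-release/123456/some-skim"))
```

Keeping the skim as `None` when it is absent lets the caller decide what default to pass on, rather than hard-coding one in the parser.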
