Collaborator

@MattShirley MattShirley commented Sep 23, 2025

Backend work for supporting custom TopCP images from whitelisted Docker registry repositories: ssl-hep/ServiceX_frontend#658

Allowed images can be defined using the app.allowedDockerRegistries value, where each key should be a Docker registry and each value should be a list of repositories/images to validate against. Each string in the list is tested as a prefix of the custom image name; if any of them matches, the image is allowed. The default value allows any tag from the Docker Hub sslhep/ repository.

app:
  allowedImagePrefixes:
    - "sslhep/"
    - "docker.io/sslhep"
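
For readers who want to see the prefix rule spelled out, here is a minimal Python sketch of the check described above. It is an illustration only, not the code from this PR: the function name `is_image_allowed` and the `DEFAULT_PREFIXES` constant are hypothetical, and the prefixes are simply copied from the YAML above.

```python
from typing import Iterable


def is_image_allowed(image: str, allowed_prefixes: Iterable[str]) -> bool:
    """Return True if `image` starts with at least one allowed prefix."""
    return any(image.startswith(prefix) for prefix in allowed_prefixes)


# Hypothetical constant mirroring the YAML values shown above.
DEFAULT_PREFIXES = ["sslhep/", "docker.io/sslhep"]

# An image from the sslhep/ repository passes; an unrelated one does not.
print(is_image_allowed("sslhep/servicex_science_image_topcp:25.2.50-v2.18.0_v0.2", DEFAULT_PREFIXES))
print(is_image_allowed("example/other_image:latest", DEFAULT_PREFIXES))
```

One consequence of plain prefix matching: a prefix without a trailing slash, such as "docker.io/sslhep", would also match a differently named repository that merely begins with the same characters, and an empty prefix "" matches every image.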

Sample files that work with a custom TopCP image but not with the default/base image:

# reco.yaml
CommonServices:
  runSystematics: False

PileupReweighting: {}

EventCleaning: {}

Trigger:
    triggerChainsPerYear:
        '2015':
            - 'HLT_2mu6_bBmumuxv2'
            - 'HLT_2mu4_bBmumuxv2'
        '2016':
            - 'HLT_2mu6_bBmumux_BsmumuPhi_delayed'
            - 'HLT_2mu6_bBmumux_BsmumuPhi_delayed_L1BPH-2M9-2MU6_BPH-2DR15-2MU6'
            - 'HLT_2mu6_bBmumux_BsmumuPhi_L1BPH-2M9-2MU6_BPH-2DR15-2MU6'
            - 'HLT_mu10_mu6_bBmumux_BsmumuPhi_delayed'
            - 'HLT_2mu6_bBmumuxv2'
            - 'HLT_2mu6_bBmumuxv2_delayed'
        '2017':
            - 'HLT_mu6_mu4_bBmumux_BsmumuPhi_L1BPH-2M9-MU6MU4_BPH-0DR15-MU6MU4'
            - 'HLT_2mu6_bBmumux_BsmumuPhi_L1BPH-2M9-2MU6_BPH-2DR15-2MU6'
            - 'HLT_2mu6_bJpsimumu_L1BPH-2M9-2MU6_BPH-2DR15-2MU6'
            - 'HLT_2mu10_bBmumuxv2'
        '2018':
            - 'HLT_mu6_mu4_bBmumux_BsmumuPhi_L1BPH-2M9-MU6MU4_BPH-0DR15-MU6MU4'
            - 'HLT_2mu6_bBmumux_BsmumuPhi_L1BPH-2M9-2MU6_BPH-2DR15-2MU6'
            - 'HLT_2mu6_bJpsimumu_L1BPH-2M9-2MU6_BPH-2DR15-2MU6'
            - 'HLT_2mu10_bBmumuxv2'
    noFilter: True
    noGlobalTriggerEff: True

Muons:
  - containerName: AnaMuons
    WorkingPoint:
      - selectionName: loose
        quality: Loose
        isolation: NonIso
      - selectionName: medium
        quality: Medium
        isolation: NonIso

BPhyVertex:
  - vertexName: 'BPHY28BsKKMuMuCandidates'
    decorations: []


BPHY28:
  - vertexName: 'BPHY28BsKKMuMuCandidates'
    inputMuons: 'AnaMuons'
    decorateTrackParameters: True
    decoratePixelHits: True
    decorateDedx: True

# After configuring each container, many variables will be saved automatically.
Output:
  treeName: 'reco'
  vars: []
  metVars: []
  containers:
      # Format should follow: '<suffix>:<output container>'
#      vtx_: 'BPHY28BsKKMuMuCandidates'
      mu_: 'AnaMuons'
      '': 'EventInfo'
  commands: []

AddConfigBlocks:
  - modulePath: 'TopCPToolkit.BPHY28Config'
    functionName: 'BPHY28Config'
    algName: 'BPHY28'
    pos: 'Output'
  - modulePath: 'TopCPToolkit.BPhyVertexConfig'
    functionName: 'BPhyVertexConfig'
    algName: 'BPhyVertex'
    pos: 'Output'

# spec.yaml
General:
  OutFilesetName: "fileset_ntuple_v1_af"
  OutputDirectory: "."

Sample:
  ## data18
  - Name: data18_periodD
    Dataset: !Rucio data18_13TeV:DAOD_BPHY28.44180527._000001.pool.root.1
    Query: !TopCP |
      reco=reco.yaml,
      image=sslhep/servicex_science_image_topcp:25.2.50-v2.18.0_v0.2

@MattShirley MattShirley linked an issue Sep 23, 2025 that may be closed by this pull request

codecov bot commented Oct 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.56%. Comparing base (5905e54) to head (53b3af0).
⚠️ Report is 1 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1163      +/-   ##
===========================================
+ Coverage    86.25%   86.56%   +0.30%     
===========================================
  Files           95       95              
  Lines         3281     3304      +23     
  Branches       376      381       +5     
===========================================
+ Hits          2830     2860      +30     
+ Misses         377      373       -4     
+ Partials        74       71       -3     


@MattShirley MattShirley changed the title from "support custom TopCP images" to "support custom science images" Jan 6, 2026
@MattShirley
Collaborator Author

@ponyisi this PR has been updated to support CERN's GitLab instance. It also now validates all science images.

if registry == "docker.io":
    return self._dockerhub_get_image_by_tag(repo, image, tag)

if registry == "gitlab-registry.cern.ch":
Contributor

Is there really no way to make this generic? This is pretty 🤢

Collaborator Author

I added crane to the image and simplified this considerably

sqlalchemyEngineOptions: null
allowedDockerRegistries:
"docker.io":
allowedImagePrefixes:
Collaborator

Is a prefix mandatory, or can this just be ""?

Collaborator Author

Right now a prefix is mandatory; however, it can just be "". I don't think we should include anything with "" in the default values.yaml, but a cluster operator could choose to do so. Do you think we should keep KyungEon's GitLab repository by default? @kyungeonchoi, do you know how easy it would be to set up a shared namespace in CERN's GitLab, like sslhep-customimages, that isn't tied to a specific user name?

Collaborator

Personally I'd tend to say we just whitelist the full gitlab-registry.cern.ch by default


request_rec.image = codegen_transformer_image
try:
    jquery = json.loads(args["selection"])
Collaborator

This is not going to work generically for all codegens: many do not use JSON for their selection, and even those that do don't all have an "image" key.

Collaborator Author

This has been moved to the TopCP Code Gen's standard image setting logic.

registry = jquery.get("registry", "docker.io")

transformer_image = codegen_transformer_image
if "image" in jquery:
Collaborator

Again, I don't really understand the point of doing this. If the codegen is doing the correct thing, codegen_transformer_image, which is under the control of the codegen that interprets the selection request, will have the image we want. We should not have any logic here that tries to peek into the selection; it must be kept opaque at this level. Basically, we should take codegen_transformer_image and prefix it with docker.io if necessary, and that's it.

Collaborator Author

This has been moved to the TopCP Code Gen's standard image setting logic. Now the code moves immediately to validate the image name.
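
To make the "prefix it with docker.io if necessary" suggestion from this thread concrete, here is a small Python sketch. It is not the PR's implementation; `with_default_registry` is a hypothetical helper, and it relies on the common Docker convention that the first path component names a registry only when it contains a "." or ":" or is "localhost".

```python
DEFAULT_REGISTRY = "docker.io"


def with_default_registry(image: str, default_registry: str = DEFAULT_REGISTRY) -> str:
    """Prefix `image` with the default registry unless it already names one."""
    first, sep, _rest = image.partition("/")
    # Without a "/" there is no registry component at all.
    has_registry = bool(sep) and ("." in first or ":" in first or first == "localhost")
    return image if has_registry else f"{default_registry}/{image}"


print(with_default_registry("sslhep/servicex_science_image_topcp:25.2.50-v2.18.0_v0.2"))
print(with_default_registry("gitlab-registry.cern.ch/some-group/some-image:tag"))
```

Normalizing first means the allow-list check only ever sees fully qualified names. This sketch does not add Docker Hub's implicit "library/" namespace for single-component names such as "ubuntu:22.04".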

Collaborator

@ponyisi ponyisi left a comment

OK, let's go

@ponyisi ponyisi merged commit 986a0da into develop Jan 16, 2026
137 of 152 checks passed
@ponyisi ponyisi deleted the 1162-support-custom-topcp-images branch January 16, 2026 19:33

Development

Successfully merging this pull request may close these issues.

Support custom TopCP images

5 participants