Populate validated_runs field in records Closes #3746 #3749

Closed
IssaAlBawwab wants to merge 4 commits into cernopendata:master from IssaAlBawwab:master

@IssaAlBawwab
Member

This commit adds the new validated_runs field to all relevant data records.

A data migration script was run to parse the existing abstract field and create the new structured data. The script correctly identifies records that mention either "validated runs" or "validated lumi sections," ensuring consistency across all datasets.

To maintain backward compatibility, the original abstract field and its links have been preserved. This change only enriches the records with the new, machine-readable field.

Closes #3746
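
A minimal sketch of what such a migration step could look like, assuming each record is a JSON object whose `abstract` holds an HTML `description` with `/record/<recid>` links. The function names, the `validation_by_recid` lookup table, and the `"muons"` value are hypothetical; only `recid`, `validation`, and `"full"` appear in the diff under review. The lookup is explicit because the abstract text alone may not say which linked list is the full one.

```python
import re

# Assumption: these phrases mark abstracts that reference validated-run lists.
MENTION_RE = re.compile(r"validated\s+(?:runs|lumi\s+sections)", re.IGNORECASE)
# Assumption: the lists are linked as href="/record/<recid>" in the abstract HTML.
RECORD_LINK_RE = re.compile(r'href="/record/(\d+)"')


def build_validated_runs(record, validation_by_recid):
    """Return validated_runs entries for one record, or [] if none apply."""
    text = record.get("abstract", {}).get("description", "")
    if not MENTION_RE.search(text):
        return []
    entries = []
    for recid in RECORD_LINK_RE.findall(text):
        # Keep only links that have been classified explicitly; never guess "full".
        if recid in validation_by_recid:
            entries.append({"recid": recid, "validation": validation_by_recid[recid]})
    return entries


def enrich(record, validation_by_recid):
    """Return a copy of the record with validated_runs added; abstract is untouched."""
    entries = build_validated_runs(record, validation_by_recid)
    if not entries:
        return record
    return {**record, "validated_runs": entries}
```

Because `enrich` only adds a key and never rewrites `abstract`, the original field and its links stay as they were.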

{
"recid": "14221",
"validation": "full"
}
Member

@katilp Jul 3, 2025

There's something weird here: 14220 is full, 14221 is muons only.

Member Author

How am I supposed to know which is which? The description doesn't specify.

{
"recid": "14201",
"validation": "full"
}
Member

@katilp Jul 3, 2025

The same here: a record should not have two full validated_runs entries. 14200 is for the 0.9 TeV sample, 14201 is for the 7 TeV sample.

},
{
"recid": "14221",
"validation": "full"
Member

14221 is muons only

},
"methodology": {
"description": "<p>Events stored in this primary dataset were selected because of the presence of at least two muons in the event requiring only very low energy and some requirements in the invariant mass for selecting low-mass resonances.</p><p><strong>Data taking / HLT</strong><br/>The collision data were assigned to different RAW datasets using the following <a href=\"/record/1701\">HLT configuration</a>.</p><p><strong>Data processing / RECO</strong><br/>This primary AOD dataset was processed from the RAW dataset by the following step: <br/>Step: RECO<br/>Release: CMSSW_5_3_7_patch5<br/>Global tag: FT_R_53_V18::All\n <br/><a href=\"/record/7092\">Configuration file for RECO step reco_2012A_MuOnia</a>\n </p><p><strong>HLT trigger paths</strong><br/>The possible <a href=\"/docs/cms-guide-trigger-system#hlt-trigger-path-definitions\">HLT trigger paths</a> in this dataset are:<br/><a href=\"/search?q=HLT_Dimuon0_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Jpsi</a><br/><a href=\"/search?q=HLT_Dimuon0_Jpsi_Muon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Jpsi_Muon</a><br/><a href=\"/search?q=HLT_Dimuon0_Jpsi_NoVertexing&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Jpsi_NoVertexing</a><br/><a href=\"/search?q=HLT_Dimuon0_PsiPrime&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_PsiPrime</a><br/><a href=\"/search?q=HLT_Dimuon0_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Upsilon</a><br/><a href=\"/search?q=HLT_Dimuon0_Upsilon_Muon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Upsilon_Muon</a><br/><a href=\"/search?q=HLT_Dimuon11_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon11_Upsilon</a><br/><a href=\"/search?q=HLT_Dimuon3p5_SameSign&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon3p5_SameSign</a><br/><a href=\"/search?q=HLT_Dimuon5_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon5_Upsilon</a><br/><a 
href=\"/search?q=HLT_Dimuon7_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon7_Upsilon</a><br/><a href=\"/search?q=HLT_Dimuon8_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon8_Upsilon</a><br/><a href=\"/search?q=HLT_DoubleMu3_4_Dimuon5_Bs_Central&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu3_4_Dimuon5_Bs_Central</a><br/><a href=\"/search?q=HLT_DoubleMu3p5_4_Dimuon5_Bs_Central&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu3p5_4_Dimuon5_Bs_Central</a><br/><a href=\"/search?q=HLT_DoubleMu4_Dimuon7_Bs_Forward&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu4_Dimuon7_Bs_Forward</a><br/><a href=\"/search?q=HLT_DoubleMu4_JpsiTk_Displaced&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu4_JpsiTk_Displaced</a><br/><a href=\"/search?q=HLT_DoubleMu4_Jpsi_Displaced&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu4_Jpsi_Displaced</a><br/><a href=\"/search?q=HLT_Mu5_L2Mu3_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Mu5_L2Mu3_Jpsi</a><br/><a href=\"/search?q=HLT_Mu5_Track2_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Mu5_Track2_Jpsi</a><br/><a href=\"/search?q=HLT_Mu5_Track3p5_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Mu5_Track3p5_Jpsi</a><br/><a href=\"/search?q=HLT_Mu7_Track7_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Mu7_Track7_Jpsi</a><br/><a href=\"/search?q=HLT_Tau2Mu_ItTrack&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Tau2Mu_ItTrack</a></p>"
"description": "<p>Events stored in this primary dataset were selected because of the presence of at least two\u00a0muons\u00a0in the event requiring only very low energy and some requirements in the invariant mass for selecting low-mass resonances.</p><p><strong>Data taking / HLT</strong><br/>The collision data were assigned to different RAW datasets using the following <a href=\"/record/1701\">HLT configuration</a>.</p><p><strong>Data processing / RECO</strong><br/>This primary AOD dataset was processed from the RAW dataset by the following step: <br/>Step: RECO<br/>Release: CMSSW_5_3_7_patch5<br/>Global tag: FT_R_53_V18::All\n <br/><a href=\"/record/7092\">Configuration file for RECO step reco_2012A_MuOnia</a>\n </p><p><strong>HLT trigger paths</strong><br/>The possible <a href=\"/docs/cms-guide-trigger-system#hlt-trigger-path-definitions\">HLT trigger paths</a> in this dataset are:<br/><a href=\"/search?q=HLT_Dimuon0_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Jpsi</a><br/><a href=\"/search?q=HLT_Dimuon0_Jpsi_Muon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Jpsi_Muon</a><br/><a href=\"/search?q=HLT_Dimuon0_Jpsi_NoVertexing&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Jpsi_NoVertexing</a><br/><a href=\"/search?q=HLT_Dimuon0_PsiPrime&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_PsiPrime</a><br/><a href=\"/search?q=HLT_Dimuon0_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Upsilon</a><br/><a href=\"/search?q=HLT_Dimuon0_Upsilon_Muon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon0_Upsilon_Muon</a><br/><a href=\"/search?q=HLT_Dimuon11_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon11_Upsilon</a><br/><a href=\"/search?q=HLT_Dimuon3p5_SameSign&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon3p5_SameSign</a><br/><a href=\"/search?q=HLT_Dimuon5_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon5_Upsilon</a><br/><a 
href=\"/search?q=HLT_Dimuon7_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon7_Upsilon</a><br/><a href=\"/search?q=HLT_Dimuon8_Upsilon&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Dimuon8_Upsilon</a><br/><a href=\"/search?q=HLT_DoubleMu3_4_Dimuon5_Bs_Central&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu3_4_Dimuon5_Bs_Central</a><br/><a href=\"/search?q=HLT_DoubleMu3p5_4_Dimuon5_Bs_Central&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu3p5_4_Dimuon5_Bs_Central</a><br/><a href=\"/search?q=HLT_DoubleMu4_Dimuon7_Bs_Forward&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu4_Dimuon7_Bs_Forward</a><br/><a href=\"/search?q=HLT_DoubleMu4_JpsiTk_Displaced&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu4_JpsiTk_Displaced</a><br/><a href=\"/search?q=HLT_DoubleMu4_Jpsi_Displaced&subtype=Trigger&type=Supplementaries&year=2012\">HLT_DoubleMu4_Jpsi_Displaced</a><br/><a href=\"/search?q=HLT_Mu5_L2Mu3_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Mu5_L2Mu3_Jpsi</a><br/><a href=\"/search?q=HLT_Mu5_Track2_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Mu5_Track2_Jpsi</a><br/><a href=\"/search?q=HLT_Mu5_Track3p5_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Mu5_Track3p5_Jpsi</a><br/><a href=\"/search?q=HLT_Mu7_Track7_Jpsi&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Mu7_Track7_Jpsi</a><br/><a href=\"/search?q=HLT_Tau2Mu_ItTrack&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Tau2Mu_ItTrack</a></p>"
Member

Some unintended editing? Note "two\u00a0muons\u00a0in" instead of "two muons in".

Member Author

Encoding problem, I'll fix it.
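
One plausible cause of this diff, assuming the migration script is written in Python and rewrites the files with the standard `json` module: `json.dumps` escapes non-ASCII characters by default, so a literal non-breaking space (U+00A0) comes back as `\u00a0`. The sketch below shows the effect and the `ensure_ascii=False` option that keeps such characters literal. Whether the actual script works this way is not stated in the thread.

```python
import json

# A description containing literal non-breaking spaces (U+00A0).
record = {"description": "two\u00a0muons\u00a0in the event"}

escaped = json.dumps(record)                        # default: ensure_ascii=True
preserved = json.dumps(record, ensure_ascii=False)  # non-ASCII characters stay literal

print(escaped)    # contains the six-character escape sequence \u00a0
print(preserved)  # contains the actual U+00A0 character

# Both forms decode to the same data; only the on-disk text differs.
assert json.loads(escaped) == json.loads(preserved) == record
```

When writing the result back to disk, opening the file with `encoding="utf-8"` avoids depending on the platform default encoding.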

},
"methodology": {
"description": "<p>Events stored in this primary dataset were selected because of the presence of low-energy particles (soft-QCD events). This trigger is favorable over the ZeroBias one when the luminosity, and thus the rate of collisions to tape, is too low for ZeroBias to be effective.</p><p><strong>Data taking / HLT</strong><br/>The collision data were assigned to different RAW datasets using the following <a href=\"/record/1701\">HLT configuration</a>.</p><p><strong>Data processing / RECO</strong><br/>This primary AOD dataset was processed from the RAW dataset by the following step: <br/>Step: RECO<br/>Release: CMSSW_5_3_7_patch5<br/>Global tag: FT_R_53_V18::All\n <br/><a href=\"/record/7026\">Configuration file for RECO step reco_2012B_MinimumBias</a>\n </p><p><strong>HLT trigger paths</strong><br/>The possible <a href=\"/docs/cms-guide-trigger-system#hlt-trigger-path-definitions\">HLT trigger paths</a> in this dataset are:<br/><a href=\"/search?q=HLT_JetE30_NoBPTX&subtype=Trigger&type=Supplementaries&year=2012\">HLT_JetE30_NoBPTX</a><br/><a href=\"/search?q=HLT_JetE30_NoBPTX3BX_NoHalo&subtype=Trigger&type=Supplementaries&year=2012\">HLT_JetE30_NoBPTX3BX_NoHalo</a><br/><a href=\"/search?q=HLT_JetE50_NoBPTX3BX_NoHalo&subtype=Trigger&type=Supplementaries&year=2012\">HLT_JetE50_NoBPTX3BX_NoHalo</a><br/><a href=\"/search?q=HLT_JetE70_NoBPTX3BX_NoHalo&subtype=Trigger&type=Supplementaries&year=2012\">HLT_JetE70_NoBPTX3BX_NoHalo</a><br/><a href=\"/search?q=HLT_Physics&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Physics</a><br/><a href=\"/search?q=HLT_Physics5E33_2b&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Physics5E33_2b</a><br/><a href=\"/search?q=HLT_Physics5E33_48b&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Physics5E33_48b</a><br/><a href=\"/search?q=HLT_Physics5E33_84b&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Physics5E33_84b</a><br/><a 
href=\"/search?q=HLT_PixelTracks_Multiplicity70&subtype=Trigger&type=Supplementaries&year=2012\">HLT_PixelTracks_Multiplicity70</a><br/><a href=\"/search?q=HLT_PixelTracks_Multiplicity80&subtype=Trigger&type=Supplementaries&year=2012\">HLT_PixelTracks_Multiplicity80</a><br/><a href=\"/search?q=HLT_PixelTracks_Multiplicity90&subtype=Trigger&type=Supplementaries&year=2012\">HLT_PixelTracks_Multiplicity90</a><br/><a href=\"/search?q=HLT_Random&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Random</a><br/><a href=\"/search?q=HLT_ZeroBias&subtype=Trigger&type=Supplementaries&year=2012\">HLT_ZeroBias</a><br/><a href=\"/search?q=HLT_ZeroBiasPixel_DoubleTrack&subtype=Trigger&type=Supplementaries&year=2012\">HLT_ZeroBiasPixel_DoubleTrack</a></p>"
"description": "<p>Events stored in this primary dataset were selected because of the presence of low-energy particles (soft-QCD events). This\u00a0trigger\u00a0is favorable over the ZeroBias one when the luminosity, and thus the rate of collisions to tape, is too low for ZeroBias to be effective.</p><p><strong>Data taking / HLT</strong><br/>The collision data were assigned to different RAW datasets using the following <a href=\"/record/1701\">HLT configuration</a>.</p><p><strong>Data processing / RECO</strong><br/>This primary AOD dataset was processed from the RAW dataset by the following step: <br/>Step: RECO<br/>Release: CMSSW_5_3_7_patch5<br/>Global tag: FT_R_53_V18::All\n <br/><a href=\"/record/7026\">Configuration file for RECO step reco_2012B_MinimumBias</a>\n </p><p><strong>HLT trigger paths</strong><br/>The possible <a href=\"/docs/cms-guide-trigger-system#hlt-trigger-path-definitions\">HLT trigger paths</a> in this dataset are:<br/><a href=\"/search?q=HLT_JetE30_NoBPTX&subtype=Trigger&type=Supplementaries&year=2012\">HLT_JetE30_NoBPTX</a><br/><a href=\"/search?q=HLT_JetE30_NoBPTX3BX_NoHalo&subtype=Trigger&type=Supplementaries&year=2012\">HLT_JetE30_NoBPTX3BX_NoHalo</a><br/><a href=\"/search?q=HLT_JetE50_NoBPTX3BX_NoHalo&subtype=Trigger&type=Supplementaries&year=2012\">HLT_JetE50_NoBPTX3BX_NoHalo</a><br/><a href=\"/search?q=HLT_JetE70_NoBPTX3BX_NoHalo&subtype=Trigger&type=Supplementaries&year=2012\">HLT_JetE70_NoBPTX3BX_NoHalo</a><br/><a href=\"/search?q=HLT_Physics&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Physics</a><br/><a href=\"/search?q=HLT_Physics5E33_2b&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Physics5E33_2b</a><br/><a href=\"/search?q=HLT_Physics5E33_48b&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Physics5E33_48b</a><br/><a href=\"/search?q=HLT_Physics5E33_84b&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Physics5E33_84b</a><br/><a 
href=\"/search?q=HLT_PixelTracks_Multiplicity70&subtype=Trigger&type=Supplementaries&year=2012\">HLT_PixelTracks_Multiplicity70</a><br/><a href=\"/search?q=HLT_PixelTracks_Multiplicity80&subtype=Trigger&type=Supplementaries&year=2012\">HLT_PixelTracks_Multiplicity80</a><br/><a href=\"/search?q=HLT_PixelTracks_Multiplicity90&subtype=Trigger&type=Supplementaries&year=2012\">HLT_PixelTracks_Multiplicity90</a><br/><a href=\"/search?q=HLT_Random&subtype=Trigger&type=Supplementaries&year=2012\">HLT_Random</a><br/><a href=\"/search?q=HLT_ZeroBias&subtype=Trigger&type=Supplementaries&year=2012\">HLT_ZeroBias</a><br/><a href=\"/search?q=HLT_ZeroBiasPixel_DoubleTrack&subtype=Trigger&type=Supplementaries&year=2012\">HLT_ZeroBiasPixel_DoubleTrack</a></p>"
Member

\u00a0trigger\u00a0 ?

Member Author

encoding problem

Member

@katilp left a comment

Thanks!

Please check the comments:

  • There are some cases where a record has two full validated_runs entries; there should be only one.
  • Check the unintended changes to the methodology description in many records. I pointed out a few, but there are more.
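
The first point could be checked mechanically before pushing. A minimal sketch, assuming each record carries a top-level `recid` and a `validated_runs` list of objects with a `validation` key, as the diff hunks above suggest; the function name is hypothetical.

```python
from collections import Counter


def records_with_multiple_full(records):
    """Return recids of records whose validated_runs has more than one "full" entry."""
    flagged = []
    for record in records:
        counts = Counter(
            entry.get("validation") for entry in record.get("validated_runs", [])
        )
        if counts["full"] > 1:
            flagged.append(record.get("recid"))
    return flagged
```

Running this over every record file touched by the migration would list the records that still need a manual look.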

IssaAlBawwab and others added 3 commits July 3, 2025 11:59
Development

Successfully merging this pull request may close these issues.

RFC a new field to store information about validated runs