Updates for condor on CMS Connect in singularity #3845
bryates wants to merge 5 commits into cms-sw:mg265UL
this does not affect phase space integration?
No, this is the step after when EFT reweighting is done. I can remove this too since it's not for general use.
Is this due to a MadGraph bug where it doesn't work when nb_core is not 1? If so, it might be worth keeping.
This might no longer be the case, but when the reweighting step was run in a multicore setup it would crash or otherwise incorrectly combine the jobs produced by each core.
From what I remember, it looked like some sort of race condition: the code would crash if the first (initial) job didn't finish first, and would complete if it did. Even in that case I was skeptical of the output, since the subsequent jobs seemed to overwrite it in a way that did not look correct (e.g. the output file size bounced around as the jobs completed, with no indication that any post-job merging was going on).
The other reason to limit the number of cores explicitly is that otherwise MadGraph will default to using as many cores as possible on the host machine. These gridpacks were produced using CMSConnect, so the reweighting step would end up saturating every core on the CMSConnect submission node, which would be very bad.
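For illustration only, here is a minimal shell sketch of pinning the core count before the reweighting step. It assumes the setting is read from a plain-text configuration file in `key = value` form; the helper name `set_nb_core` and the file path passed to it are hypothetical and not taken from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: force a fixed core count by writing "nb_core = N"
# into a key = value style configuration file (path supplied by the caller).
set_nb_core() {
    local config_file="$1"   # assumed configuration file path
    local ncores="${2:-1}"   # default to a single core

    touch "$config_file"
    # Remove any earlier nb_core line, commented out or not, so the
    # setting appears exactly once. grep -v exits non-zero when nothing
    # is left to print, hence the "|| true".
    grep -v -E '^[[:space:]]*#?[[:space:]]*nb_core[[:space:]]*=' \
        "$config_file" > "${config_file}.tmp" || true
    mv "${config_file}.tmp" "$config_file"
    echo "nb_core = ${ncores}" >> "$config_file"
}
```

Calling `set_nb_core path/to/config 1` before reweighting would keep the step from spreading over every core of the submission node, at the cost of a slower run.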
Ah... yes, MadGraph had this bug for the initial phase space integration a long time ago (eating up all cores), but it looks like this behavior is still there for reweighting in 2.6.5 (not sure about 2.9.x versions).
@bbilin @lviliani maybe consider this for the master branch as well? Or, to raise again what @DickyChant and I were hoping for: it's a lot easier to just keep master as the default for all campaigns, instead of spending time fixing something on both branches, and to deprecate the old ones if the analyses are new anyway.
el9 gridpacks won't work for official production yet (that's why we didn't care much about el9 in the script), or is the container-in-container setup prepared? @DickyChant
I think we should officially disable el9 instead of making it work like this (el9 gridpacks are not going to be usable for official production); I talked to sitian about it.
This is not safe, since not everyone will be using a model that lives under the addons directory, and the tar command will crash if that directory does not exist. Why not just steer it with extramodels.dat?
The model will then be fetched with wget from the link.
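To illustrate the failure mode: tar exits with an error when asked to archive a path that does not exist. A minimal sketch of a guard is below; the helper name `pack_optional_dir` and its arguments are hypothetical and are not part of the script under review.

```shell
#!/usr/bin/env bash
# Hypothetical helper: archive an optional directory only when it exists,
# so a missing directory does not abort the calling script.
pack_optional_dir() {
    local archive="$1"   # output tarball name
    local dir="$2"       # directory that may or may not be present

    if [ -d "$dir" ]; then
        tar -czf "$archive" "$dir"
    else
        # Warn on stderr and carry on without creating an archive.
        echo "Skipping ${dir}: directory not found" >&2
    fi
}
```

A guard like this only avoids the crash; steering the model through extramodels.dat, as suggested above, removes the need for the tar step altogether.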
Thanks for explaining, I've removed this. Our current model is not in https://cms-project-generators.web.cern.ch/cms-project-generators/. Can we add it or can only people in the gen group make changes?
You should ask the GEN conveners.
Hello, as we discussed, this PR should go in the master branch as well.
This PR fixes condor on CMS Connect. The cmssw-cc7-condor-python27 singularity container is required to run, and must be activated from the top genproductions directory.