Processing of simulated data in Bender

Processing of simulated data in Bender is rather simple: one just needs to inherit the algorithm from the base class AlgoMC, which can be imported from the Bender.MainMC module.

from Bender.MainMC import *  # it imports also the whole content of Bender.Main module 
class MyAlg(AlgoMC):

The corresponding wrapper for the Selection-framework is BenderMCSelection.

Important notes: Simulation=True and DDDB/SIMCOND-tags

  • One needs to use Simulation=True flag for DaVinci-configurable
from Configurables import DaVinci
dv = DaVinci ( Simulation      = True           , ## <--- HERE!
               ...
               TupleFile       = 'MCtruth.root' )
  • It is very important to specify the correct DDDB/SIMCOND-tags for the simulated data. It is very easy to get efficiencies wrong by up to 30% if simulated data are processed with the wrong DDDB/SIMCOND-tags.
from Configurables import DaVinci
dv = DaVinci ( Simulation      = True                      , ##
               ...
               DDDBtag         = 'dddb-20130929-1'         , ## <--- HERE!
               CondDBtag       = 'sim-20130522-1-vc-mu100' , ## <--- HERE!
               ...
               TupleFile       = 'MCtruth.root' )
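
Since a swapped or mistyped tag is easy to overlook, a small sanity check before configuring DaVinci may help. The sketch below is plain Python and not part of Bender: the helper name check_mc_tags is hypothetical, and the prefixes 'dddb-' and 'sim-' are an assumption based on the example tags shown above.

```python
def check_mc_tags(dddbtag, conddbtag):
    """Return a list of complaints about the given tags (empty if none).

    Assumption: DDDB tags start with 'dddb-' and SIMCOND tags start
    with 'sim-', as the example tags in this section do.
    """
    problems = []
    if not dddbtag.startswith('dddb-'):
        problems.append("DDDBtag %r does not look like a DDDB tag" % dddbtag)
    if not conddbtag.startswith('sim-'):
        problems.append("CondDBtag %r does not look like a SIMCOND tag" % conddbtag)
    return problems


# the tags from the example above pass the check
ok = check_mc_tags('dddb-20130929-1', 'sim-20130522-1-vc-mu100')
# swapped tags produce one complaint per argument
swapped = check_mc_tags('sim-20130522-1-vc-mu100', 'dddb-20130929-1')
```

Such a check catches only swapped or malformed tags; it cannot tell whether a well-formed tag is the right one for the given sample.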

Correct DDDB/SIMCOND-tags can be retrieved in several ways:

  1. from the bookkeeping-DB for the given production type
{% challenge "Challenge (only for those who know how to do it)" %}
Do you know how to do it? If so, try this way.
  • Please use a timer for comparison.
{% endchallenge %}
  2. using the helper Bender scripts get-dbtags or get-metainfo for the given file
{% challenge "Challenge" %}
Try to use these scripts from the command line.
  • Start with get-dbtags -h and get-metainfo -h and follow the instructions.
{% solution "Solution" %}
<script src="https://gist.github.com/VanyaBelyaev/8e316f81caaccda69cb3b7ced2abd5d5.js"/></script>

{% endchallenge %}
  3. using the dirac-bookkeeping-decays-path script from LHCbDirac/prod for the given MC event type:

lb-run -c x86_64-slc6-gcc49-opt LHCbDirac/prod dirac-bookkeeping-decays-path 13104231

{% challenge "Challenge" %}
Try this command (do not forget to obtain a valid Grid proxy).

  • Is the output clear enough?
{% solution "Solution" %}
The output is a list of records. Each record consists of:
    1. The path in bookkeeping-DB
    2. DDDB-tag
    3. SIMCOND-tag
    4. Number of files
    5. Number of events
    6. Unique production ID, that could be used to get more detailed information
<script src="https://gist.github.com/VanyaBelyaev/8f057332459d03bd0ea040b05d124f53.js"/></script>
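
To see how such records might be used, the sketch below groups bookkeeping paths by their (DDDB, SIMCOND) tag pair. It is plain Python, not part of Bender or LHCbDirac: the Record type, its field names and the helper paths_by_tags are hypothetical, and the paths, counts and IDs are placeholders rather than real bookkeeping output.

```python
from collections import namedtuple

# one record with the six fields listed above
# (field names are hypothetical; the script itself prints plain text)
Record = namedtuple('Record', 'path dddbtag simcondtag nfiles nevents prod_id')


def paths_by_tags(records):
    """Group bookkeeping paths by their (DDDB, SIMCOND) tag pair."""
    groups = {}
    for r in records:
        groups.setdefault((r.dddbtag, r.simcondtag), []).append(r.path)
    return groups


# placeholder records: paths, counts and IDs are made up for illustration
records = [
    Record('/path/A', 'dddb-20130929-1', 'sim-20130522-1-vc-mu100', 10, 1000, 1),
    Record('/path/B', 'dddb-20130929-1', 'sim-20130522-1-vc-md100', 12, 1200, 2),
    Record('/path/C', 'dddb-20130929-1', 'sim-20130522-1-vc-mu100', 5, 500, 3),
]
groups = paths_by_tags(records)
```

If more than one tag pair shows up for the same event type, each path has to be processed with its own pair of tags.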

{% endchallenge %}
  4. for Ganga/Grid there is a way to use the function getBKInfo2/getBKInfo to obtain the information on the fly from the bookkeeping-DB and to propagate it to Bender using the params-argument of the configure-function. This way is built around (3).
{% discussion "In details,..." %}

template = JobTemplate(
    application = prepareBender (
        version = 'v31r0'   ,
        module  = my_module ,
        use_tmp = True      ) ,
    ...
    )
## query the bookkeeping-DB for all productions of the given event type
productions = getBKInfo2 ( 13104231 )
for entry in productions :
    print 'INFORMATION: %s' % entry
    path      = entry ['path'     ] ## "long path"
    dddbtag   = entry ['DDDBtag'  ]
    conddbtag = entry ['CondDBtag']
    year      = entry ['Year'     ]

    j = Job ( template )
    j.name      = ... ## construct name here
    j.inputdata = BKQuery ( path ).getDataset()
    ## these params are passed to the configure-function of the Bender module
    j.application.params = { 'DDDBtag' : dddbtag , 'CondDBtag' : conddbtag , 'Year' : year }
    j.submit()

where it is assumed that configure-function is instrumented properly to accept params and to propagate the tags further to DaVinci-configurable. The function getBKInfo2 comes from here:

<script src="https://gist.github.com/VanyaBelyaev/6a6ddd1ff87757ab322b2d6e23b7ede0.js"></script>

{% enddiscussion %}
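
On the Bender side, the configure-function then has to pick the tags out of params. A minimal sketch of that step, in plain Python, could look as follows; the helper davinci_kwargs is hypothetical and not part of Bender, and the keys 'DDDBtag' and 'CondDBtag' are assumed to be the ones set in j.application.params on the Ganga side.

```python
def davinci_kwargs(params=None):
    """Build keyword arguments for the DaVinci-configurable from params.

    Hypothetical helper, not part of Bender. Only the keys 'DDDBtag'
    and 'CondDBtag' are forwarded; anything else in params is ignored.
    """
    params = params or {}
    kwargs = {'Simulation': True}
    for key in ('DDDBtag', 'CondDBtag'):
        value = params.get(key)
        if value:  # forward only the tags that were actually provided
            kwargs[key] = value
    return kwargs


# example input, as it might arrive from the Ganga job ('Year' is illustrative)
kwargs = davinci_kwargs({'DDDBtag': 'dddb-20130929-1',
                         'CondDBtag': 'sim-20130522-1-vc-mu100',
                         'Year': '2012'})
```

Inside the configure-function the result could then be passed on as DaVinci ( TupleFile = 'MCtruth.root' , **kwargs ), together with the remaining options.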

Easy, safe and robust alternative :-)

In practice, none of the steps described above is really needed, since one can just instruct Bender to obtain the tags directly from the input files. In this recommended scenario, no DDDBtag/CondDBtag needs to be specified for the DaVinci-configurable, but one needs to activate the useDBtags=True flag for the setData-function:

dv = DaVinci ( Simulation      = True                      , 
               ...
               ## DDDBtag         = 'dddb-20130929-1'         , ## NOT NEEDED
               ## CondDBtag       = 'sim-20130522-1-vc-mu100' , ## NOT NEEDED
               ...
               TupleFile       = 'MCtruth.root' )
...
setData  ( inputdata , catalogs , castor , useDBtags = True ) ## <--- HERE!

This is probably the most robust, the safest and at the same time the most convenient way to treat DDDB/SIMCOND-tags for your application :-)

The price to pay: since internally it relies on the functionality provided by the get-dbtags script, it can take an additional O(1-2) minutes to open the first input file and to read the DDDB/SIMCOND-tags from it.