{% objectives "Learning objectives" %}
- Understand the overall structure of a Bender module and the configuration of the application
{% endobjectives %}
{% objectives "Learning objectives" %}
- Understand the overall structure of a Bender module using an oversimplified example
{% endobjectives %}
Any valid Bender module must have two essential parts:
- the function `run` with the predefined signature
- the function `configure` with the predefined signature

For the most trivial ("do-nothing") scenario, the function `run` is

    def run ( nEvents ) :
        # some fictive event loop
        for i in range( 0 , min( nEvents , 10 ) ) :
            print( ' I run event %i ' % i )
        return 0

In a similar way, the simplest "do-nothing" version of the `configure` function is

    def configure ( datafiles , catalogs = [] , castor = False , params = {} ) :
        print( 'I am configuration step!' )
        return 0

As one clearly sees, these lines do nothing useful, but they are enough to qualify as the first Bender code. Moreover, a python module with these two functions can already be submitted to Ganga/Grid, and Ganga will classify it as valid Bender code. Therefore this code is already "ready-for-Ganga/Grid"!

{% discussion "The details for the curious students: how Ganga/Grid treat Bender modules?" %}
Actually, at the remote node Ganga executes the following wrapper code:

    files    = ...  ## this one comes from DIRAC
    catalogs = ...  ## ditto
    params   = ...  ## extra parameters (if needed): this comes from the user
    nevents  = ...  ## it comes from Ganga configuration
    import USERMODULE  ## here it imports your module!
    USERMODULE.configure ( files , catalogs , params = params )
    USERMODULE.run ( nevents )

That's all! From this snippet you see:
- the code must have the structure of a python module, namely no executable lines should appear in the main body of the file (note the difference with respect to a script)
- it must have two functions, `run` and `configure` (everything else is not used)

{% enddiscussion %}

The whole module is here:
In practice, before submitting jobs to Ganga/Grid, the code needs to be tested using some test data.
This step is formally unnecessary but very important, and it can be easily embedded into your module using
python's `__main__` clause:

    if '__main__' == __name__ :
        print( 'This runs only if module is used as the script! ' )
        configure ( [] , catalogs = [] , params = {} )
        run ( 10 )

Note that these lines effectively convert the module into a script, and finally one gets:

<script src="https://gist.github.com/VanyaBelyaev/4d96dbfa8e94379b284ec7364365dde6.js"></script>

Running the script interactively is trivial:

    lb-run Bender/prod python DoNothing.py

That's all. Give it a try and see what you get!
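The wrapper shown in the discussion above can also be mimicked locally, without Ganga or the Grid. The following is only a minimal sketch in plain python: the helper names `check_bender_module` and `run_like_wrapper` are hypothetical and are not part of Bender or Ganga, and the module is built in memory instead of being imported from a file.

```python
import types

def check_bender_module(module):
    """Return the names of the required functions that the module lacks."""
    return [name for name in ('configure', 'run')
            if not callable(getattr(module, name, None))]

def run_like_wrapper(module, files, catalogs, params, nevents):
    """Call configure and then run, in the same order as the wrapper snippet."""
    missing = check_bender_module(module)
    if missing:
        raise AttributeError('missing functions: %s' % ', '.join(missing))
    module.configure(files, catalogs, params=params)
    return module.run(nevents)

# Hypothetical stand-in for the user module: a throw-away module object
demo = types.ModuleType('DoNothingDemo')

def _configure(datafiles, catalogs=[], castor=False, params={}):
    print('I am configuration step!')
    return 0

def _run(nEvents):
    for i in range(0, min(nEvents, 10)):
        print(' I run event %i ' % i)
    return 0

demo.configure = _configure
demo.run = _run

status = run_like_wrapper(demo, [], [], {}, 10)
```

A module that lacks either function is rejected by the check before anything is executed, which is the same requirement the wrapper imposes implicitly.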
{% discussion "Unnecessary but very useful decorations:" %}
It is highly desirable and recommended to put some "decorations" on top of these minimalistic lines:
- add the magic `#!/usr/bin/env python` line as the top line of the module/script
- make the script executable: `chmod +x ./DoNothing.py`
- add a python documentation string close to the beginning of the script
- fill some useful python attributes with the proper information
  - `__author__`
  - `__date__`
  - `__version__`
- do not forget to add documentation in Doxygen style and use the following tags in comments
  - `@file`
  - `@author`
  - ...

With all these decorations the complete module is here
{% enddiscussion %}
For all subsequent lessons we'll gradually extend this script
with additional functionality, converting it step by step
into something much more useful.
{% discussion "In practice, ..." %}
In practice, the prepared and ready-to-use function `run` is imported from the main Bender module `Bender.Main`,
and the only really important task for the user is to code the function `configure`.
{% enddiscussion %}
{% objectives "Learning objectives" %}
- Understand the internal structure of the configure function {% endobjectives %}
In the typical practical case, the function `configure` (as the name suggests) contains three parts:
- static configuration: the configuration of the `DaVinci` configurable (almost unavoidable)
- input data and application manager: define the input data and instantiate Gaudi's application manager (mandatory)
- dynamic configuration: the configuration of `GaudiPython` components (optional)
For the first part, the instantiation of the `DaVinci` configurable is an almost unavoidable step:

    from Configurables import DaVinci
    rootInTES = '/Event/PSIX'
    dv = DaVinci ( DataType  = '2012'    ,
                   InputType = 'MDST'    ,
                   RootInTES = rootInTES )

Here we prepare the application to read PSIX.MDST, a uDST with a few useful selections for the B&Q Working Group.
Note that in this part one can use the full power of DaVinci/Gaudi Configurables.
In practice, for physics analyses, it is very convenient to use the Selection framework here,
which allows one to configure DaVinci in a very compact, safe, robust and nicely readable way.
For example, let's get some selection from the Transient Store and print its content:

    from PhysConf.Selections import AutomaticData, PrintSelection
    particles = AutomaticData  ( 'Phys/SelPsi2KForPsiX/Particles' )
    ## wrap the selection, so that its content is printed
    particles = PrintSelection ( particles )

As the last sub-step of (1), one needs to pass the selection object to DaVinci:

    dv.UserAlgorithms.append ( particles )

{% discussion "Where is SelectionSequence ?" %}
The underlying SelectionSequence object will be created automatically.
You should not worry about it.
{% enddiscussion %}
The second part is rather trivial and almost always standard:

    from Bender.Main import setData, appMgr
    ## define input data
    setData ( inputdata , catalogs , castor )
    ## instantiate the application manager
    gaudi = appMgr()  ## NOTE THIS LINE!

While `setData` can appear anywhere inside the `configure` function,
the line with `appMgr()` is very special. After this line,
no static configuration can be used anymore. Therefore all the code dealing with
Configurables and Selections must be placed above this line.
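The ordering rule can be pictured with a toy analogy in plain python. This is not Gaudi code and it does not claim to reproduce how Gaudi works internally: `ToyConfigurable` and `ToyAppMgr` are hypothetical classes, and the only assumption is that the manager reads the settings once, at the moment it is created.

```python
class ToyConfigurable(object):
    """Hypothetical stand-in for a configurable: just a bag of settings."""
    def __init__(self, **settings):
        self.settings = dict(settings)

class ToyAppMgr(object):
    """Hypothetical stand-in for the manager: copies the settings once."""
    def __init__(self, configurable):
        self.frozen = dict(configurable.settings)

conf = ToyConfigurable(DataType='2012')
conf.settings['InputType'] = 'MDST'         # static configuration: seen by the manager
app = ToyAppMgr(conf)                       # plays the role of the appMgr() line
conf.settings['RootInTES'] = '/Event/PSIX'  # too late: the manager never sees this

print(sorted(app.frozen))
```

In the toy, the setting made after the manager is created is silently lost, which is the kind of mistake that placing all static configuration above `appMgr()` avoids.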
The third part, dynamic configuration, is not used in this particular example, but it will be discussed further in conjunction with other lessons.
The complete configure function is:
The prepared and ready-to-use function `run` is imported from `Bender.Main`:

    from Bender.Main import run

Now our Bender module (well, it is actually pure DaVinci, no real Bender here!) is ready to be used with Ganga/Grid.
For local interactive tests we can again use the trick with the `__main__` clause.
In our case it contains some input data for local tests:

    if __name__ == '__main__' :
        inputdata = [
            '/lhcb/LHCb/Collision12/PSIX.MDST/00035290/0000/00035290_00000221_1.psix.mdst' ,
            '/lhcb/LHCb/Collision12/PSIX.MDST/00035290/0000/00035290_00000282_1.psix.mdst' ]
        configure( inputdata , castor = True )
        ## the event loop
        run(10000)

The complete module can be accessed here
Again, the answer is trivial (and universal):

    lb-run Bender/prod python DoNothing.py

That's all. Give it a try and see what you get!
{% challenge "Challenge" %}
Try to convert any of your existing simple DaVinci scripts into a Bender module and run it interactively.
You can use the result of this exercise for subsequent lessons.
{% endchallenge %}
{% discussion "What is castor ? Why LFN is used as input file name?" %}
Bender is smart enough, and in many cases it can efficiently convert an input LFN into
the real file name.
- First, if you have a Grid proxy enabled (`lhcb-proxy-init`), it internally uses `LHCbDirac` to locate and access the file. This way is not very fast, but in practice the look-up is almost always successful. In some cases, however, certain hints can be very useful. In particular, you can specify the list of Grid sites to look for data files:

      ## define input data
      setData ( inputdata , catalogs , castor = castor , grid = ['RAL','CERN','GRIDKA'] )

- Second, for CERN, one can use the option `castor = True`, which activates a local look-up of input files in the CERN-CASTOR and CERN-EOS storage (`root://castorlhcb.cern.ch` and `root://eoslhcb.cern.ch`). This look-up is much faster than the first option, but success is not guaranteed, since not all files have replicas at CERN.
- Third, for access to special locations, e.g. some local files, Bender also tries to look into the directories specified via the environment variable `BENDERDATAPATH` (colon-separated list of paths), and also tries to construct the file names using the content of the environment variable `BENDERDATAPREFIX` (semicolon-separated list of prefixes used to construct the final file name). The file name is constructed using all `(n+1)*(m+1)` variants, where `n` is the number of items in `BENDERDATAPATH` and `m` is the number of items in `BENDERDATAPREFIX`. Using the combination of the `BENDERDATAPATH` and `BENDERDATAPREFIX` variables one can achieve very powerful matching of short file names (e.g. LFNs) to the actual files. With these variables one can easily get local and efficient access to Grid files from some nearby Tier-1/2 center.
{% enddiscussion %}
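The `(n+1)*(m+1)` counting for `BENDERDATAPATH` and `BENDERDATAPREFIX` can be made concrete with a small sketch. This is not Bender's actual implementation: the function `candidate_names` is hypothetical, the "+1" is assumed to stand for the "no path" and "no prefix" options, the order in which prefix, path and name are glued together is a guess, and the directories `/data/a` and `/data/b` are made up.

```python
from itertools import product

def candidate_names(name, data_path='', data_prefix=''):
    """Enumerate candidate file names for a short name such as an LFN.

    data_path   : colon-separated list of paths       (like BENDERDATAPATH)
    data_prefix : semicolon-separated list of prefixes (like BENDERDATAPREFIX)
    The empty string is added to both lists, hence (n+1)*(m+1) candidates.
    """
    paths = [''] + [p for p in data_path.split(':') if p]
    prefixes = [''] + [p for p in data_prefix.split(';') if p]
    # ASSUMPTION: candidates are formed as prefix + path + name
    return [prefix + path + name for prefix, path in product(prefixes, paths)]

lfn = '/lhcb/LHCb/Collision12/PSIX.MDST/00035290/0000/00035290_00000221_1.psix.mdst'
names = candidate_names(lfn,
                        data_path='/data/a:/data/b',            # n = 2, made-up paths
                        data_prefix='root://eoslhcb.cern.ch/')  # m = 1
for candidate in names:
    print(candidate)
```

With two paths and one prefix this gives `(2+1)*(1+1) = 6` candidates. Note that prefixes such as `root://...` contain colons themselves, which is consistent with the prefix list being separated by semicolons.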
{% keypoints "Keypoints" %}
With these two examples, you should already be able to
- code valid (but useless) Bender modules
- run them interactively
{% endkeypoints %}