Running the correction
Always enter AliRoot first, and then the virtual environment.
To run the correction software, inside the tpcwithdnn/ directory type:
python steer_analysis.py
If you see some warnings like:
cling::DynamicLibraryManager::loadLibrary(): libGpad.so.6.16: cannot open shared object file: No such file or directory
they can be safely ignored.
The full correction consists of 3 stages:
- prediction of global distortion fluctuations with Boosted Decision Trees (XGBoost library)
- correction of the distortion fluctuations predicted by BDT
- prediction of residual (local) distortion fluctuations with U-Net - a deep convolutional neural network
Specific stages can be (de-)activated with the active flag under each category in config_model_parameters.yml.
NOTE: What is called an 'event' in the code and in these instructions is not a single collision, but a 'snapshot' - a set of measurements (3D maps) for a given time point. Such a single measurement actually reflects the overlap of many collisions.
You can define which steps of the analysis you want to run in default.yml. Multiple steps can be active:
- dotrain - train model from zero
- doapply - use the trained model to make predictions
- doplot - create quality assurance plots of prediction results
- dobayes - perform Bayesian optimization to find the best model configuration (currently implemented only for BDT)
- doprofile - compare models trained with different numbers of events
The remaining options are for the ND Validation.
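As a rough illustration of how several steps can be active at once, the sketch below uses a plain Python dictionary in place of the parsed default.yml. Only the flag names come from the list above. The flat layout and the chosen values are assumptions for illustration, and the actual file may nest these flags differently.

```python
# Hypothetical stand-in for the parsed contents of default.yml.
# Only the flag names come from the documentation; the flat layout is assumed.
steps = {
    "dotrain": True,
    "doapply": True,
    "doplot": True,
    "dobayes": False,
    "doprofile": False,
}

# Collect the steps that are switched on, in the order they are listed.
active_steps = [name for name, enabled in steps.items() if enabled]
print("Active steps:", ", ".join(active_steps))
```

With these example values, training, prediction and plotting would run in a single invocation, while Bayesian optimization and profiling would be skipped.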
The parameters of the ML analysis can be configured in config_model_parameters.yml. The most often used arguments:
- dirmodel, dirapply, dirplots - directories where the trained models, prediction results and plots should be saved (the paths can be relative)
- dirinput_bias, dirinput_nobias - paths to directories where the biased / unbiased input datasets are stored
- grid_phi, grid_r, grid_z - grid granularity, usually 90x17x17 or 180x33x33
- z_range - only distortions with z_min <= z < z_max will be processed by the algorithm
- opt_predout - the direction of distortions (r, rphi, z) to correct; currently only one direction can be processed at a time
- train_events, validation_events, apply_events - number of events for train / validation / apply, specified separately for BDT and NN. You can specify multiple numbers, but the lists of values for train / validation / apply must be of equal length. The program will then run once for each triple. If doprofile is specified, the program will output plots with prediction results (mean, std dev., mean + std dev.) gathered for each triple.
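The pairing of train_events, validation_events and apply_events into triples can be sketched as follows. This is a minimal illustration and not the code used by the software: the helper name event_triples and the event counts are made up for the example.

```python
def event_triples(train_events, validation_events, apply_events):
    """Pair the i-th entries of the three lists into (train, validation, apply)."""
    lengths = {len(train_events), len(validation_events), len(apply_events)}
    if len(lengths) != 1:
        raise ValueError("train / validation / apply lists must have equal length")
    return list(zip(train_events, validation_events, apply_events))


# Example values are invented for illustration only.
triples = event_triples([1000, 5000], [100, 500], [200, 700])
for n_train, n_val, n_apply in triples:
    print(f"run with train={n_train}, validation={n_val}, apply={n_apply}")
```

Two values per list give two runs, one per triple. Lists of different lengths are rejected up front instead of being silently truncated.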
Currently, a random forest regressor (XGBRFRegressor) is used. The default configuration uses the approximate 'hist' tree method, the fastest available in XGBoost.
- downsample - whether to use downsampling
- downsample_npoints - number of voxels to downsample
- plot_train, train_npoints - whether to plot the learning curve and with how many points
The remaining parameters, under the params section, come from the XGBoost Scikit-Learn API. Their meaning is described on the XGBoost page.
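To give a feel for the downsample and downsample_npoints options, the sketch below picks a random subset of voxel indices from one grid. How the software actually selects voxels is not stated here. Uniform sampling without replacement, the helper name downsample_voxels and the value 1000 are all assumptions made for this example.

```python
import random


def downsample_voxels(n_voxels, downsample_npoints, seed=None):
    """Return sorted indices of a random voxel subset (sampling without replacement)."""
    if downsample_npoints > n_voxels:
        raise ValueError("cannot pick more voxels than the grid contains")
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_voxels), downsample_npoints))


# A 90x17x17 grid (phi x r x z) contains 26010 voxels.
n_voxels = 90 * 17 * 17
selected = downsample_voxels(n_voxels, downsample_npoints=1000, seed=0)
print(len(selected), "of", n_voxels, "voxels kept")
```

Fixing the seed makes the selection reproducible between runs, which is convenient when comparing models trained on the same subset.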
- filters - number of channels (filters) in the first convolutional block (the 3rd dimension of a 3D convolution)
- pooling - type of pooling function: max - max pooling, avg - average pooling, conv - not an actual pooling but 3D convolution
- depth - depth of the network = number of convolutional blocks / levels
- batch normalization - whether to use batch normalization
- dropout - fraction of dropout
- batch_size - size of a batch
- shuffle - whether to shuffle
- epochs - number of epochs
- lossfun - loss function
- metrics - metrics, values measured besides the loss function that do not affect training
- adamlr - learning rate for Adam optimizer
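The interplay of filters and depth can be illustrated with a small calculation. The sketch assumes the common U-Net convention that the number of channels doubles at each level. Whether this implementation follows that convention is not stated here, and the helper name filters_per_level is hypothetical.

```python
def filters_per_level(filters, depth):
    """Channels in each level, assuming the count doubles at every level."""
    return [filters * 2 ** level for level in range(depth)]


# Example values chosen for illustration only.
print(filters_per_level(filters=4, depth=4))
```

Under this assumption, raising depth by one doubles the channel count of the deepest level, so memory use grows quickly with both parameters.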
ND validation parameters are explained in Validation.
Some parameters are available for a quick setup on the command line. You can check them with:
python steer_analysis.py -h
Output
- debug: on the console
- models: dirmodel
  - XGBoost: JSON
  - U-Net: JSON, network weights: h5
- predictions: dirval
  - a single ROOT file with histograms
- prediction and profile plots: dirplot
- indices of events in train / validation / apply partitions: dirmodel
  - the indices are picked up by the ND validator if any of these partitions is chosen for the ND validation
dirmodel, dirval, dirplot are taken from config_model_parameters.yml.