Validation

ehellbar edited this page Jul 9, 2021 · 16 revisions

Creating the validation data, histograms and pdf maps

Set the appropriate options in default.yml and config_model_parameters.yml.

In default.yml:

  • docreatevaldata: whether to produce a ROOT file with the full validation data, as described on the Validation Data Format page; the resulting file is required by the other options below
  • docreatepdfmaps: whether to create ND histograms (as *.gzip files) and pdf maps (*.root) for all validation data (currently: mean maps with ids 0, 9 and 18)
  • docreatepdfmapforvariable: whether to create ND histograms and pdf maps only for the data specified in config_model_parameters.yml
  • domergepdfmaps: whether to merge the pdf maps for different mean maps and factors into one file
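
The dependency between these flags can be checked before starting a run. The following is a minimal sketch, assuming the flags have already been loaded into a flat Python dict of booleans; the actual nesting inside default.yml may differ, and the helper name flags_needing_existing_valdata is hypothetical, not part of the repository.

```python
# Flags from default.yml that rely on the validation data file.
DEPENDENT_FLAGS = ("docreatepdfmaps", "docreatepdfmapforvariable", "domergepdfmaps")


def flags_needing_existing_valdata(config):
    """Return enabled flags that need a validation file from an earlier run.

    If docreatevaldata is enabled, the file is produced in the same run,
    so nothing is reported.
    """
    if config.get("docreatevaldata", False):
        return []
    return [flag for flag in DEPENDENT_FLAGS if config.get(flag, False)]


# Hypothetical flat dict standing in for the parsed default.yml.
example = {"docreatevaldata": False, "docreatepdfmaps": True, "domergepdfmaps": False}
print(flags_needing_existing_valdata(example))
```

A non-empty result means the listed options only work if the validation ROOT file from a previous run is already present in the output directory.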

In config_model_parameters.yml:

  • diroutflattree: where to save validation data and pdf maps
  • dirouthistograms: where to save validation histograms
  • validate_model: whether to also evaluate the predictions of the trained model; this option and a trained model are both required to produce ND histograms and pdf maps
  • pdf_map_var, pdf_map_mean_id: ND histograms and pdf maps will be created for the data with this map factor and mean map id when docreatepdfmapforvariable is enabled in default.yml

Plotting and browsing pdf maps and validation trees

The easiest way to examine the result files is to use the interactive Jupyter notebook available here.

Alternatively, you can draw plots manually with ROOT from the *.root pdf map files.

Running the notebook in aliceml via ssh

Enter the notebooks directory:

cd notebooks/

Launch Jupyter without a browser (you will browse from your local machine later). It prints a URL with a token; copy and keep it for the next step.

python -m notebook --no-browser --port=8887 # Or any other reasonable port number

On your local machine, forward local port 8888 to the notebook port on aliceml (the value passed to --port above, 8887 in the example):

ssh -N -L localhost:8888:localhost:<aliceml_port_number> <your_name>@aliceml

You can then browse the notebooks at the URL printed by Jupyter; just change the port number to 8888.
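
The port swap can also be done programmatically. This is a small sketch using only the Python standard library; the function name with_local_port and the token in the example URL are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def with_local_port(url, local_port=8888):
    """Return the URL with its port replaced by the local tunnel port."""
    parts = urlsplit(url)
    host = parts.hostname or "localhost"
    # Rebuild the network location with the same host and the new port;
    # path and query (including the token) are kept unchanged.
    return urlunsplit(parts._replace(netloc=f"{host}:{local_port}"))


# Example URL in the form Jupyter prints; the token here is a placeholder.
remote_url = "http://localhost:8887/?token=abc123"
print(with_local_port(remote_url))
```

The token in the query string must stay intact, which is why only the network location is rewritten.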

Examining pdf maps and validation trees with Jupyter

NOTE: This is not yet merged; you need to pull the code from this branch.

First, prepare a list of the pdf maps / validation trees to be used in the notebook.

For pdf maps, adjust the makePDFMapsList() function in the script notebooks/makePDFMapsLists.sh. Then, run:

source makePDFMapsLists.sh
makePDFMapsList

This should produce a new file, pdfmaps.list, containing the paths to the relevant pdf map files.
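
Before opening the notebook, it can help to confirm that every listed file exists. The sketch below assumes that pdfmaps.list holds one path per line, which is not confirmed by this page; missing_paths is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def missing_paths(list_file):
    """Return the entries of a list file that do not exist on disk.

    Assumes one path per line; blank lines are ignored.
    """
    lines = Path(list_file).read_text().splitlines()
    entries = [line.strip() for line in lines if line.strip()]
    return [entry for entry in entries if not Path(entry).exists()]


# Only run the check if the list file is present in the current directory.
if Path("pdfmaps.list").exists():
    for path in missing_paths("pdfmaps.list"):
        print("missing:", path)
```

If any path is reported as missing, adjust makePDFMapsList() and regenerate the list.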

For input data validation, adjust the makeValTreesList() function in the same script and run:

source makePDFMapsLists.sh
makeValTreesList

Once Jupyter is running according to the instructions above, you should see the contents of the notebooks/ directory in the browser. For pdf maps / model validation, open model_performance_evaluation.ipynb. For input data validation, open input_data_validation.ipynb.

Adjust the directory variables at the top of the notebook. Then follow the rest of the code, adjusting any file paths as needed. If no data matches, you might need to adjust the cuts.

CAUTION: Validation input can be very large. Keep the default selection, and run only the cells for the plots of interest rather than the whole notebook at once.