|  | 
| 1 |  | -In 2022, LHCb has released the first 200 terabytes of the data via CERN OpenData Portal, making it available to the public. | 
|  | 1 | +By the end of 2023, LHCb released all of its Run I data, via CERN Open Data Portal, to the general public. The data comes in `.DST` and `.MDST` format which is the same format used by LHCb internally.   | 
| 2 | 2 |  | 
| 3 |  | -- The data comes in `.DST` and `.MDST` format, the same format is used by LHCb internally. | 
| 4 | 3 | - Every data set released is narrated by an "Open Data Record" accessible through Open Data Portal. | 
| 5 |  | -- Open Data Records contain various bits of information about the selected data set (this is called metadata). An example of the types of metadata provided in the record is: | 
| 6 |  | -  - Number of events in the dataset | 
| 7 |  | -  - Number of files in the dataset | 
| 8 |  | -  - Combined size in TB of the dataset | 
| 9 |  | -  - Production ID | 
| 10 |  | -  - Production Type | 
| 11 |  | -  - Detector conditions (condb, dddb tags) | 
| 12 |  | -  - List of Trigger Configuration Keys (TCKs) | 
| 13 |  | -  - Scripts used for each production step | 
| 14 |  | -  - List of Logical File Names (LFNs) on [LHCb DIRAC](https://lhcb-dirac.readthedocs.io/en/latest/). | 
| 15 |  | - | 
| 16 |  | -The metadata provided should help the user to navigate, select and work with with LHCb Open Data. | 
| 17 |  | - | 
| 18 |  | -Index of files is accessible both via a GUI or as a machine readable file. | 
| 19 |  | - | 
| 20 |  | -- Some instructions on how to use open data are pointed out in the records themselves. | 
| 21 |  | -- As well as the data records, an extensive list of LHCb stripping lines and their descriptions is provided as well. | 
| 22 |  | -- After selecting the desired stream, a stripping line description can be followed to obtain a number of cuts/conditions which could be used to filter the data further. | 
| 23 |  | -- Data can be accessed directly (eg. using [xrootd](https://xrootd.slac.stanford.edu/) protocol) or downloaded locally. | 
| 24 |  | -- It is suggested to further filter and categorize the data by writing out smaller data files in `.root` format (called ntuples). | 
| 25 |  | -- This is done in LHCb with the help of software called [DaVinci](https://lhcbdoc.web.cern.ch/lhcbdoc/davinci/). | 
| 26 |  | -- 'DaVinci' and other LHCb Software is available through [CVMFS](https://cernvm.cern.ch/fs/). | 
| 27 |  | -- Some initial instructions on working with DaVinci are provided in [LHCb Starterkit](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/minimal-dv-job.html) web page. | 
|  | 4 | +- Open Data Records contain various bits of information about the selected data set (this is called metadata). An example of the types of metadata provided in the record is:   | 
|  | 5 | +    - Number of events in the dataset | 
|  | 6 | +    - Number of files in the dataset | 
|  | 7 | +    - Combined size in TB of the dataset | 
|  | 8 | +    - Production ID | 
|  | 9 | +    - Production Type | 
|  | 10 | +    - Detector conditions (condb, dddb tags) | 
|  | 11 | +    - List of Trigger Configuration Keys (TCKs) | 
|  | 12 | +    - Scripts used for each production step | 
|  | 13 | +    - List of Logical File Names (LFNs) on [LHCb DIRAC](https://lhcb-dirac.readthedocs.io/en/latest/). | 
|  | 14 | + | 
|  | 15 | +The [LHCb Open Data Guide](https://lhcb-opendata-guide.web.cern.ch/) and the metadata provided on the Open Data Portal should help the user to navigate, select, and work with LHCb Open Data.   | 