Output Files
The TDD outputs Hierarchical Data Format (HDF) files, which have a `.h5` extension.
Each output file can contain multiple datasets, which store data in fixed-size arrays.
For example, the `jets` dataset stores, for each dumped jet, a 1-d array of variables about the jet (e.g. `pt`, `eta`, etc).
Meanwhile, the `tracks` dataset contains, for each jet, a 2-d array, with the first index selecting a track in the jet (up to 40 tracks are stored by default) and the second index selecting a track variable (e.g. `numberOfPixelHits`, etc).
h5 datasets must have a fixed shape, so jets with fewer than 40 tracks are padded with null tracks, for which the default values of the track variables are used. The datasets and variables present in your output files depend on the configuration of the jobs, as discussed elsewhere.
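The layout described above can be illustrated with a minimal sketch using `h5py` and NumPy structured arrays. It builds a small stand-in file in memory rather than reading a real TDD output, so the variable lists, values, and the file name `demo.h5` are assumptions made for illustration. Real outputs contain many more variables, and their exact types depend on the job configuration.

```python
import h5py
import numpy as np

# Reduced, illustrative variable lists (real files contain many more fields).
jet_dtype = np.dtype([("pt", "f4"), ("eta", "f4")])
track_dtype = np.dtype([("numberOfPixelHits", "u1"), ("valid", "?")])

n_jets, max_tracks = 3, 40

# One record per jet.
jets = np.zeros(n_jets, dtype=jet_dtype)
jets["pt"] = [25e3, 40e3, 90e3]
jets["eta"] = [0.1, -1.2, 2.0]

# One row of 40 track slots per jet; unused slots stay at their padded values.
tracks = np.zeros((n_jets, max_tracks), dtype=track_dtype)
n_real = [2, 0, 5]  # made-up number of real tracks in each jet
for i, n in enumerate(n_real):
    tracks["valid"][i, :n] = True
    tracks["numberOfPixelHits"][i, :n] = 4

# Use an in-memory file so that nothing is written to disk.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("jets", data=jets)
    f.create_dataset("tracks", data=tracks)

    dataset_names = sorted(f.keys())
    jets_shape = f["jets"].shape
    tracks_shape = f["tracks"].shape
    jet_variables = f["jets"].dtype.names
    read_tracks = f["tracks"][:]  # load the full dataset into memory

# Count the non-padded tracks in each jet.
tracks_per_jet = read_tracks["valid"].sum(axis=1)
print(dataset_names, jets_shape, tracks_shape, jet_variables, tracks_per_jet)
```

For a real output file, the same inspection calls (`keys()`, `.shape`, `.dtype.names`) should apply after opening the file in read mode with `h5py.File(path, "r")`.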
Useful Tools
There are a few commonly used tools for working with the output `.h5` files.
Tool | Description |
---|---|
Umami | The main Python framework used for further processing of output h5 files and algorithm training |
Puma | The main FTAG plotting framework, also used by umami |
`h5ls` | Lists the contents of an h5 file (as mentioned in the installation section) |
`h5diff` | Highlights differences between two h5 files - useful for validating a new output against some reference |
`h5py` | A Python package for working with h5 files, used by downstream packages like umami |
`tdd-scripts` | A few Python functions for reading the tdd h5 outputs |
`h5-batched-read` | Python functions for reading in batches from h5 files at full precision |
Default Values
Jet-level defaults are specified as arguments to `add_btag_fillers()` calls in `BTagJetWriterUtils.cxx`.
Variable type | Default Value |
---|---|
Char | -1 |
Int | -1 |
Float | NaN |
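When analysing jet variables downstream, these defaults usually need to be handled explicitly. The following is a minimal sketch using NumPy. The variable names `float_var`, `int_var` and `char_var` are hypothetical placeholders rather than real TDD variables, and the array is built by hand instead of being read from a file.

```python
import numpy as np

# Hypothetical jet variables, one of each type from the table above.
jet_dtype = np.dtype([("float_var", "f4"), ("int_var", "i4"), ("char_var", "i1")])

# The second jet carries the default value in every variable.
jets = np.array(
    [(1.5, 3, 1), (np.nan, -1, -1), (0.2, -1, 0)],
    dtype=jet_dtype,
)

# Float defaults are NaN, which never compares equal to anything,
# so they have to be detected with np.isnan.
float_is_default = np.isnan(jets["float_var"])

# Integer defaults can be found with a plain comparison.
int_is_default = jets["int_var"] == -1

# NaN propagates through ordinary reductions, so use the NaN-aware versions.
plain_mean = jets["float_var"].mean()
mean_ignoring_defaults = np.nanmean(jets["float_var"])
print(float_is_default, int_is_default, plain_mean, mean_ignoring_defaults)
```

An equality test against -1 cannot distinguish a default from a variable whose genuine value is -1, so whether this check is safe depends on the variable in question.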
Track-level defaults are specified as arguments to `add_track_fillers()` calls in `BTagTrackWriter.cxx`. To select non-padded tracks, use the `valid` flag, which is `True` when the track is present.
Variable type | Default Value |
---|---|
Unsigned Char | 0 |
Int | -1 |
Float | NaN |
Bool | False |
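Because the unsigned char default of 0 can also be a legitimate value (for example, a track with no pixel hits), padded tracks are best identified with the `valid` flag rather than by comparing against the defaults. The sketch below shows one way to do this with NumPy boolean masks. The array is constructed by hand, and `float_var` is a hypothetical placeholder for any float track variable.

```python
import numpy as np

track_dtype = np.dtype(
    [("valid", "?"), ("numberOfPixelHits", "u1"), ("float_var", "f4")]
)

n_jets, max_tracks = 2, 40

# Start from fully padded jets: valid=False, unsigned char=0, float=NaN.
tracks = np.zeros((n_jets, max_tracks), dtype=track_dtype)
tracks["float_var"] = np.nan

# The first jet has three real tracks, one of them with zero pixel hits.
tracks["valid"][0, :3] = True
tracks["numberOfPixelHits"][0, :3] = [4, 3, 0]
tracks["float_var"][0, :3] = [0.1, 0.2, 0.3]

# The second jet has a single real track.
tracks["valid"][1, :1] = True
tracks["numberOfPixelHits"][1, :1] = [2]
tracks["float_var"][1, :1] = [0.5]

valid = tracks["valid"]

# Number of real tracks in each jet.
n_tracks = valid.sum(axis=1)

# All real tracks from all jets, flattened into a 1-d array.
real_tracks = tracks[valid]

# Real tracks of the first jet only, keeping the genuine zero-hit track.
pixel_hits_jet0 = tracks["numberOfPixelHits"][0][valid[0]]
print(n_tracks, real_tracks.shape, pixel_hits_jet0)
```

Masking on `numberOfPixelHits != 0` instead would wrongly discard the third track of the first jet, which is why the `valid` flag is the safer selection.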