
Installation#

AthAnalysis Installation#

This section describes how to install the AthAnalysis-based dataset dumper. AthAnalysis is a lightweight "analysis" framework based on Gaudi. Setting up with AthAnalysis is the recommended workflow in most cases. You can also set up with full Athena as described below.

The training dataset dumper can be installed either locally or be run by using a Docker image. Both options are outlined below.

Local installation#

First, retrieve the project by cloning the git repository. If you plan to make any changes to the repository, you should instead fork the repository and clone your fork. You can find out more in the contributing guidelines.

git clone ssh://git@gitlab.cern.ch:7999/atlas-flavor-tagging-tools/training-dataset-dumper.git
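If you forked the project, clone your fork instead. A sketch, with a placeholder namespace you should replace with your own:

# replace <your-cern-username> with your GitLab namespace
git clone ssh://git@gitlab.cern.ch:7999/<your-cern-username>/training-dataset-dumper.git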

Then, install the project locally by setting up a release and compiling the code.

source training-dataset-dumper/setup/athanalysis.sh
mkdir build
cd build
cmake ../training-dataset-dumper
make
cd ..
!!!ERROR!!! No matched release is found

If you see this error alongside the message

Please run a el9 container (eg setupATLAS -c el9) and run the asetup command inside to use the above release

then you are using an unsupported Linux distribution. To fix this, set up a container by first running setupATLAS -c el9, and then re-run the setup script inside the container.
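For example, the sequence could look like this:

setupATLAS -c el9
# inside the container, repeat the setup
source training-dataset-dumper/setup/athanalysis.sh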


What to do if the setup script crashes

If a setup script (e.g. setup/athanalysis.sh) crashes, you should first report this by opening an issue on GitLab. The script may have crashed because you used an unsupported shell. The main supported shell is bash, though there is also experimental support for zsh. If you are using a different shell and experience problems, either open a bash shell for working with the dumper, or set up an analysis release by hand.

To do the latter, first find the recommended analysis release version in the .gitlab-ci.yml file, in the line with BUILD_IMAGE: $DOCKER_CACHE/atlas/athanalysis:{analysis_release_version}. You can then manually set up a release by entering

# use the analysis release version which is documented in
# the .gitlab-ci.yml file and not the dummy example below
ANALYSIS_RELEASE=22.2.95
setupATLAS
asetup AthAnalysis,${ANALYSIS_RELEASE}

As the final step, source the following file to add the compiled executables to your path:

source build/x*/setup.sh

For convenience, all the above commands (aside from setting up a release) are packaged into a build.sh script, which you can run after setting up a release with

source ./training-dataset-dumper/setup/build.sh

This script will set up a fresh build/ directory in the directory above training-dataset-dumper/. Similarly, rebuild.sh can be used to rebuild the code without setting up a fresh directory (assuming you have already built the code at some point in the past).
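For example, assuming rebuild.sh lives alongside build.sh in the setup/ directory, a later rebuild would look like

source ./training-dataset-dumper/setup/rebuild.sh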

Running a test#

The package includes a test script, which will download and process a small test sample. To run a test, use

test-dumper ca
Fixing dbm.error: be sure to run the installation in a clean shell if you have conda installed.

It is advised to run the installation in a fully clean shell with no environment active or PYTHONPATH set. In the case of an active conda environment this can be achieved with

conda deactivate
unset PYTHONPATH

If you don't do this, you may see the following error:

dbm.error: db type is dbm.gnu, but the module is not available

Note that the setup scripts will attempt to deactivate conda for you.

This script runs the program in FTagDumper/src/JetDumperAlg.cxx, which dumps some xAOD information to HDF5. The script takes a mandatory argument specifying which configuration file to use for the test job. By default the output from test-dumper is stored in a random directory under /tmp, but this can be configured: see

test-dumper -h

for options. You can inspect the contents of the output file with

h5ls path/to/output.h5

Again, see -h for more options. Also see the h5ls tab-complete script for bash users.
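For example, to recurse through all groups in the file and print verbose information about each dataset (standard h5ls flags):

h5ls -r -v path/to/output.h5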

Issues when building or running the code

The first thing to try is completely removing your build/ directory and setting everything up from scratch in a fresh shell.
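In other words, something along these lines, using the same commands as above:

rm -rf build
source training-dataset-dumper/setup/athanalysis.sh
mkdir build
cd build
cmake ../training-dataset-dumper
make
cd ..
source build/x*/setup.sh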

Restoring the setup#

The next time you want to use the utility, run the following from the project directory:

source training-dataset-dumper/setup/athanalysis.sh
source build/x*/setup.sh

Docker containers#

You can run the training dataset dumper in a Docker container. This is a convenient way to run the code if you don't have access to /cvmfs/.

Complete images are created automatically from the main branch and updated for every modification using continuous integration. Note that you need to specify the main tag to run the training-dataset-dumper for release 22:

gitlab-registry.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper:main
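You can pull the image ahead of time with the standard docker command:

docker pull gitlab-registry.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper:main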

Developing in containers

The docker image contains a static version of the dataset dumper code. If you want to actively develop the code, the recommended approach is to check out a local version of the project (but not install it), then start up a docker container with the local directory mounted. This provides you with an ATLAS release which contains the dependencies the code needs to build and run.

Example:

# get code
git clone ssh://git@gitlab.cern.ch:7999/atlas-flavor-tagging-tools/training-dataset-dumper.git
# start docker container and mount current directory inside container
docker run --rm -it -v $PWD:/home/workdir --workdir /home/workdir gitlab-registry.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper:main
# compile code: no need to source a setup script with "asetup" inside of a docker container
mkdir build
cd build
cmake ../training-dataset-dumper
make
cd ..
# add executables to system path
source build/x*/setup.sh

You aren't required to build the dataset dumper in the above image: any relatively recent AthAnalysis image in release 22 will accomplish the same thing.
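For example, a sketch using a generic AthAnalysis image; the registry path and tag below are illustrative assumptions, not a pinned recommendation:

# any relatively recent release 22 AthAnalysis tag should work
docker run --rm -it -v $PWD:/home/workdir --workdir /home/workdir gitlab-registry.cern.ch/atlas/athanalysis:22.2.95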

Launching containers using Docker (local machine)

If you work on a local machine with Docker installed, you can run the training dataset dumper with this command:

docker run --rm -it gitlab-registry.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper:main

You can mount local directories with the -v argument:

docker run --rm -it -v $PWD:/home/workdir --workdir /home/workdir gitlab-registry.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper:main

Launching containers using Singularity (lxplus/institute cluster)

If you work on a node of your institute's computing centre or on CERN's lxplus, you don't have access to Docker. Instead, you can use Singularity, which provides similar features.

You can run the training dataset dumper in singularity with the following command:

singularity --silent run docker://gitlab-registry.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper:main

You can mount local directories with the -B argument:

singularity --silent run -B /cvmfs:/cvmfs -B /afs:/afs -B $PWD:/home/workdir docker://gitlab-registry.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper:main

Athena Installation#

Some advanced functions of this package require you to set up with full Athena, the full reconstruction framework used within ATLAS. The lighter AthAnalysis framework does not support lower-level reconstruction. For an exact list of which packages are included in every base project, see the package_filters.txt files under the Projects directory in Athena.

Make sure to set up a completely fresh environment when changing releases! You should start with a fresh shell and completely delete the build/ directory.

To install with Athena, use an Athena-based setup script. So instead of

source training-dataset-dumper/setup/athanalysis.sh

you would instead use

source training-dataset-dumper/setup/athena.sh

Then follow the rest of the instructions above.
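Putting it together, an Athena-based setup could look like this (the same commands as in the AthAnalysis instructions):

source training-dataset-dumper/setup/athena.sh
source ./training-dataset-dumper/setup/build.sh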

Docker images are not available for Athena releases.

Working with nightlies#

If you're on the bleeding edge of ATLAS software, there may not be a stable release that supports a new feature you need. The scripts setup/athena-latest.sh and setup/analysisbase-latest.sh will set up the latest nightly build.
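For example, to build against the latest nightly of full Athena:

source training-dataset-dumper/setup/athena-latest.sh
source ./training-dataset-dumper/setup/build.sh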

Working with nightlies comes with a few caveats:

  • We make no promises that the code will work with them! Our continuous integration tests all updates in the tagged releases for both AnalysisBase- and AthAnalysis-based projects. This is not extended to nightlies.
  • Nightlies will disappear after a few weeks by default. Because of this, producing larger datasets based on a nightly is strongly discouraged: no one will be able to reproduce your work when the nightly is deleted. If you have to do this, you should open a nightly preservation request, for example as in ATLINFR-4697.