
Adding CI Tests

If you're developing an application that needs a specific configuration file, input format, or top-level ComponentAccumulator-based script, we encourage you to write continuous integration tests so that we won't break your workflow.

The main test script lives in test/test-dumper and is invoked as test-dumper <mode>, where <mode> is a key that selects the configuration file, the input data file, and the top-level executable used to run the dumper. To add a new mode, it should be sufficient to edit test-dumper and add a new entry under CONFIGS, DATAFILES, and TESTS.
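As a sketch of what this means in practice (the real test-dumper may be structured differently, and all names below except the single-btag example are illustrative), you can think of CONFIGS, DATAFILES, and TESTS as three key-value tables keyed by the mode name; adding a mode means adding one entry to each:

```shell
# Hypothetical sketch of the mode lookup inside test-dumper; the actual
# script may differ. Each mode keys into three tables.
declare -A CONFIGS DATAFILES TESTS

# an existing mode might look something like this
CONFIGS[single-btag]=GN3_dev.json
DATAFILES[single-btag]=DAOD_FTAG1.601589.e8547_s3797_r13144_p6859.small.pool.root
TESTS[single-btag]=dump-single-btag

# adding a new mode: one line per table (illustrative names)
CONFIGS[my-mode]=my_config.json
DATAFILES[my-mode]=my_input.small.pool.root
TESTS[my-mode]=dump-my-mode

# the script would then assemble the command to run from the tables
mode=my-mode
echo "${TESTS[$mode]} -c ${CONFIGS[$mode]} ${DATAFILES[$mode]}"
```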

To add the tests to CI, you'll also have to edit .gitlab-ci.yml.
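A new CI job might look roughly like the sketch below; the template name and job structure here are assumptions, so mirror an existing test job in .gitlab-ci.yml rather than copying this verbatim:

```yaml
# hypothetical CI job; match the layout of the existing test jobs
test_my_mode:
  extends: .test_template    # assumed name of the shared test template
  script:
    - test-dumper my-mode    # the new <mode> you added to test-dumper
```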

Making inputs for tests

We keep the input files small (roughly 10 MB) so that tests are quick to run locally. If you need a new input format for your test, you can create a smaller file in several ways.

A few scripts to make test files live in the test file repository. Use these to make a test file and then skip to the uploading test files section below.

Please run the test you want to use this file for locally before adding the file to the dumper-test-files repository. To test locally, run the command you want to use the file with. For example, to run the dump-single-btag command with the GN3_dev.json config file:

dump-single-btag -c GN3_dev.json DAOD_FTAG1.601589.e8547_s3797_r13144_p6859.small.pool.root

If this succeeds, everything is fine! Sometimes, however, broken metadata in the generally available test files on cvmfs means the easy way doesn't work. You can spot this when you get the following error message from the above command:

IncidentSvc ERROR Could not find AllExecutedEvents CutBookkeeper information.
terminate called after throwing an instance of 'std::exception'
  what():  std::exception
 *** Break *** abort

followed by a lengthy C++ traceback.

Is there some way to test metadata without running the dumper?

You can test the metadata in your output file with ftag-test-metadata, a utility that should be set up with every Athena or AthAnalysis release. If this utility runs without errors, you might have a working file.

If you hit this error, first make sure you ran the make-daod script to generate the DAOD directly from an AOD. If that also fails, you can try to do the same thing manually, as described below.

Doing it manually (the alternative)

Are you sure you don't want to do this the easy way?

The scripts in the test file repository should be all you need, and will take care of getting the naming conventions right. If you're having trouble running them, please file an issue there.

In some cases it's enough to download a dataset from the grid, set up Athena, and run

Merge_tf.py --CA --inputAODFile <xAOD-file> --outputAOD_MRGFile <name>.small.pool.root --maxEvents 10

Unfortunately this may not work for some DAODs. In this case, see the section on making test DAODs.

Making DAODs from AODs (the last way out)

You can reproduce a DAOD from its source, but unfortunately there are a few steps which you have to follow carefully. Suppose you'd like a 10 event file of mc20_13TeV.601589.PhPy8EG_A14_ttbar_hdamp258p75_nonallhadron.deriv.DAOD_FTAG1.e8547_s3797_r13144_p6859. You'll need to find:

  1. The input AOD file, and
  2. The command to produce the DAOD from the AOD

And then use this information to build a new file.

Central derivation documentation page

The main documentation page on running derivations can be found here.

Finding the input AOD with rucio

For most of our centrally produced DAODs (FTAG1, FTAG2, FTAGXBB, etc.), you can find the original AOD on the List of DAOD Samples in the FTAG Docs.

If you can't find it there, you need to do some detective work using rucio. Running

setupATLAS
lsetup rucio

will set up rucio. You can then search with rucio ls, using the wildcard * to fill in unknown parts of the dataset:

rucio list-dids mc20_13TeV.601589*recon.AOD*e8547_s3797_r13144*

It's important to use recon.AOD here: this is the preferred AOD container for producing DAODs. Including the e-, s-, and r-tags lets you find the correct container.

After running the command, you will sometimes see multiple containers with the same name but additional e-, s-, and r-tags. These are intermediate grid outputs and should NOT be used if a version of the container with only one e-, s-, and r-tag is available! Furthermore, there are sometimes single datasets with a similar name but ending in _tid followed by a number. Again, these are intermediate outputs and should NOT be used if a container is available.

For our example above, the output looks like this:

+------------------------------------------------------------------------------------------------------+--------------+
| SCOPE:NAME                                                                                           | [DID TYPE]   |
|------------------------------------------------------------------------------------------------------+--------------|
| mc20_13TeV:mc20_13TeV.601589.PhPy8EG_A14_ttbar_hdamp258p75_nonallhadron.recon.AOD.e8547_s3797_r13144 | CONTAINER    |
+------------------------------------------------------------------------------------------------------+--------------+

Great! We found our input AOD container. The last check is to verify that the container is still available on the grid, which you can do by running

rucio list-files mc20_13TeV:mc20_13TeV.601589.PhPy8EG_A14_ttbar_hdamp258p75_nonallhadron.recon.AOD.e8547_s3797_r13144

You will either get an empty list or a long list of all the files stored in this container. If the container is populated and available, the summary lines will look like this:

Total files : 2499
Total size : 18.230 TB
Total events : 49983000

Now, to create a small test DAOD, we need only one random file rather than the whole container. You can get one with:

rucio download mc20_13TeV:mc20_13TeV.601589.PhPy8EG_A14_ttbar_hdamp258p75_nonallhadron.recon.AOD.e8547_s3797_r13144 --nrandom 1

Finding the command to reproduce the derivation

It's important to set up the same release and produce the derivation with the same parameters it was initially built with. You can look up the p6859 tag in AMI, which will show you that the cacheName was 25.0.33.

You'll need to open a clean shell and set up this release

setupATLAS
asetup Athena,25.0.33

and then produce a new DAOD with

Derivation_tf.py \
--CA True \
--AMIConfig p6859 \
--formats FTAG1 \
--maxEvents 10 \
--inputAODFile <name-of-your-AOD-file> \
--outputDAODFile 601589.e8547_s3797_r13144_p6859.small.pool.root

There are several additional arguments:

- --CA True tells Derivation_tf.py to use the ComponentAccumulator config.
- --AMIConfig p6859 configures the job to use the exact p-tag of choice for the output DAOD.
- --formats FTAG1 tells Derivation_tf.py which derivation format to produce.
- --maxEvents 10 tells the job to stop after 10 events.
- --inputAODFile is the path to the AOD file you downloaded earlier.
- --outputDAODFile sets the output file name; the transform prepends DAOD_<format>. to it. By convention, dumper test files use the pattern <DSID>.<tags>.small.pool.root.
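The naming convention for dumper test files, inferred from the example in this section, amounts to assembling the name from the DSID and the production tags. A minimal sketch:

```shell
# Sketch of the test-file naming convention, inferred from the example
# in this section: DAOD_<format>.<DSID>.<tags>.small.pool.root
format=FTAG1
dsid=601589
tags=e8547_s3797_r13144_p6859
outname="${dsid}.${tags}.small.pool.root"   # passed to --outputDAODFile
final="DAOD_${format}.${outname}"           # name of the produced file
echo "$final"
```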

This should produce a file called DAOD_FTAG1.601589.e8547_s3797_r13144_p6859.small.pool.root. You can check the contents with

checkFile DAOD_FTAG1.601589.e8547_s3797_r13144_p6859.small.pool.root

You can cross-check this against the officially produced samples to make sure they contain the same information.

Uploading test files

Test files are stored in a dedicated repository via Git LFS. After you've confirmed that you can use your reduced test file, create an MR to this repository that includes your file. Be sure to commit your file using Git LFS.

Using Git LFS

Using Git LFS requires LFS to be installed. Several installation options are listed here. If you use conda, you can simply run conda install git-lfs.

However you install Git LFS, you need to run the setup command once per user account.

setupATLAS
lsetup git
git lfs install

Once run, Git will automatically store any files named *.root via LFS when you commit them.

You can check whether you successfully added the file with LFS by searching for it with

git lfs ls-files