
Introduced compressed LArDigitContainer_p2

Walter Lampl requested to merge wlampl/athena:LArDigitCompression into master

This MR introduces a new persistent version of LArDigitContainer, designed to reduce the disk space taken by the premixed RDOs used in the overlay workflow.

The new LArDigitContainer_p2 contains:

  • A vector<char> of fixed size hashMax/4 (195072/4 = 48768 bytes), where 2 bits are used per readout channel. These bits tell whether the digits of a particular readout channel are stored and, if so, the readout gain.
  • A vector<short> storing the actual ADC values, ordered by online-hash, omitting the digits of absent channels
  • The number of samples is stored only once per container and assumed to be identical for all readout channels.
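The layout above can be sketched roughly as follows. This is a minimal, standalone illustration and not the converter code from this MR: the names `PackedDigits`, `Digit`, `pack` and `unpack` are hypothetical, and the 2-bit encoding (0 = channel absent, 1 to 3 = gain 0 to 2) is an assumption, since the description does not spell out the actual bit assignment. The sketch uses `unsigned char` instead of `char` for the mask to keep the shifts free of sign issues.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for one transient digit: gain (0..2) plus its ADC samples.
struct Digit {
  int gain;
  std::vector<short> samples;
};

// Hypothetical stand-in for the persistent layout described above.
struct PackedDigits {
  unsigned nSamples = 0;                // stored once per container
  std::vector<unsigned char> gainMask;  // 2 bits per channel, 4 channels per byte
  std::vector<short> samples;           // ADC values of present channels, in hash order
};

// Pack digits indexed by online hash. An empty optional means "channel absent".
// Assumed encoding: 0 = absent, 1..3 = gain 0..2.
PackedDigits pack(const std::vector<std::optional<Digit>>& byHash, unsigned nSamples) {
  PackedDigits out;
  out.nSamples = nSamples;
  out.gainMask.assign((byHash.size() + 3) / 4, 0);
  for (std::size_t h = 0; h < byHash.size(); ++h) {
    if (!byHash[h]) continue;  // absent channel: its two bits stay 0
    const Digit& d = *byHash[h];
    if (d.gain < 0 || d.gain > 2 || d.samples.size() != nSamples) {
      throw std::invalid_argument("unexpected gain or number of samples");
    }
    const unsigned code = static_cast<unsigned>(d.gain) + 1;
    out.gainMask[h / 4] |= static_cast<unsigned char>(code << (2 * (h % 4)));
    out.samples.insert(out.samples.end(), d.samples.begin(), d.samples.end());
  }
  return out;
}

// Inverse operation: the channel identity is recovered from the position in the mask.
std::vector<std::optional<Digit>> unpack(const PackedDigits& in, std::size_t hashMax) {
  std::vector<std::optional<Digit>> out(hashMax);
  std::size_t pos = 0;  // read position in the flat sample vector
  for (std::size_t h = 0; h < hashMax; ++h) {
    const unsigned code = (in.gainMask.at(h / 4) >> (2 * (h % 4))) & 0x3u;
    if (code == 0) continue;  // channel was not stored
    if (pos + in.nSamples > in.samples.size()) {
      throw std::out_of_range("sample vector shorter than the mask implies");
    }
    Digit d;
    d.gain = static_cast<int>(code) - 1;
    d.samples.assign(in.samples.begin() + pos, in.samples.begin() + pos + in.nSamples);
    pos += in.nSamples;
    out[h] = std::move(d);
  }
  return out;
}
```

With hashMax = 195072 the mask in this sketch occupies 48768 bytes regardless of how many channels are present, which matches the fixed size quoted above.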

Compared to the _p1 version, we no longer store the identifiers (they are determined by the index in the vector), we no longer store the number of samples per channel, and we use far fewer bits to store the gain. Most of these were compressed efficiently anyway, so the disk-size improvement is smaller than one might expect.

The size of the LArDigitContainer in the premixed RDO goes down from 638.989 kb/event to 536.887 kb/event, a reduction of about 16%.

This improvement comes at some cost:

  • Code-complexity (slightly awkward bit-manipulations)
  • The size of the thinned digit container in the ESD actually increases because of the fixed-size hash-indexed vector. But given that ESDs are practically dead, this doesn't matter that much.

Tagging @hgray and @jchapman.

Edited by Walter Lampl
