Sample normalization
Normally, events in data should not be weighted, while each simulated event i should receive a weight \frac{\sigma\mathcal L}{N\langle w\rangle} w_i, where w_i is the generator-level weight of event i, \langle w\rangle is the mean generator-level weight before the event selection, and N is the total number of events before the event selection (so that N\langle w\rangle is the sum of generator-level weights before the selection). This weight should be assigned once, in the main event loop.
Currently, this normalization is applied to the simulated samples in dataMCcomparison.C. Cross sections are read from samples.h. For the denominator, the integral of the histogram totEventInBaobab_tot is used. This histogram is defined here (with duplicates for the data-driven computations) and is filled with the number of in-time pileup interactions.
In simulation this histogram is filled with weights divided by the fraction of events selected for the bonzais. Its integral therefore gives the sum of generator-level weights before the event selection, thus reproducing the correct normalization. (In principle, the selected fraction of events is computed from the files processed in a given job and not from the full data set, but the difference between the two fractions should be very small.)
In data the histogram totEventInBaobab_tot is also filled with weights equal to the inverse of the fraction of events selected for the bonzais, while for all other histograms the weights are set to 1. The histogram totEventInBaobab_tot is never used, except for printing the total number of events here (which serves no real purpose).
The above rescaling of simulated samples is duplicated in various places across the repository (search for ‘instLumi’ (sic!) or ‘totEventInBaobab’).