Weights caching and column to_numpy to speedup histogram filling (!63) · Merge requests · cms-analysis / General / PocketCoffea

Davide Valsecchi requested to merge github/fork/valsdav/main into main Mar 07, 2023

The histogram filling has been sped up by caching the broadcasted weights when the weights are event-level and the data structure is 1-D.
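A minimal sketch of the caching idea, not the actual PocketCoffea code: the function name `get_fill_weights`, the `(category, variation)` cache key, and the boolean `mask` are hypothetical and only illustrate computing event-level weights once and reusing them for every 1-D histogram in the same category.

```python
import numpy as np

# Hypothetical cache: one entry per (category, variation) pair.
_weights_cache = {}
stats = {"computed": 0}  # counts how often the weights are actually computed


def get_fill_weights(category, variation, event_weights, mask):
    """Return the event-level weights for one category, computing them only once."""
    key = (category, variation)
    if key not in _weights_cache:
        # For 1-D data the weights line up one-to-one with the selected events,
        # so the same array can be reused for every variable being filled.
        _weights_cache[key] = np.asarray(event_weights)[mask]
        stats["computed"] += 1
    return _weights_cache[key]


event_weights = np.array([1.0, 0.5, 2.0, 1.5])
mask = np.array([True, False, True, True])

# Two different 1-D variables in the same category share the cached weights.
w_for_pt = get_fill_weights("signal", "nominal", event_weights, mask)
w_for_eta = get_fill_weights("signal", "nominal", event_weights, mask)
```

The second call returns the cached array without recomputing it, which is where the saving comes from when many histograms share the same category and variation.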

Moreover, while profiling the filling I observed that the hist.fill() method is much faster when we pass numpy arrays instead of awkward arrays. I have therefore included a to_numpy conversion when caching the variables to fill.

This PR brings a ~30% speedup in the histogram-filling phase, which can save quite a lot of time when there are more than 10 categories.

