Significantly speed up hit cleaning
I thought my jobs that output hits seemed a bit slow: they took several days whereas most stuff takes O(hours). Callgrind gave an interesting result: essentially all the time was spent in the `cleanHits` function. And indeed, that function has some trig operations that are called $O(N_{\text{hits}}^2)$ times! I tried to mitigate this in two ways:
- Remove the trig calls from the inner loop, which reduces the number of calls to `atan2` to $O(N_{\text{hits}})$
- Group hits by layer, which reduces the leading order complexity for any calls to $O(N_{\text{hits in layer}}^2)$
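
For reviewers who want the idea without reading the diff, here is a minimal sketch of the two changes. It is not the actual `cleanHits` code: the `Hit` struct, the `cleanHitsSketch` name, and the "drop a hit if another hit in the same layer is within `tolerance` in phi" criterion are all assumptions made up for illustration. The point is only that `atan2` is evaluated once per hit, and the pairwise loop runs within a layer rather than over all hits.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical hit type, for illustration only.
struct Hit {
  double x;
  double y;
  int layer;
};

// Difference between two angles, wrapped into [-pi, pi].
inline double deltaPhi(double a, double b) {
  const double twoPi = 2.0 * 3.14159265358979323846;
  return std::remainder(a - b, twoPi);
}

// Returns the indices of the hits that survive cleaning. Under the assumed
// criterion, a hit is dropped if an already-kept hit in the same layer lies
// within `tolerance` of it in phi.
std::vector<std::size_t> cleanHitsSketch(const std::vector<Hit>& hits,
                                         double tolerance) {
  assert(tolerance >= 0.0);

  // Change 1: one atan2 call per hit, hoisted out of the pairwise loop.
  std::vector<double> phi;
  phi.reserve(hits.size());
  for (const Hit& hit : hits) {
    phi.push_back(std::atan2(hit.y, hit.x));
  }

  // Change 2: group hit indices by layer so that pairs are only formed
  // between hits in the same layer.
  std::map<int, std::vector<std::size_t>> byLayer;
  for (std::size_t i = 0; i < hits.size(); ++i) {
    byLayer[hits[i].layer].push_back(i);
  }

  std::vector<std::size_t> kept;
  for (const auto& entry : byLayer) {
    // Only compare against hits kept from this layer.
    const std::size_t firstInLayer = kept.size();
    for (std::size_t i : entry.second) {
      bool duplicate = false;
      for (std::size_t k = firstInLayer; k < kept.size(); ++k) {
        if (std::fabs(deltaPhi(phi[i], phi[kept[k]])) < tolerance) {
          duplicate = true;
          break;
        }
      }
      if (!duplicate) {
        kept.push_back(i);
      }
    }
  }
  return kept;
}
```

In this sketch the surviving indices come out ordered by layer and then by original position, which may not match the ordering of the real function.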
In a quick test this reduced the time spent in the track dumper from 6999.59 ms to 466.29 ms (in 9 events). That's about 15 times faster! The track dumper is still about 10 times slower than the same configuration saving only jets, but I'd have to dig in more to know where the hotspots are now.
I diffed the h5 files for 10 events and saw no changes, so hopefully this is fine. Maybe @sargyrop and @svanstro want to take a look.