Fix out-of-bounds read in CalculateWindows.cu
I got the following error from cuda-memcheck:
Failed to run cudaMemcpyAsync( host_buffers.host_number_of_reconstructed_ut_tracks, arguments.offset<dev_atomics_ut>() + host_buffers.host_number_of_selected_events[0]*2, sizeof(uint), cudaMemcpyDeviceToHost, cuda_stream )
unspecified launch failure
terminate called after throwing an instance of 'std::invalid_argument'
what(): cudaCheck failed
========= CUDA-MEMCHECK
========= Invalid __global__ read of size 4
========= at 0x00000638 in find_candidates_in_sector_group(UT::Hits const &, UT::HitOffsets const &, MiniState const &, float const *, float, float, float, float, float, float, int)
========= by thread (27,3,0) in block (9,0,0)
========= Address 0x7feb73c01150 is out of bounds
========= Device Frame:calculate_windows(int, int, MiniState const &, float const *, UT::Hits const &, UT::HitOffsets const &, float const *, float const *, unsigned int const *, Velo::Consolidated::Tracks const &) (calculate_windows(int, int, MiniState const &, float const *, UT::Hits const &, UT::HitOffsets const &, float const *, float const *, unsigned int const *, Velo::Consolidated::Tracks const &) : 0x2a00)
========= Device Frame:ut_search_windows(unsigned int*, unsigned int const *, int*, unsigned int*, char*, PrUTMagnetTool*, float const *, unsigned int const *, float const *, int*, bool*) (ut_search_windows(unsigned int*, unsigned int const *, int*, unsigned int*, char*, PrUTMagnetTool*, float const *, unsigned int const *, float const *, int*, bool*) : 0x13c0)
This MR fixes the out-of-bounds read. The efficiencies with and without the fix are the same; the hit purity is 0.01% higher in two cases. The logs are attached: no_fix.log, fix.log
Edited by Roel Aaij