Optimise fragment offset calculation for GPU batches
Also add an application that benchmarks fragment offset calculation.
The speed of the mep_offsets function improves from 14.3 Hz to 648 Hz (45x speedup). 4 threads are now more than enough to calculate the offsets; 2 would probably still be fine. It seems that EB::get_padding( s, 1 << align ) is very slow.
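To illustrate the kind of calculation being discussed, here is a minimal sketch of computing fragment offsets with power-of-two alignment padding. It is not the code from this merge request: the names padding_pow2 and fragment_offsets are hypothetical, the implementation of EB::get_padding is not shown in this thread, and the bitmask trick is just one common way to avoid a division or modulo per fragment when the alignment is known to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not EB::get_padding): number of bytes needed to
// round `size` up to the next multiple of 2^align_log2.
// Uses unsigned wrap-around: (0 - size) & mask == (A - size % A) % A for A = 2^align_log2.
constexpr std::size_t padding_pow2(std::size_t size, unsigned align_log2) {
  return (std::size_t{0} - size) & ((std::size_t{1} << align_log2) - 1);
}

// Hypothetical sketch: running offsets of fragments laid out back to back,
// each padded to a 2^align_log2 boundary. Returns sizes.size() + 1 entries;
// the last entry is the total padded length.
std::vector<std::size_t> fragment_offsets(const std::vector<std::uint16_t>& sizes,
                                          unsigned align_log2) {
  assert(align_log2 < 8 * sizeof(std::size_t)); // shift must stay in range
  std::vector<std::size_t> offsets;
  offsets.reserve(sizes.size() + 1);
  std::size_t offset = 0;
  for (const std::uint16_t s : sizes) {
    offsets.push_back(offset);
    offset += s + padding_pow2(s, align_log2);
  }
  offsets.push_back(offset);
  return offsets;
}
```

With fragment sizes 5, 8 and 1 and align_log2 = 2 (4-byte alignment), this sketch places the fragments at offsets 0, 8 and 16, with a total padded length of 20. Whether a change of this kind accounts for the speedup reported above is not stated in the description.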
Requires Allen!1127 (merged)
Activity
mentioned in merge request Allen!1127 (merged)
mentioned in issue Allen#393 (closed)
added RTA label
- Resolved by Rosen Matev
Hi Roel, I pulled this branch and the one-line change to Allen for the throughput tests, and I am seeing an intermittent crash. I tried running gdb on a CPU dbg build but didn't see the crash that time (it is unclear whether this means it won't happen with that build or it just didn't happen when I tried). I can observe the crash, then immediately run the same command again and not see it. Attachment: crash.txt
- Resolved by Patrick Spradlin
/ci-test --merge Allen!1127 (merged)
added ci-test-triggered label
- [2023-02-23 14:06] Validation started with lhcb-master-mr#7120
- [2023-02-28 19:01] Validation started with lhcb-master-mr#7183
Edited by Software for LHCb
mentioned in issue Moore#530 (closed)
- Resolved by Patrick Spradlin
/ci-test Allen!1127 (merged)
assigned to @rmatev
mentioned in commit 20b1df16