
Optimise fragment offset calculation for GPU batches

Roel Aaij requested to merge optimise_fragment_offsets into master

Also add an application that benchmarks fragment offset calculation

The rate of the mep_offsets function improves from 14.3 Hz to 648 Hz (a 45x speedup). Four threads are now more than enough to calculate the offsets; two would probably still be fine. EB::get_padding( s, 1 << align ) appears to be very slow.
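For illustration only, here is a minimal sketch of one cheap way to compute padding when the alignment is a power of two, followed by a running sum to get fragment offsets. The names padding_pow2 and fragment_offsets are hypothetical; this is not the implementation of EB::get_padding or mep_offsets, and it is an assumption that the alignment is always passed as a log2 exponent, as the 1 << align call suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not EB::get_padding): number of bytes needed to
// round `size` up to the next multiple of 2^align_log2.
inline std::uint32_t padding_pow2(std::uint32_t size, unsigned align_log2) {
  assert(align_log2 < 32);
  const std::uint32_t mask = (std::uint32_t{1} << align_log2) - 1;
  // (-size) mod 2^align_log2, computed with a mask instead of a division.
  return (~size + 1) & mask;
}

// Hypothetical helper (not mep_offsets): exclusive prefix sum of padded
// fragment sizes. offsets[i] is the start of fragment i; the last entry
// is the total padded size.
inline std::vector<std::uint32_t> fragment_offsets(
    const std::vector<std::uint32_t>& sizes, unsigned align_log2) {
  std::vector<std::uint32_t> offsets(sizes.size() + 1, 0);
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    offsets[i + 1] = offsets[i] + sizes[i] + padding_pow2(sizes[i], align_log2);
  }
  return offsets;
}
```

With sizes {5, 8, 1} and align_log2 = 2 (4-byte alignment), the padded sizes are 8, 8 and 4, so the offsets come out as {0, 8, 16, 20}.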

requires Allen!1127 (merged)

FYI @gligorov @kaaricha

Edited by Roel Aaij
