Use less shared memory in looking forward triplet seeding
The logic of triplet seeding has been divided into two kernels,
lf_triplet_seeding and lf_triplet_keep_best. That way, less shared memory
is used. Additionally:
- Number of threads used when calling the kernels optimized
- Shared memory usage in the Tensor Core code changed to require fewer resources