Dual module processing
Closes https://gitlab.cern.ch/lhcb-parallelization/cuda_hlt/issues/10
- Search by triplet now processes two modules at a time instead of one, taking one module from each side.
- `fill_candidates` and `weak_tracks_adder` are now separate algorithms in the sequence.
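
As a rough illustration of the dual module iteration, here is a minimal C++ sketch. It is not the code in this merge request: the function name `module_pairs` and the assumption that consecutive module indices alternate between the two sides (even index on one side, odd on the other) are hypothetical and only serve to show the pairing idea.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper, not part of cuda_hlt.
// Assumption: modules are indexed so that even indices lie on one side and
// odd indices on the other, so (m, m + 1) is one module from each side.
std::vector<std::pair<int, int>> module_pairs(int number_of_modules)
{
  std::vector<std::pair<int, int>> pairs;
  // Advance by two so that each step covers a pair of modules,
  // rather than a single module per step.
  for (int m = 0; m + 1 < number_of_modules; m += 2) {
    pairs.emplace_back(m, m + 1);
  }
  return pairs;
}
```

Under this assumed layout, the number of iterations over modules is halved, since each step handles both sides together. If the module count were odd, this sketch would leave the last module unpaired.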