Dynamic scheduler
A dynamic scheduler is now used for assigning and managing the GPU memory. A chunk of data of a configurable size is malloc'ed only once at the initialization of the Stream, and it is used throughout the sequence to virtually allocate and free space as required.
- A memory manager has been added to manage the space allocated on the GPU. It can free memory and allocate memory, returning just the offset of the alloc'ed memory.
- BaseDynamicScheduler is a simple dynamic scheduler that is fed with the data dependencies for all algorithms, and determines upon each sequence step what space to free or allocate.
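To make the "returning just the offset" idea concrete, here is a minimal host-side sketch of an offset-based memory manager. It is not the code from this merge request: the class name OffsetMemoryManager, the methods allocate and release, and the first-fit policy are all assumptions chosen for illustration. The manager never touches the GPU buffer itself; it only keeps track of which ranges of the single preallocated chunk are in use.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <stdexcept>

// Hypothetical sketch: bookkeeping for one preallocated buffer.
// The buffer (e.g. a single cudaMalloc done at Stream initialization)
// is owned elsewhere; callers add the returned offset to its base pointer.
class OffsetMemoryManager {
  struct Segment {
    std::size_t start;
    std::size_t size;
    bool free;
  };
  std::list<Segment> segments_;  // kept ordered by start offset

public:
  explicit OffsetMemoryManager(std::size_t total_size) {
    segments_.push_back(Segment{0, total_size, true});
  }

  // First-fit search (an assumed policy): returns the offset of the reserved range.
  std::size_t allocate(std::size_t size) {
    for (auto it = segments_.begin(); it != segments_.end(); ++it) {
      if (it->free && it->size >= size) {
        const std::size_t offset = it->start;
        if (it->size > size) {
          // Split: the remainder stays free, right after the reserved part.
          segments_.insert(std::next(it), Segment{offset + size, it->size - size, true});
          it->size = size;
        }
        it->free = false;
        return offset;
      }
    }
    throw std::runtime_error("OffsetMemoryManager: out of reserved memory");
  }

  // Marks the range starting at offset as free and merges free neighbours.
  void release(std::size_t offset) {
    for (auto it = segments_.begin(); it != segments_.end(); ++it) {
      if (!it->free && it->start == offset) {
        it->free = true;
        auto next = std::next(it);
        if (next != segments_.end() && next->free) {
          it->size += next->size;
          segments_.erase(next);
        }
        if (it != segments_.begin()) {
          auto prev = std::prev(it);
          if (prev->free) {
            prev->size += it->size;
            segments_.erase(it);
          }
        }
        return;
      }
    }
    throw std::invalid_argument("OffsetMemoryManager: unknown offset");
  }
};
```

Because allocation and release only update this list, no cudaMalloc or cudaFree is needed while the sequence runs, which is the point of reserving the chunk once.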
The code in the stream folder has been duly refactored into the following folders:
- gear contains all metaprogramming tricks for Sequence, tuple indices checking and arguments.
- handlers contains the Handler machinery and custom Handlers defined by the developers.
- memory_manager is where all variants of memory management are stored. For the moment, a single one.
- scheduler contains the possible schedulers. The static scheduler is kind of deprecated, but the dynamic one could be extended.
- sequence contains the bulk of the sequence execution.
- sequence_setup is intended to be modified upon adding new algorithms.
     std::string folder_name_raw;
     std::string folder_name_MC = "";
     uint number_of_files = 0;
    -uint tbb_threads = 3;
    -uint number_of_repetitions = 10;
    +uint tbb_threads = 1;
    +uint number_of_repetitions = 1;
     uint verbosity = 3;
     bool print_individual_rates = false;

    #include "../include/StreamWrapper.cuh"

Because the actual kernel invocations were in a .cu file. However, due to templatization, I had to move the kernel invocations to a .cuh file, and the kernel invocations need to be compiled by nvcc (they cannot be compiled by anything else). One solution was to make main a main.cu file. However, that came with a lot of problems regarding tbb. Instead, I am now using a StreamWrapper to circumvent this necessity, using a forward declaration of Stream and a pointer.
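The forward-declaration-plus-pointer approach described here can be sketched in a single file as follows. Only the names Stream and StreamWrapper come from the discussion; the members run_sequence, run and events_processed are placeholders invented for this illustration. In a real build the first part would sit in a header visible to main, and the second part in a translation unit compiled by nvcc.

```cpp
#include <cassert>

// Part 1: what a wrapper header could look like (plain C++ only).
struct Stream;  // forward declaration; no kernel code is pulled in here

class StreamWrapper {
  Stream* stream_ = nullptr;  // pointer to an incomplete type is allowed

public:
  StreamWrapper();
  ~StreamWrapper();
  StreamWrapper(const StreamWrapper&) = delete;
  StreamWrapper& operator=(const StreamWrapper&) = delete;

  // Hypothetical entry point that forwards to the Stream.
  int run_sequence(int number_of_events);
};

// Part 2: what would live in the nvcc-compiled translation unit,
// where the full (templated) Stream definition is visible.
struct Stream {
  int events_processed = 0;  // placeholder state for the sketch

  int run(int number_of_events) {
    events_processed += number_of_events;
    return events_processed;
  }
};

StreamWrapper::StreamWrapper() : stream_(new Stream()) {}
StreamWrapper::~StreamWrapper() { delete stream_; }

int StreamWrapper::run_sequence(int number_of_events) {
  return stream_->run(number_of_events);
}
```

With this split, main can stay a regular .cpp file built by the host compiler, since it only ever sees the incomplete Stream type.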
assigned to @dovombru
added 1 commit
- b0946d19 - Updated requirements in readme.md. Added C to the project() directive in CMake.
    return sequence_dependencies;
    }

    std::vector<int> get_sequence_output_arguments() {

Maybe this could be added to the readme_cuda_developer? I.e., if one wants the argument to be available until the end, it has to be added here.
Why did you choose these variables explicitly to be saved until the end? Just as an example / a test for now? Or so that the track object can be extended and other tracks added?
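As a rough illustration of why an argument has to be listed among the output arguments to survive until the end, the sketch below derives, for each sequence step, which arguments need space before the step and which can be released after it. This is not BaseDynamicScheduler itself: plan_sequence, StepPlan and the "free after last use" rule are assumptions, and arguments are plain integers standing in for the arg:: enumerators.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical result type: what to do around one step of the sequence.
struct StepPlan {
  std::vector<int> to_allocate;  // arguments first used by this step
  std::vector<int> to_free;      // arguments no later step needs
};

// sequence_dependencies[i] lists the arguments used by step i.
// Arguments in sequence_output_arguments are never scheduled for freeing.
std::vector<StepPlan> plan_sequence(
  const std::vector<std::vector<int>>& sequence_dependencies,
  const std::vector<int>& sequence_output_arguments)
{
  std::map<int, std::size_t> first_use;
  std::map<int, std::size_t> last_use;
  for (std::size_t step = 0; step < sequence_dependencies.size(); ++step) {
    for (int argument : sequence_dependencies[step]) {
      first_use.emplace(argument, step);  // emplace keeps the earliest step
      last_use[argument] = step;          // overwritten until the latest step
    }
  }

  std::vector<StepPlan> plan(sequence_dependencies.size());
  for (const auto& entry : first_use) {
    plan[entry.second].to_allocate.push_back(entry.first);
  }
  for (const auto& entry : last_use) {
    const bool is_output =
      std::find(sequence_output_arguments.begin(), sequence_output_arguments.end(), entry.first) !=
      sequence_output_arguments.end();
    if (!is_output) {
      plan[entry.second].to_free.push_back(entry.first);
    }
  }
  return plan;
}
```

Under these assumptions, an argument that is missing from the output list is released right after its last consumer, so its contents cannot be relied on once the sequence has finished.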
- stream/sequence_setup/src/SequenceSetup.cu 0 → 100644

    };
    sequence_dependencies[seq::prefix_sum_reduce_velo_track_hit_number] = {
      arg::dev_velo_track_hit_number,
      arg::dev_prefix_sum_auxiliary_array_2
    };
    sequence_dependencies[seq::prefix_sum_single_block_velo_track_hit_number] = {
      arg::dev_velo_track_hit_number,
      arg::dev_prefix_sum_auxiliary_array_2
    };
    sequence_dependencies[seq::prefix_sum_scan_velo_track_hit_number] = {
      arg::dev_velo_track_hit_number,
      arg::dev_prefix_sum_auxiliary_array_2
    };
    sequence_dependencies[seq::consolidate_tracks] = {
      arg::dev_atomics_storage,
      arg::dev_tracks,
      arg::dev_velo_track_hit_number,
      arg::dev_velo_cluster_container,
      arg::dev_estimated_input_size,
      arg::dev_module_cluster_num,
      arg::dev_velo_track_hits,

We should probably call these tracks dev_velo_tracks or something like that, since they are of a Velo-specific type. In its current form, the VeloTracking::TrackHits object cannot be re-used for other tracking algorithms, since its length is set according to the number of Velo modules. We could maybe template it with the max_track_size and then re-use it in the VeloUT and SciFi cases when collecting hits belonging to a track. But this should probably wait until we have a first / second version of the VeloUT algorithm on the GPU. For now, the name should tell us to which algorithm this variable belongs.
After track consolidation, there are no more TrackHits objects in memory. Actually, what is left is:
- The prefix sum (accumulated sum) of number of tracks for every track (in dev_atomics_storage).
- The prefix sum of numHits for each track (in dev_velo_track_hit_number).
- All the Hits, ordered (in dev_velo_track_hits).
- All the closeToBeamLine states, ordered (in dev_velo_states).
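The consolidated layout listed above can be read back with a small helper like the one below. This is a host-side sketch under stated assumptions: the offsets are taken to be an exclusive prefix sum with one extra trailing entry holding the total, and Hit, HitRange, exclusive_prefix_sum and hits_of_track are illustrative names rather than types from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Hit {
  float x, y, z;
};

// A view onto the hits of one track inside the ordered hit array.
struct HitRange {
  const Hit* begin;
  std::size_t size;
};

// Builds offsets with counts.size() + 1 entries: offsets[i] is where
// track i starts, and offsets.back() is the total number of hits.
std::vector<unsigned> exclusive_prefix_sum(const std::vector<unsigned>& counts) {
  std::vector<unsigned> offsets(counts.size() + 1, 0);
  for (std::size_t i = 0; i < counts.size(); ++i) {
    offsets[i + 1] = offsets[i] + counts[i];
  }
  return offsets;
}

// Looks up the contiguous hits of one track using the prefix sum.
HitRange hits_of_track(
  const std::vector<unsigned>& track_hit_offsets,
  const std::vector<Hit>& ordered_hits,
  std::size_t track_index)
{
  assert(track_index + 1 < track_hit_offsets.size());
  const unsigned first = track_hit_offsets[track_index];
  const unsigned last = track_hit_offsets[track_index + 1];
  return HitRange{ordered_hits.data() + first, static_cast<std::size_t>(last - first)};
}
```

Storing only offsets and one ordered hit array avoids the fixed-size per-track storage of a TrackHits object, at the cost of one extra lookup per track.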
I would suggest that in subsequent tracking algorithms we do something similar:
- Create a TrackHits type for that algorithm, i.e. UTTrackHits, that only has space for the track hits in the UT.
- Add a field in the UTTrackHits to associate it with a Velo track.
At the time of consolidation of UTTracks, we would have to evaluate at least two alternatives:
- Consolidate Velo+UT Tracks.
- Consolidate just UT Tracks, and have an array of indirection UT Track index -> Velo Track index.
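The second alternative (consolidating only UT tracks and keeping an indirection array) might look roughly like this on the host. Everything here is hypothetical: UTTrackHits does not exist yet, and the field names, the capacity max_ut_hits_per_track and the function consolidate_ut_tracks are made up for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative capacity; the real value would depend on the UT geometry.
constexpr std::size_t max_ut_hits_per_track = 4;

// Hypothetical per-track object: room for UT hits only,
// plus a field associating it with a Velo track.
struct UTTrackHits {
  unsigned velo_track_index;
  unsigned number_of_hits;
  unsigned hit_indices[max_ut_hits_per_track];
};

// Consolidated form: no UTTrackHits objects remain, only flat arrays.
struct ConsolidatedUTTracks {
  std::vector<unsigned> hit_offsets;       // prefix sum, size = tracks + 1
  std::vector<unsigned> hit_indices;       // all UT hit indices, ordered by track
  std::vector<unsigned> velo_track_index;  // indirection: UT track -> Velo track
};

ConsolidatedUTTracks consolidate_ut_tracks(const std::vector<UTTrackHits>& tracks) {
  ConsolidatedUTTracks out;
  out.hit_offsets.push_back(0);
  for (const UTTrackHits& track : tracks) {
    assert(track.number_of_hits <= max_ut_hits_per_track);
    out.hit_indices.insert(
      out.hit_indices.end(), track.hit_indices, track.hit_indices + track.number_of_hits);
    out.hit_offsets.push_back(static_cast<unsigned>(out.hit_indices.size()));
    out.velo_track_index.push_back(track.velo_track_index);
  }
  return out;
}
```

Compared with consolidating Velo+UT tracks together, this keeps the Velo output untouched and avoids copying Velo hits again, but every consumer that needs the full track has to follow the indirection.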
mentioned in commit 8b5b37ef