Define aggregates as std::vectors
This MR changes the way INPUT_AGGREGATES are defined. Previously, each input aggregate required a std::tuple, generated at configuration time, holding the types used in the aggregate. This MR removes that requirement: an input aggregate is now essentially a std::vector<ArgumentData> with a frontend that makes it more akin to other Allen methods.
What does this enable
INPUT AGGREGATES can be defined as follows:
struct Parameters {
  DEVICE_INPUT_AGGREGATE(dev_input_selections_t, bool) dev_input_selections;
  DEVICE_INPUT_AGGREGATE(dev_input_selections_offsets_t, unsigned) dev_input_selections_offsets;
  HOST_INPUT_AGGREGATE(host_input_post_scale_factors_t, float) host_input_post_scale_factors;
  HOST_INPUT_AGGREGATE(host_input_post_scale_hashes_t, uint32_t) host_input_post_scale_hashes;
};
The above code defines four input aggregates, two on the device and two on the host. The type of each input aggregate should match the type of the data held by the inputs it is given. For instance, the following excerpt from the configuration could be used to set a parameter:
make_algorithm(
    name=algorithm_name,
    dev_input_selections_t=[parameter_a, parameter_b, parameter_c],
    ...
)
As shown, each input aggregate accepts a list of parameters as input. Each of these parameters should be of the type indicated by the input aggregate. In this case, parameter_a, parameter_b and parameter_c must be of type bool (the type of dev_input_selections_t).
As with other parameters, it is possible to access input aggregates in either set_arguments_size or operator() of the algorithm with a new function:
const auto input_ag = input_aggregate<dev_input_selections_t>(arguments);
The variable input_ag will now contain an object of type InputAggregate<bool>. This type exposes the following member functions:
- size_t size_of_aggregate() const -- Returns the size of the aggregate (i.e. the size of the list).
- T* data(const int index) const -- Returns the base pointer to the container at position index.
- T first(const int index) const -- Accesses the first element of the container at position index.
- size_t size(const int index) const -- Returns the size of the container at position index.
- gsl::span<T> span(const int index) const -- Returns a span of the container at index.
- std::string name(const int index) const -- Returns the name of the container at index.
For a fully developed example, please see https://gitlab.cern.ch/lhcb/Allen/-/blob/dcampora_aggregates_as_vectors/device/selections/Hlt1/src/GatherSelections.cu#L86
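To illustrate how the member functions listed above fit together, here is a minimal, self-contained sketch. It does not use Allen's headers: MockArgumentData, MockInputAggregate and total_size are hypothetical stand-ins written for this description, and the real ArgumentData and InputAggregate may differ in their members and constness. The sketch assumes that sizes are counted in elements and omits span() to avoid a dependency on GSL.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ArgumentData: a type-erased view of one container.
struct MockArgumentData {
  const void* pointer; // base address of the container
  std::size_t size;    // number of elements (assumption: not bytes)
  std::string name;    // name of the container
};

// Hypothetical typed frontend over a std::vector of type-erased entries.
// It mirrors the member functions listed above, except span().
template<typename T>
class MockInputAggregate {
public:
  explicit MockInputAggregate(std::vector<MockArgumentData> entries) : m_entries(std::move(entries)) {}

  // Number of containers in the aggregate.
  std::size_t size_of_aggregate() const { return m_entries.size(); }

  // Typed base pointer of the container at position index (const here, unlike the listed T*).
  const T* data(const int index) const { return static_cast<const T*>(m_entries.at(index).pointer); }

  // First element of the container at position index; the container must not be empty.
  T first(const int index) const
  {
    assert(size(index) > 0);
    return data(index)[0];
  }

  // Number of elements of the container at position index.
  std::size_t size(const int index) const { return m_entries.at(index).size; }

  // Name of the container at position index.
  std::string name(const int index) const { return m_entries.at(index).name; }

private:
  std::vector<MockArgumentData> m_entries;
};

// Sums the sizes of all containers in the aggregate, the kind of quantity
// an algorithm could compute when sizing an output that gathers all inputs.
template<typename T>
std::size_t total_size(const MockInputAggregate<T>& aggregate)
{
  std::size_t total = 0;
  for (std::size_t i = 0; i < aggregate.size_of_aggregate(); ++i) {
    total += aggregate.size(static_cast<int>(i));
  }
  return total;
}
```

Because the aggregate is a plain vector of type-erased entries, its length is only known at runtime, so no per-sequence tuple type has to be generated for it.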
Backend simplifications
This MR significantly simplifies both how input aggregates are defined and how they are used. It also greatly simplifies the backend:
- AllenSequenceGenerator.py now only generates the Sequence.h file. InputAggregates.h is no longer generated or required.
- All inputs and algorithms are now defined in a single configuration file, Sequence.h.
- This implies that the Stream target does not rely on InputAggregates.h.
- It also means that no algorithm relies on InputAggregates.h anymore. This improves scalability and simplifies the requirements between targets.
- This MR makes it possible, in !552 (merged), to avoid compiling certain algorithms for every sequence desired at runtime. Instead, only Stream.cpp is now required.
- This MR will enable generation of the remaining Allen algorithms as Gaudi algorithms in !431 (closed).
- It will likely be a necessary step for a complete type-erased sequence.
- Allen::copy is now synchronous. Allen::copy_async is asynchronous.
- set_size and reduce_size are now statically asserted to run over OUTPUT datatypes.
- first is now statically asserted to run over HOST datatypes.
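The static assertions mentioned in the last two items can be pictured with the following sketch. It is not Allen's implementation: the tag types, the two example argument types and the checked_set_size and checked_first helpers are all hypothetical names introduced here, and the sketch assumes that argument types can be classified through base classes.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical tag types used to classify arguments.
struct input_tag {};
struct output_tag {};
struct host_tag {};
struct device_tag {};

// Hypothetical argument types: a device output and a host input.
struct dev_output_example_t : device_tag, output_tag {
  using type = float;
};
struct host_input_example_t : host_tag, input_tag {
  using type = unsigned;
};

// Compiles only for OUTPUT datatypes; a real set_size would resize the buffer.
// Here it simply returns the requested size.
template<typename Arg>
std::size_t checked_set_size(const std::size_t requested_size)
{
  static_assert(std::is_base_of<output_tag, Arg>::value, "set_size may only be used on OUTPUT datatypes");
  return requested_size;
}

// Compiles only for HOST datatypes, since reading an element on the host
// requires the data to live in host memory.
template<typename Arg>
typename Arg::type checked_first(const typename Arg::type* data)
{
  static_assert(std::is_base_of<host_tag, Arg>::value, "first may only be used on HOST datatypes");
  return data[0];
}
```

With these definitions, checked_set_size<host_input_example_t>(8) or checked_first<dev_output_example_t>(ptr) would be rejected at compile time rather than failing at runtime.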
Changes
- Added templated InputAggregate class.
- Defined macro INPUT_AGGREGATE and redefined macros HOST_INPUT_AGGREGATE and DEVICE_INPUT_AGGREGATE to be specializations of it.
- Created flexible Allen::copy functions, accepting gsl::span as inputs. Simplified other Allen copy utility functions.
- Rewrote algorithm gather_selections_t.
- Simplified AllenSequenceGenerator.py, which no longer has to produce InputAggregates. That also affects its generate_sequence, which becomes more homogeneous.
- Adapted SchedulerMachinery.cpp code to generate InputAggregate objects on the fly.