Dependencies: LHCb!3430 (merged), Analysis!871 (merged), Moore!1416 (merged)
@nnolte's and my attempt to replace cling.
This introduces a new implementation of IFactory which uses the configured project compiler (gcc or clang) to JIT compile functors.
We reproduced the timings discussed in #266 (closed):
| #Functors | Cling | GCC |
|-----------|-------|-----|
| 1         | 10s   | 12s |
| 10        | 74s   | 15s |
GCC scales much better with the number of functors.
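To put a number on the scaling, here is a small sketch that reads the two measurements per compiler as a straight line. That linearity is an assumption (two points cannot confirm it), and the helper name `marginal_cost` is made up for illustration.

```python
# Timings in seconds, taken from the measurements above: {n_functors: wall time}.
timings = {
    "Cling": {1: 10.0, 10: 74.0},
    "GCC": {1: 12.0, 10: 15.0},
}


def marginal_cost(points):
    """Seconds per additional functor, assuming linear scaling between two points."""
    (n0, t0), (n1, t1) = sorted(points.items())
    return (t1 - t0) / (n1 - n0)


for name, points in timings.items():
    print(f"{name}: {marginal_cost(points):.2f} s per additional functor")
# prints:
# Cling: 7.11 s per additional functor
# GCC: 0.33 s per additional functor
```

Under that assumption, each extra functor costs roughly 7 s with cling and about a third of a second with GCC, while GCC pays a slightly higher fixed cost for the first functor.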
Note that the GCC JIT compilation uses the -O3 flag, so the result is fully optimized, which is not the case for cling.
We will also hopefully be able to use vectorization in JIT-compiled functors, meaning that we will eventually be able to deprecate the additional complications in the functor backend that deal with "SIMD loop, but scalar if jitted".
On the internal workings:
Instead of using cling to compile the generated C++ code that comes from the Python functors, we use the native compiler configured in CMake.
We do this in the following steps:
In the CMake step:
- Build a Python script (functor_jitter) that gets the correct compiler command (from CMake) to compile a functor, i.e. compiler path, build flags, compile definitions and so on.
- Preprocess a header with all includes we need to compile a functor. This preprocessed header will be used for functor compilation, to make that compilation independent of the system we compile on.
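As a rough illustration of what such a script has to assemble, the sketch below builds a compiler command line from settings recorded at configure time. It is not the actual functor_jitter: the `CompilerConfig` class, its fields, the default flags and the use of `-include` to pull in the preprocessed header are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CompilerConfig:
    """Settings a configure step could record. All field names are hypothetical."""

    compiler: str = "g++"
    flags: list = field(default_factory=lambda: ["-O3", "-fPIC", "-shared"])
    definitions: list = field(default_factory=list)


def build_command(cfg, preprocessed_header, source, output):
    """Return the argument list that compiles one source file into a shared library."""
    cmd = [cfg.compiler, *cfg.flags]
    cmd += [f"-D{d}" for d in cfg.definitions]
    # Assumption: the preprocessed header is force-included in front of the source.
    cmd += ["-include", preprocessed_header, source, "-o", output]
    return cmd


if __name__ == "__main__":
    cfg = CompilerConfig(definitions=["NDEBUG"])
    print(" ".join(build_command(cfg, "preprocessed.h", "functors_0.cpp", "functors.so")))
```

Keeping the command as a list rather than a single string means it can be handed to `subprocess.run` without going through a shell.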
In initialize() of algorithms:
1. The algorithm registers its functors with the FunctorFactory.
   1.a If the functor is found in the functor cache, the algorithm's functor is initialized.
   1.b No cache hit -> the functor is registered for JIT compilation.
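The registration logic can be pictured with the following minimal sketch. It only models the control flow described above. The class `FunctorRegistry`, the dictionary-based cache and the setter callback are hypothetical stand-ins, not the FunctorFactory interface.

```python
class FunctorRegistry:
    """Toy model of registration: cache hits are bound now, misses are deferred."""

    def __init__(self, cache):
        self._cache = cache  # hypothetical: maps functor C++ code -> compiled callable
        self.pending = []    # (cpp_code, setter) pairs waiting for JIT compilation

    def register(self, cpp_code, set_functor):
        """Return True if the functor was initialized from the cache."""
        compiled = self._cache.get(cpp_code)
        if compiled is not None:
            set_functor(compiled)  # 1.a: cache hit, initialize immediately
            return True
        self.pending.append((cpp_code, set_functor))  # 1.b: defer to JIT
        return False


if __name__ == "__main__":
    bound = {}
    registry = FunctorRegistry(cache={"PT > 1": lambda pt: pt > 1})
    registry.register("PT > 1", lambda f: bound.update(pt_cut=f))
    registry.register("ETA < 5", lambda f: bound.update(eta_cut=f))
    print(sorted(bound), len(registry.pending))
```

Deferring the misses lets all of them be compiled together later instead of one compiler invocation per functor.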
In the start() call of the service:
1. Get the C++ code for all registered functors.
2. Put it in N temporary cpp files (where N is defined by the
3. Compile the temporary cpp files with our functor_jitter script and the preprocessed header into a shared library. This will use m_jit_n_jobs processes to compile the created files in parallel.
4. Load the temporary library with dlopen.
5. For each registered functor, retrieve its function from the library and initialize the functor.
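The start() sequence above can be sketched roughly as follows. This is an illustration in Python, not the C++ implementation: the function names, the round-robin split and the thread pool that launches one compiler process per file are assumptions. `ctypes.CDLL` is used because it loads a library through dlopen on POSIX systems.

```python
import ctypes
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_sources(functor_codes, n_files):
    """Distribute functor code snippets round-robin over at most n_files sources."""
    n_files = max(1, min(n_files, len(functor_codes)))
    chunks = [[] for _ in range(n_files)]
    for i, code in enumerate(functor_codes):
        chunks[i % n_files].append(code)
    return ["\n".join(chunk) for chunk in chunks]


def compile_parallel(commands, n_jobs):
    """Run each compiler command as its own process, at most n_jobs at a time."""

    def run(cmd):
        return subprocess.run(cmd, check=False).returncode

    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(run, commands))


def load_symbol(lib_path, name):
    """Open the shared library and look up one exported function by name."""
    lib = ctypes.CDLL(lib_path)
    return getattr(lib, name)


if __name__ == "__main__":
    print(split_sources(["f0", "f1", "f2"], 2))
```

The threads here only wait on child processes, so the compilation itself still runs in parallel compiler processes.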
Note that before step 3 the functor factory checks whether the shared library already exists. If so, it simply loads the existing library. This is a huge gain for interactive use, where during development you rerun the same options file over and over: users no longer need to figure out how to add their options to a functor cache to avoid these recurring JIT compile times.
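One way such a reuse check could work is to derive the library file name from the functor code itself, so that an unchanged options file maps to the same file on every run. The sketch below assumes exactly that. The hashing scheme, the file naming and the `build` callback are hypothetical and not taken from the implementation.

```python
import hashlib
from pathlib import Path


def library_path(cache_dir, functor_codes):
    """Derive a file name from the functor code (hypothetical naming scheme)."""
    joined = "\n".join(sorted(functor_codes))
    digest = hashlib.sha256(joined.encode()).hexdigest()
    return Path(cache_dir) / f"functors_{digest[:16]}.so"


def get_or_build(cache_dir, functor_codes, build):
    """Return (path, was_built); build(path) is only called if the file is missing."""
    path = library_path(cache_dir, functor_codes)
    if path.exists():
        return path, False  # rerun with identical functors: skip compilation
    build(path)
    return path, True
```

With a scheme like this, changing any functor changes the digest and therefore triggers a fresh compile, while rerunning identical options finds the existing file.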
There is now also a pretty detailed overview of how things work inside the file: