Optimize interpolated B field
This is my work on #384 (closed) . On my machine, a microbenchmark that hammers getField() in a loop goes from ~500ns/iteration to ~50ns/iteration with these changes. I think going beyond that would require more significant interface changes than I'm willing to commit to for now.
Quick summary of the work done:
- Replace dynamically allocated
std::set
s with lazy iterables that produce the same values. - Where heap allocation is useful, prefer
std::vector
overstd::set
- Avoid some unnecessary ping-poing between local and global grid coordinates
- Remove redundant array bound check
- Replace expensive trig functions with judicious use of trig identities (similar to earlier Jacobian work)
This reminds me that we really need some serious magnetic field benchmarks, I'll try to write a couple when I have more time.
Fixes #384 (closed) .
Merge request reports
Activity
- Resolved by Hadrien Benjamin Grasland
- Resolved by Hadrien Benjamin Grasland
- Resolved by Hadrien Benjamin Grasland
- Resolved by Andreas Salzburger
- Resolved by Andreas Salzburger
- Resolved by Hadrien Benjamin Grasland
- Resolved by Hadrien Benjamin Grasland
- Resolved by Hadrien Benjamin Grasland
Probably I should rename one of the
getLocalBinIndices
overloads to something else. Any opinion on which overload I should rename and what I should rename it to?Edited by Hadrien Benjamin Grasland- Resolved by Hadrien Benjamin Grasland
Yup. In a nusthell, the problem is that
getLocalBinIndices
, likegetGlobalBin
, has two overloads. One takes a global bin index and converts it into local bin indices, the other takes a point on the grid and gets the local bin indices by finding the local bin associated with each coordinate.The problem is that "point on the grid" overload is templated in a very general fashion...
template <class Point> index_t getLocalBinIndices(const Point& point) const;
...so this overload is very greedy and will be selected in place of the "global bin index" version, which has completely different semantics:
index_t getLocalBinIndices(size_t bin) const;
The reason I didn't hit this in my local tests is that I made the getLocalBinIndices change rather late during development and forgot to re-do a full build as I didn't expect this change to have a large impact.
IMO, the problem is that catch-all templates and overloading do not mix well, and there are two basic ways to resolve this:
- Use a less generic template parameter for points, e.g. only accept ActsVector.
- Rename one of the functions, i.e. do not overload here.
I think actually the specific
size_t
overload wins over the templated overload, since it's a closer match: http://cpp.sh/6yq6e