PixelConditionsData: Backport non FT0 part of !63358 (closed)
In this one we try to get the speed-up while keeping the order of floating-point operations intact.
Note that here we keep the order of operations as in 23.0, FT0, and we call the same functions as before.
There are 2 bits where things were repeated for no reason:
- This TMath::Binomial(n, i): we seem to ask for the same values over and over again. We could instead cache the 21 doubles and re-use them.
- This
    constexpr int n = 20;
    constexpr int m = 20;
    for (int i = 0; i <= n; ++i) {
      for (int j = 0; j <= m; ++j) {
        int k = (i * (m + 1)) + j;
        if (P[k] < -998.9) continue;
        r += bernstein_grundpolynom(u, n, i) * bernstein_grundpolynom(v, m, j) * P[k];
      }
    }
we transform to
    // So here we calculate the 21+21 polynomial values we need
    // for the inputs u, v for 0...n each (m == n)
    std::array<double, n + 1> bernstein_grundpolynomU;
    std::array<double, m + 1> bernstein_grundpolynomV;
    for (int i = 0; i <= n; ++i) {
      bernstein_grundpolynomU[i] = bernstein_grundpolynom<n>(u, i);
      bernstein_grundpolynomV[i] = bernstein_grundpolynom<m>(v, i);
    }
    for (int i = 0; i <= n; ++i) {
      for (int j = 0; j <= m; ++j) {
        ......
        r += bernstein_grundpolynom_i * bernstein_grundpolynom_j * P[k];
      }
    }
More or less, we call bernstein_grundpolynom fewer times. The 2x21 values depend only on u, v and the index, so they stay the same for every pass. We now compute them once up front instead of repeating them another 21 times (2x21x21 calls) in the inner loop.
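To illustrate the hoisting described above, here is a minimal, ROOT-free sketch. The names `basis`, `binomialRow`, `evalNaive`, `evalHoisted` and `g_basisCalls` are hypothetical and only stand in for the real code; `std::pow` is used in place of TMath::Power, and the binomial coefficients come from a simple recurrence rather than TMath::Binomial. It is meant to show the shape of the change, not the actual implementation.

```cpp
#include <array>
#include <cmath>

constexpr int kDegree = 20;          // n == m == 20, as in the description
constexpr int kSize = kDegree + 1;   // 21 basis values per variable
static long g_basisCalls = 0;        // counts basis evaluations (illustration only)

// Row C(20, 0) ... C(20, 20); hypothetical stand-in for TMath::Binomial.
std::array<double, kSize> binomialRow() {
  std::array<double, kSize> row{};
  row[0] = 1.0;
  for (int i = 1; i <= kDegree; ++i) {
    row[i] = row[i - 1] * (kDegree - i + 1) / i;
  }
  return row;
}

// Hypothetical stand-in for bernstein_grundpolynom; same factor order as in the description.
double basis(double t, int i, const std::array<double, kSize>& binom) {
  ++g_basisCalls;
  return binom[i] * std::pow(t, i) * std::pow(1. - t, kDegree - i);
}

// Before: both basis values are recomputed for every (i, j) pair.
double evalNaive(double u, double v, const std::array<double, kSize * kSize>& P) {
  const auto binom = binomialRow();
  double r = 0.0;
  for (int i = 0; i <= kDegree; ++i) {
    for (int j = 0; j <= kDegree; ++j) {
      const int k = (i * kSize) + j;
      if (P[k] < -998.9) continue;
      r += basis(u, i, binom) * basis(v, j, binom) * P[k];
    }
  }
  return r;
}

// After: 21 + 21 basis values are computed once, then re-used in the double loop.
double evalHoisted(double u, double v, const std::array<double, kSize * kSize>& P) {
  const auto binom = binomialRow();
  std::array<double, kSize> basisU{};
  std::array<double, kSize> basisV{};
  for (int i = 0; i <= kDegree; ++i) {
    basisU[i] = basis(u, i, binom);
    basisV[i] = basis(v, i, binom);
  }
  double r = 0.0;
  for (int i = 0; i <= kDegree; ++i) {
    for (int j = 0; j <= kDegree; ++j) {
      const int k = (i * kSize) + j;
      if (P[k] < -998.9) continue;
      // Same operand order as before: (U_i * V_j) * P[k], then accumulate into r.
      r += basisU[i] * basisV[j] * P[k];
    }
  }
  return r;
}
```

With no entries of P skipped, `evalNaive` evaluates the basis 2x21x21 = 882 times and `evalHoisted` 42 times, while the products and the accumulation into r happen in the same order in both.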
Notice these bits:
r += bernstein_grundpolynom_i * bernstein_grundpolynom_j * P[k];
return s_binomialCache[i] * TMath::Power(*t, i) * TMath::Power(1. - *t, N - i);
These had to stay in this order. We also stick to TMath::Binomial, so we can constexpr some bits. We also stick to TMath::Power, which I think ends up wrapping std::pow(double, double) or so. Changing these will probably be done another time, one at a time, as it might change the binary output.
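As a rough sketch of the "cache 21 doubles and re-use" idea from the first bullet, the coefficients can be filled once into a static array and then only looked up. Everything below is an assumption for illustration: `binomialStandIn` is a hypothetical replacement for TMath::Binomial so the snippet builds without ROOT (it is not guaranteed to be bit-identical to ROOT's result), and `binomialCache` is a hypothetical accessor, not the s_binomialCache from the actual change.

```cpp
#include <array>

// Hypothetical stand-in for TMath::Binomial(n, k), using the multiplicative formula.
inline double binomialStandIn(int n, int k) {
  if (k < 0 || k > n) return 0.0;
  double result = 1.0;
  for (int step = 1; step <= k; ++step) {
    result = result * (n - k + step) / step;
  }
  return result;
}

// Compute the N+1 coefficients C(N, 0) ... C(N, N).
template <int N>
std::array<double, N + 1> makeBinomialCache() {
  std::array<double, N + 1> cache{};
  for (int i = 0; i <= N; ++i) {
    cache[i] = binomialStandIn(N, i);
  }
  return cache;
}

// Function-local static: filled on the first call, re-used on every later call.
template <int N>
const std::array<double, N + 1>& binomialCache() {
  static const std::array<double, N + 1> cache = makeBinomialCache<N>();
  return cache;
}
```

A caller would then read `binomialCache<20>()[i]` instead of asking for the coefficient again on every evaluation.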
In short, more or less, here we just reduce the number of times we calculate certain terms.
But we keep their values the same, and the order in which they are added/multiplied the same, to get exactly the same binary outputs.
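For anyone wondering why the order matters so much: floating-point addition is not associative, so regrouping the same terms can flip the last bit of the result. A small standalone illustration (the helper names `bitsOf`, `sumLeftToRight` and `sumRightToLeft` are made up for this example, and it assumes default IEEE behaviour, i.e. no -ffast-math):

```cpp
#include <cstdint>
#include <cstring>

// Bit pattern of a double, for an exact ("binary") comparison.
inline std::uint64_t bitsOf(double x) {
  std::uint64_t bits = 0;
  std::memcpy(&bits, &x, sizeof(bits));
  return bits;
}

// Same three terms, two different groupings.
inline double sumLeftToRight(double a, double b, double c) { return (a + b) + c; }
inline double sumRightToLeft(double a, double b, double c) { return a + (b + c); }
```

For the inputs 0.1, 0.2, 0.3 the two groupings give doubles that differ in the last bit, which is exactly the kind of difference that would show up as a changed binary output.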