Draft: Try to vectorize newCovariance matrix. The one that should be faster
This is similar to !36911 (merged).
In principle, the new method we introduced in RK Utils should be faster, since it makes use of more vectorized instructions.
This will certainly violate Tier 0: we do not preserve the order of operations, so the last digits of the results will change. In principle one could unify this with the other MR, but then we would change the order of operations even more.
This is more or less what happened in !36911 (merged).
If we want to allow a CPU improvement of x% and can live with the same math but a different order of operations, then this is in principle "fair game".
A cohort of other optimisations then also becomes "fair game", on top of the data-structure and memory optimisation ones.
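To illustrate why reordering alone changes digits, here is a minimal sketch (not taken from the code base; the function names are hypothetical). It assumes plain IEEE 754 double arithmetic without `-ffast-math`. Both functions compute the same mathematical sum, but with a different grouping of the additions:

```cpp
// Hypothetical helpers, for illustration only.

// Sum grouped as (a + b) + c.
double sumGroupedLeft(double a, double b, double c) { return (a + b) + c; }

// Same math, grouped as a + (b + c).
double sumGroupedRight(double a, double b, double c) { return a + (b + c); }
```

With the inputs 0.1, 0.2 and 0.3 the two results differ in the last digit, because each intermediate sum is rounded separately. A vectorized version of the covariance code regroups its sums in the same way, which is why a bit-exact comparison is expected to fail.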
To understand what we are trying to do, consider the current code:
double a11 = (Jac[ 0]*V(0, 0)+Jac[ 1]*V(0, 1)+Jac[ 2]*V(0, 2))+(Jac[ 3]*V(0, 3)+Jac[ 4]*V(0, 4));
double a12 = (Jac[ 0]*V(0, 1)+Jac[ 1]*V(1, 1)+Jac[ 2]*V(1, 2))+(Jac[ 3]*V(1, 3)+Jac[ 4]*V(1, 4));
double a13 = (Jac[ 0]*V(0, 2)+Jac[ 1]*V(1, 2)+Jac[ 2]*V(2, 2))+(Jac[ 3]*V(2, 3)+Jac[ 4]*V(2, 4));
double a14 = (Jac[ 0]*V(0, 3)+Jac[ 1]*V(1, 3)+Jac[ 2]*V(2, 3))+(Jac[ 3]*V(3, 3)+Jac[ 4]*V(3, 4));
double a15 = (Jac[ 0]*V(0, 4)+Jac[ 1]*V(1, 4)+Jac[ 2]*V(2, 4))+(Jac[ 3]*V(3, 4)+Jac[ 4]*V(4, 4));
rv.fillSymmetric(0, 0, (a11*Jac[ 0]+a12*Jac[ 1]+a13*Jac[ 2])+(a14*Jac[ 3]+a15*Jac[ 4]));
Since V is symmetric, we can rewrite this as
double a11 = (Jac[ 0]*V(0, 0)+Jac[ 1]*V(1, 0)+Jac[ 2]*V(2, 0))+(Jac[ 3]*V(3, 0)+Jac[ 4]*V(4, 0));
double a12 = (Jac[ 0]*V(0, 1)+Jac[ 1]*V(1, 1)+Jac[ 2]*V(2, 1))+(Jac[ 3]*V(3, 1)+Jac[ 4]*V(4, 1));
double a13 = (Jac[ 0]*V(0, 2)+Jac[ 1]*V(1, 2)+Jac[ 2]*V(2, 2))+(Jac[ 3]*V(3, 2)+Jac[ 4]*V(4, 2));
double a14 = (Jac[ 0]*V(0, 3)+Jac[ 1]*V(1, 3)+Jac[ 2]*V(2, 3))+(Jac[ 3]*V(3, 3)+Jac[ 4]*V(4, 3));
double a15 = (Jac[ 0]*V(0, 4)+Jac[ 1]*V(1, 4)+Jac[ 2]*V(2, 4))+(Jac[ 3]*V(3, 4)+Jac[ 4]*V(4, 4));
rv.fillSymmetric(0, 0, (a11*Jac[ 0]+a12*Jac[ 1]+a13*Jac[ 2])+(a14*Jac[ 3]+a15*Jac[ 4]));
This is then the multiplication of a 1x5 row vector by a 5x5 matrix, giving a 1x5 vector, which we then dot with the initial 1x5 vector.
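The structure described above can be sketched with plain loops. This is only an illustration under stated assumptions: `Vec5`, `Mat5`, `rowTimesMatrix`, `dot` and `similarityElement` are hypothetical names, and V is stored here as a full row-major 5x5 array rather than in the packed symmetric type used by the real code. The loops also sum in a different order than the hand-grouped expressions, so the result can differ in the last digits.

```cpp
#include <array>
#include <cstddef>

// Hypothetical types for this sketch: a 1x5 row and a full 5x5 matrix.
using Vec5 = std::array<double, 5>;
using Mat5 = std::array<Vec5, 5>;  // row-major, symmetric by assumption

// a = j * V: a 1x5 row vector times a 5x5 matrix, giving a 1x5 row vector.
// The inner loop walks contiguous elements of one row of V, an access
// pattern that compilers can typically auto-vectorize.
Vec5 rowTimesMatrix(const Vec5& j, const Mat5& V) {
  Vec5 a{};  // zero-initialised accumulator
  for (std::size_t k = 0; k < 5; ++k) {
    for (std::size_t c = 0; c < 5; ++c) {
      a[c] += j[k] * V[k][c];
    }
  }
  return a;
}

// Dot product of two 1x5 vectors.
double dot(const Vec5& a, const Vec5& b) {
  double sum = 0.0;
  for (std::size_t k = 0; k < 5; ++k) {
    sum += a[k] * b[k];
  }
  return sum;
}

// One element of the transformed covariance: (j * V) . j
double similarityElement(const Vec5& j, const Mat5& V) {
  return dot(rowTimesMatrix(j, V), j);
}
```

Here `rowTimesMatrix` plays the role of computing a11 to a15 in one go, and `similarityElement` corresponds to the value passed to `rv.fillSymmetric(0, 0, ...)`, with `j` holding Jac[0] to Jac[4].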