
Draft: Try to vectorize newCovariance matrix. The one that should be faster

Try to vectorize newCovariance matrix.

This is similar to !36911 (merged).

In principle, the new method we introduced in RK Utils should be faster, since it makes use of more vectorized instructions.

This will certainly violate Tier 0: we do not preserve the order of the floating-point operations, so the last digits of the results will change. In principle one could unify this with the other MR, but then we would change the order of operations even more.

This is more or less what happened in !36911 (merged).

If we want to allow a CPU improvement of x% and can live with the same math in a different order of operations, then this is in principle "fair game". A cohort of other changes also becomes "fair game", on top of the data structure and memory optimisation ones.
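The digit changes discussed above come from floating-point addition not being associative. The following is a minimal sketch, not code from this MR; the function names are hypothetical and chosen only for illustration. It shows the same three numbers summed in two different groupings.

```cpp
// Illustrative only: neither function exists in the MR.
// Both compute a + b + c, but with a different grouping of the additions.
inline double sumGroupedLeft(double a, double b, double c) {
  return (a + b) + c;
}

inline double sumGroupedRight(double a, double b, double c) {
  return a + (b + c);
}
```

With IEEE 754 doubles, calling both with `1e16, -1e16, 1.0` gives different results, because `-1e16 + 1.0` is not exactly representable and gets rounded. In the covariance code the effect is far smaller, typically limited to the last digits, but it is enough to break bit-for-bit reproducibility.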

To understand what we are trying to do, consider the current code:

  double a11 = (Jac[ 0]*V(0, 0)+Jac[ 1]*V(0, 1)+Jac[ 2]*V(0, 2))+(Jac[ 3]*V(0, 3)+Jac[ 4]*V(0, 4));
  double a12 = (Jac[ 0]*V(0, 1)+Jac[ 1]*V(1, 1)+Jac[ 2]*V(1, 2))+(Jac[ 3]*V(1, 3)+Jac[ 4]*V(1, 4));
  double a13 = (Jac[ 0]*V(0, 2)+Jac[ 1]*V(1, 2)+Jac[ 2]*V(2, 2))+(Jac[ 3]*V(2, 3)+Jac[ 4]*V(2, 4));
  double a14 = (Jac[ 0]*V(0, 3)+Jac[ 1]*V(1, 3)+Jac[ 2]*V(2, 3))+(Jac[ 3]*V(3, 3)+Jac[ 4]*V(3, 4));
  double a15 = (Jac[ 0]*V(0, 4)+Jac[ 1]*V(1, 4)+Jac[ 2]*V(2, 4))+(Jac[ 3]*V(3, 4)+Jac[ 4]*V(4, 4));

  rv.fillSymmetric(0, 0, (a11*Jac[ 0]+a12*Jac[ 1]+a13*Jac[ 2])+(a14*Jac[ 3]+a15*Jac[ 4]));

Since V is symmetric, we can rewrite this as

  double a11 = (Jac[ 0]*V(0, 0)+Jac[ 1]*V(1, 0)+Jac[ 2]*V(2, 0))+(Jac[ 3]*V(3, 0)+Jac[ 4]*V(4, 0));
  double a12 = (Jac[ 0]*V(0, 1)+Jac[ 1]*V(1, 1)+Jac[ 2]*V(2, 1))+(Jac[ 3]*V(3, 1)+Jac[ 4]*V(4, 1));
  double a13 = (Jac[ 0]*V(0, 2)+Jac[ 1]*V(1, 2)+Jac[ 2]*V(2, 2))+(Jac[ 3]*V(3, 2)+Jac[ 4]*V(4, 2));
  double a14 = (Jac[ 0]*V(0, 3)+Jac[ 1]*V(1, 3)+Jac[ 2]*V(2, 3))+(Jac[ 3]*V(3, 3)+Jac[ 4]*V(4, 3));
  double a15 = (Jac[ 0]*V(0, 4)+Jac[ 1]*V(1, 4)+Jac[ 2]*V(2, 4))+(Jac[ 3]*V(3, 4)+Jac[ 4]*V(4, 4));

  rv.fillSymmetric(0, 0, (a11*Jac[ 0]+a12*Jac[ 1]+a13*Jac[ 2])+(a14*Jac[ 3]+a15*Jac[ 4]));

This is then a multiplication of a 1x5 row vector by a 5x5 matrix, giving a 1x5 row vector, which we then dot with the initial 1x5 vector.
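The vector-matrix product followed by a dot product can be sketched as plain loops. This is only an illustration under stated assumptions, not the implementation in this MR: `Row5`, `Mat5`, `rowTimesMatrix`, `dot` and `covarianceElement00` are hypothetical names, and the covariance is assumed to be stored as a dense row-major 5x5 array rather than in the actual matrix type used for `V`.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins: one 1x5 row of the Jacobian and a dense 5x5 covariance.
using Row5 = std::array<double, 5>;
using Mat5 = std::array<Row5, 5>;

// a = J * V, accumulated as a[j] += J[i] * V[i][j].
// The inner loop reads one row of V contiguously and has no dependency
// between different j, which is the shape compilers can usually vectorize.
inline Row5 rowTimesMatrix(const Row5& J, const Mat5& V) {
  Row5 a{};  // zero-initialised accumulator
  for (std::size_t i = 0; i < 5; ++i) {
    for (std::size_t j = 0; j < 5; ++j) {
      a[j] += J[i] * V[i][j];
    }
  }
  return a;
}

// Plain dot product of two 1x5 vectors.
inline double dot(const Row5& a, const Row5& b) {
  double sum = 0.0;
  for (std::size_t i = 0; i < 5; ++i) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Element (0, 0) of J * V * J^T: (J * V) dotted with J.
inline double covarianceElement00(const Row5& J, const Mat5& V) {
  return dot(rowTimesMatrix(J, V), J);
}
```

Note that the additions here are accumulated in a different order than in the hand-grouped expressions above, so the result can differ in the last digits, which is exactly the Tier 0 concern raised earlier.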

Edited by Christos Anastopoulos
