Skip to content
Snippets Groups Projects
Code owners
Assign users and groups as approvers for specific file changes. Learn more.

PintS

Portable intrinsics for SIMD

Doxygen documentation: http://lemaitre-thesis.github.io/pints/

What is it?

PintS is an abstraction library to write SIMD code. It is fully compatible with both C and C++ and it is a header only library.

The main goal is to provide a library to have the same code on multiple SIMD architecture or even without SIMD.

How does it work?

The library consists in definitions of the functions for non SIMD architectures, and specialization of some of them for SIMD architectures. These specializations are done by undefining the previous macro (which can be an alias to a function) and redefining it for the target architecture.

The library is able to detect the target architecture using information given by the compiler and loads automatically the required header files.

How does it look?

float dotprod_v8f(const float* a, const float* b, int n) {
  // Local var declarations
  int i;
  v8f va, vb, s;
  m8f vmask;

  // initialization
  s = v8f_zero();

  // main loop
  for (i = 0; i < n - 8; i += 8) {
    va = v8f_loadu(&a[i]);
    vb = v8f_loadu(&b[i]);
    s = v8f_fmadd(va, vb, s);

  } /* remainder */ {
    vmask = m8f_cvt_m8i(m8i_gt(v8i_set1(n-i), v8i_set(0, 1, 2, 3, 4, 5, 6, 7)));
    va = v8f_loadu(&a[i]);
    vb = v8f_loadu(&b[i]);
    va = v8f_maskz_move(vmask, va);
    s = v8f_fmadd(va, vb, s);
  }

  // reduce s
  return sf_reduce_add_v8f(s);
}

How can I use it in my project?

You need to copy the folder include of this repository into your project and include the file pints.h wherever you want to use it.

Supported compilers

This library has only been tested on the following compilers:

  • gcc (5.1)
  • icc (16.0)

However, this library is written with compatibility in mind and should also be compatible with other compilers

Supported architectures

For now, the supported architectures are:

  • SSE
  • SSE2
  • SSE3
  • SSSE3
  • SSE4.1
  • SSE4.2
  • AVX
  • AVX2
  • FMA (AVX)

It is planned to also support in the future:

  • KNC
  • AVX512 (all variants)
  • Neon

Future features

It is planned to add the following features to the library:

  • Reductions (eg: full horizontal sum)
  • Transpositions (of vectors)
  • Masks primitives (for AVX512 and KNC support)
  • Shuffles with vector of indices
  • per instruction Rounding method? (only when supported)

Disclaimer

This project is still in development and might not be stable enough for a production use.