[VECCORE-20] Implement AlignedAlloc() using _mm_malloc()
The C++ implementation of posix_memalign() can throw exceptions, which unfortunately prevents some optimizations by the Intel C/C++ compiler, especially auto-vectorization.
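For reference, a minimal sketch of the approach, not the exact code in this change. The wrapper names, parameter order, and the use of noexcept are assumptions for illustration; only _mm_malloc() and _mm_free() are real intrinsics (x86 only).

```cpp
#include <cstddef>
#include <xmmintrin.h> // declares _mm_malloc() and _mm_free() on GCC, Clang, and ICC (x86 only)

// Hypothetical wrapper: the actual VecCore signature may differ.
// noexcept is an assumption here, used to state the no-throw intent explicitly.
inline void *AlignedAlloc(std::size_t alignment, std::size_t size) noexcept
{
  // _mm_malloc() takes (size, alignment), the reverse of this wrapper's order.
  // The alignment must be a power of two.
  return _mm_malloc(size, alignment);
}

// Hypothetical counterpart: memory obtained from _mm_malloc() must be
// released with _mm_free(), never with free() or delete.
inline void AlignedFree(void *ptr) noexcept
{
  _mm_free(ptr);
}
```

With these wrappers, AlignedAlloc(64, 1024) would request 1024 bytes aligned to a 64-byte boundary, to be released later with AlignedFree().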
@swenzel, please test on Mac OS X, as I cannot currently do that and the Jenkins build only runs on Linux. This works for me with all compilers I tried (GCC 4, GCC 5, Clang 3.8, and ICC 16.0.2).
Closes VECCORE-20.