GitLab CI integration and regular performance benchmarks
A build and test run is triggered in GitLab CI on every commit to master or to any other branch that is explicitly listed in .gitlab-ci.yml.
Also adds two new options to CMake:
OVERIDE_ARCH_FLAG: Overrides the architecture flag with any other compiler flags. Currently used to build explicitly for architectures with and without tensor units.
GITLAB_FAST_COMPILE: If set to "YES", the C++ part of the build is compiled with -march=ivybridge. The resulting executable can then run on a range of platforms instead of being limited to the CPU of the build machine. This is currently used to build Allen only once and then distribute it through the CI integration to the worker nodes with the various GPUs, where the performance tests run.
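As a rough illustration of how the two options might be passed at configure time, the sketch below assembles a cmake command line in Python. The helper name `cmake_configure_args`, the source directory `..`, and the example flag value `-arch=sm_75` are assumptions made for illustration; they are not taken from the actual .gitlab-ci.yml.

```python
def cmake_configure_args(source_dir, fast_compile=False, arch_flag=None):
    """Build the argument list for a cmake configure step (hypothetical helper)."""
    args = ["cmake"]
    if fast_compile:
        # Option described above: compile the C++ part with -march=ivybridge
        args.append("-DGITLAB_FAST_COMPILE=YES")
    if arch_flag is not None:
        # Option described above; the value is passed through unchanged
        args.append(f"-DOVERIDE_ARCH_FLAG={arch_flag}")
    args.append(source_dir)
    return args


# Example values are placeholders, not the flags used by the CI jobs.
args = cmake_configure_args("..", fast_compile=True, arch_flag="-arch=sm_75")
print(" ".join(args))
```

Keeping the two options independent mirrors the description above: one controls the host architecture, the other the device architecture flags.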
Also adds a set of small Python programs that analyze the performance of the current build and post the results to Mattermost and Grafana.
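A minimal sketch of what the Mattermost side of such a program could look like, assuming the results arrive as a mapping from device name to throughput. The function names, the input shape, and the device names and numbers in the example are all hypothetical placeholders. The only external fact relied on is that a Mattermost incoming webhook accepts a JSON body with a `text` field, which may contain a Markdown table. The sketch only builds the payload and does not send it.

```python
import json


def format_throughput_message(results):
    """Render results (device name -> events/s, assumed shape) as a Markdown table."""
    lines = ["| Device | Throughput (events/s) |", "| --- | --- |"]
    for device, rate in sorted(results.items()):
        lines.append(f"| {device} | {rate:.1f} |")
    return "\n".join(lines)


def build_webhook_payload(results):
    """Wrap the table in the JSON body expected by an incoming webhook."""
    return json.dumps({"text": format_throughput_message(results)})


# Placeholder data, not measured results.
example = {"device-b": 250.0, "device-a": 100.0}
payload = build_webhook_payload(example)
print(payload)
```

Separating the formatting from the posting step keeps the message layout testable without network access.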