Investigate Python library installation for container optimization
@jblomer has done a very nice analysis of the CVMFS load times of the containers, which can be summarized as follows:
- CVMFS saves around a factor of 10 in network traffic and worker node cache size.
- Comparing cold cache to cold cache, starting from CVMFS is faster than downloading and starting the Singularity container when CVMFS is served from our test servers; from the CERN squids / stratum 1, CVMFS is significantly slower.
- Comparing warm caches to warm caches, starting from CVMFS is significantly slower than starting from a local Docker container, although probably still within an acceptable range (<10 s).
He also makes an interesting point:

> For the deployment of the ML containers, I wonder if tensorflow can be distributed as a Python egg? That would collapse the several thousand files into one larger BLOB. While we usually argue for splitting BLOBs into smaller files, in this particular, quite stressful case I expect a bundle file to significantly speed up the import time.
This should be investigated.
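As a rough illustration of the idea (a sketch, not jblomer's proposal verbatim): the mechanism an egg relies on is Python's ability to import from a single zip archive placed on `sys.path`, so the thousands of small file opens against CVMFS collapse into one read of a large BLOB. The bundle path and package name below are hypothetical, and packages with compiled extensions such as tensorflow generally need their shared objects extracted before they can be loaded, so eggs/zipapps alone may not be the full answer.

```python
import importlib
import os
import sys
import zipfile

# Hypothetical bundle location on CVMFS (placeholder path).
BUNDLE = "/cvmfs/sw.example.org/bundles/mypkg.zip"


def build_bundle(package_dir: str, bundle_path: str) -> None:
    """Collapse a package's many small files into a single zip blob."""
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the parent of package_dir so that
                # "import mypkg" works once the zip is on sys.path.
                arcname = os.path.relpath(full, os.path.dirname(package_dir))
                zf.write(full, arcname)


# At import time a single open() of the bundle replaces thousands of
# stat()/open() calls against CVMFS (pure-Python code only).
sys.path.insert(0, BUNDLE)
mypkg = importlib.import_module("mypkg")  # served from one large BLOB
```

Whether this actually speeds up a cold-cache tensorflow import on CVMFS, and how to handle the compiled extensions, is exactly what needs to be measured.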
This issue is complementary to Issue #37.