Hashtable benchmarks
This project contains a set of benchmarks for some hashtable implementations. It reviews their speed and memory usage. The code was written to review the performance of NVRAM versus RAM in the context of EOS at the IT-DSS section of CERN openlab.
Building the code
To build the code, create a directory build and run cmake from there, i.e.:
mkdir build
cd build
cmake ../
make
Generating a log file
One can generate a (random) log file using the genlogfile tool. This tool needs a configuration file. The header of this configuration file associates some keys to values. Possible keys are:
- key_type: specifies the type of the hashtable keys. Can be either longlong (unsigned 64-bit integer) or string.
- key_size: if key_type is string, this can be used to specify the size in bytes of the keys used.
- value_size: specifies the size of values in the hashtable, in bytes.
- entries: the maximum number of entries in the hashtable.
- alphanumeric: set to true when the values (and keys, when applicable) should be alphanumeric instead of random bytes. The default value is false.
The header is followed by an empty line, after which the instructions for load generation follow. Each instruction consists of an operation name and the number of times that operation is to be executed. Possible operations are:
- set: Adds a random key/value pair to the hashtable. Note that the key may already exist in the hashtable.
- get: Retrieves a value from the hashtable, using a randomly selected key that is already in the hashtable.
- get-random: Looks up a randomly generated key in the hashtable. Chances are this key does not exist.
- delete and delete-random: Like get and get-random respectively, but delete the key.
- iterate: Iterates over all entries in the hashtable (in arbitrary order).
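The difference between the plain and the -random variants is easiest to see in code. The following is a minimal sketch of the operations for longlong keys, modelled on std::unordered_map. It is an illustration under stated assumptions, not the code of genlogfile or of the benchmarks: the Workload type, its member names, and the fixed seed are all hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical model of the log operations for longlong keys.
struct Workload {
    std::unordered_map<std::uint64_t, std::string> table;
    std::vector<std::uint64_t> live_keys;  // keys currently stored in `table`
    std::mt19937_64 rng{42};               // fixed seed, so runs are repeatable

    // set: store a value under a random key; the key may already exist.
    void set(std::string value) {
        const std::uint64_t key = rng();
        if (table.find(key) == table.end()) live_keys.push_back(key);
        table[key] = std::move(value);
    }

    // get: look up a key picked from those already stored (always a hit).
    bool get() {
        if (live_keys.empty()) return false;
        return table.count(live_keys[random_index()]) == 1;
    }

    // get-random: look up a freshly generated key (most likely a miss).
    bool get_random() { return table.count(rng()) == 1; }

    // delete: remove a key picked from those already stored.
    bool del() {
        if (live_keys.empty()) return false;
        return erase_key(live_keys[random_index()]);
    }

    // delete-random: try to remove a freshly generated key.
    bool del_random() { return erase_key(rng()); }

    // iterate: visit every entry once, in whatever order the table yields.
    std::size_t iterate() const {
        std::size_t visited = 0;
        for (const auto& entry : table) {
            (void)entry;
            ++visited;
        }
        return visited;
    }

private:
    // Uniformly random index into live_keys (which must not be empty).
    std::size_t random_index() {
        std::uniform_int_distribution<std::size_t> pick(0, live_keys.size() - 1);
        return pick(rng);
    }

    // Removes `key` from the table and from live_keys; false if it was absent.
    bool erase_key(std::uint64_t key) {
        if (table.erase(key) == 0) return false;
        live_keys.erase(std::find(live_keys.begin(), live_keys.end(), key));
        return true;
    }
};
```

Keeping a separate list of live keys is one simple way to pick an existing key at random; it costs a linear search on deletion, which is acceptable for a sketch but not for a real load generator.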
An example configuration file is:
key_type longlong
value_size 1000
entries 1000
alphanumeric true

set 50
get 10
delete 20
iterate 2
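The format described above (a key/value header, an empty line, then operation/count pairs) is simple enough to read line by line. The following is a minimal sketch of such a reader, not the parser used by genlogfile; the names LogConfig and parse_config are hypothetical, and no validation of keys or operation names is attempted.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory form of a genlogfile configuration file.
struct LogConfig {
    std::map<std::string, std::string> header;                    // e.g. "key_type" -> "longlong"
    std::vector<std::pair<std::string, std::size_t>> operations;  // e.g. {"set", 50}
};

// Reads "key value" header lines up to the first empty line, then
// "operation count" lines until the end of the stream.
// Lines that do not split into two fields are silently skipped.
LogConfig parse_config(std::istream& in) {
    LogConfig config;
    std::string line;
    bool in_header = true;
    while (std::getline(in, line)) {
        if (line.empty()) {
            in_header = false;  // the first empty line ends the header
            continue;
        }
        std::istringstream fields(line);
        if (in_header) {
            std::string key, value;
            if (fields >> key >> value) config.header[key] = value;
        } else {
            std::string operation;
            std::size_t count = 0;
            if (fields >> operation >> count) {
                config.operations.emplace_back(operation, count);
            }
        }
    }
    return config;
}
```

Operations are kept in a vector rather than a map because their order in the file matters and the same operation may appear more than once.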
To generate a log file, specify the configuration file and the filename of the log file to be written:
./genlogfile log.conf log.bin
Benchmarking
To run a benchmark, invoke one of benchmark-string or benchmark-longlong (depending on the key type) with a log file and the interface to use, for example:
./benchmark-string log.bin ./PersistentHashtableInterface.so
Note that the ./ part is necessary for the executable to be able to find the library. Currently, the following interfaces are available:
- StdMapInterface.so uses std::map from the STL
- GoogleDenseMapInterface.so uses google::dense_hash_map from Sparsehash (if the library is found by CMake)
- GoogleSparseMapInterface.so uses google::sparse_hash_map from Sparsehash (if the library is found by CMake)
- PersistentHashtableInterface.so uses a C++ wrapper around a pure C hashtable, found in the PersistentHashtable directory.
A benchmark prints the user and kernel time spent executing the directives from the log file, along with the memory consumed while doing so.
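On POSIX systems, user time, kernel time and peak memory can all be read with getrusage. The sketch below shows one way to take such a measurement around a workload. It is an assumption about the general approach, not necessarily how the benchmark programs collect their numbers; the names Usage, to_seconds and measure are hypothetical.

```cpp
#include <sys/resource.h>
#include <sys/time.h>

#include <functional>

// Hypothetical result type: CPU times in seconds, plus the peak resident set
// size as reported by the kernel (kilobytes on Linux, bytes on macOS).
struct Usage {
    double user_seconds;
    double kernel_seconds;
    long max_rss;
};

// Converts a timeval (seconds + microseconds) to fractional seconds.
static double to_seconds(const struct timeval& tv) {
    return static_cast<double>(tv.tv_sec) + static_cast<double>(tv.tv_usec) / 1e6;
}

// Runs `workload` and returns the CPU time it consumed. Note that ru_maxrss
// is a high-water mark for the whole process so far, not a per-workload figure.
Usage measure(const std::function<void()>& workload) {
    struct rusage before{};
    struct rusage after{};
    getrusage(RUSAGE_SELF, &before);
    workload();
    getrusage(RUSAGE_SELF, &after);
    return Usage{
        to_seconds(after.ru_utime) - to_seconds(before.ru_utime),
        to_seconds(after.ru_stime) - to_seconds(before.ru_stime),
        after.ru_maxrss,
    };
}
```

A caller would wrap the replay of the log file in the workload function, for example measure([&] { /* execute directives */ }), and print the three fields afterwards.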
Optionally, a third argument can be provided, like so:
./benchmark-string log.bin ./PersistentHashtableInterface.so fingerprint.bin
This will write a CRC32 fingerprint of the hashtable after each operation to fingerprint.bin. Currently, this is only implemented for PersistentHashtableInterface, as it is meant to be used when validating memory sanity after a sudden crash. Be aware that the fingerprinting will notably slow down the operations.
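To illustrate what a fingerprint of this kind involves, the sketch below computes a standard CRC-32 (reflected polynomial 0xEDB88320, the variant used by zlib) over all entries of a std::map. This is for illustration only: the text above does not say which CRC-32 variant or traversal order PersistentHashtableInterface uses, and the names crc32_update and fingerprint are hypothetical. In this sketch every fingerprint traverses the whole table, so taking one after each operation costs time proportional to the table size.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Feeds `size` bytes into a running CRC-32 (reflected polynomial 0xEDB88320).
// Start with crc = 0 and chain calls to cover several buffers.
std::uint32_t crc32_update(std::uint32_t crc, const void* data, std::size_t size) {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    crc = ~crc;
    for (std::size_t i = 0; i < size; ++i) {
        crc ^= bytes[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Hypothetical fingerprint: CRC-32 over every key and value in the map's
// iteration order. std::map iterates in sorted key order, so equal contents
// always give an equal fingerprint. Keys are hashed in host byte order, so
// fingerprints are only comparable between machines of the same endianness.
std::uint32_t fingerprint(const std::map<std::uint64_t, std::string>& table) {
    std::uint32_t crc = 0;
    for (const auto& entry : table) {
        crc = crc32_update(crc, &entry.first, sizeof entry.first);
        crc = crc32_update(crc, entry.second.data(), entry.second.size());
    }
    return crc;
}
```

Comparing the fingerprint recorded before a crash with one recomputed after recovery shows whether the table contents survived intact, which is the use case described above.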