This MR adds three things:
1: Multiple test files for evaluation can now be defined via a wildcard.
2: The option evaluate_trained_tagger, which is already available, is now in the DL1r config yaml.
3: Documentation for evaluating without a newly trained tagger.
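
For reviewers unfamiliar with wildcard expansion, item 1 can be pictured roughly as follows. This is only a minimal sketch using Python's standard `glob` module, not the code in this MR: the helper name `expand_test_files` and the file names are hypothetical and chosen purely for illustration.

```python
import glob
import os
import tempfile


def expand_test_files(pattern):
    """Return the sorted list of files matching a wildcard pattern.

    Hypothetical helper for illustration; not the implementation in this MR.
    """
    files = sorted(glob.glob(pattern))
    if not files:
        # Fail early so a typo in the pattern does not silently skip evaluation.
        raise FileNotFoundError(f"No test files match pattern: {pattern}")
    return files


# Small demo with temporary, made-up file names.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("test_ttbar_0.h5", "test_ttbar_1.h5", "train_ttbar.h5"):
        open(os.path.join(tmp_dir, name), "w").close()

    matched = expand_test_files(os.path.join(tmp_dir, "test_ttbar_*.h5"))
    matched_names = [os.path.basename(path) for path in matched]
    print(matched_names)
```

Sorting the matches keeps the evaluation order reproducible, since `glob.glob` does not guarantee any particular ordering.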