Update all example eval configs
Our default is now to evaluate on the valset (not the testset, which should be withheld until publication). So the configurations should now evaluate on a datasplit of [0, N, 0], rather than the current [0, 0, N].
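
A minimal sketch of the change for a single config, assuming the datasplit is ordered [train, val, test] (inferred from the valset/testset wording above) and stored under a `datasplit` key. The key name, the `to_valset_split` helper, and the example value of N are hypothetical and not taken from the actual configs.

```python
def to_valset_split(datasplit):
    """Move an eval-only test split [0, 0, N] onto the validation split [0, N, 0].

    Assumes the order [train, val, test]. Any other split is returned unchanged.
    """
    train, val, test = datasplit
    if train == 0 and val == 0 and test > 0:
        return [0, test, 0]
    return list(datasplit)


# Hypothetical example eval config that still evaluates on the testset.
config = {"datasplit": [0, 0, 1000]}
config["datasplit"] = to_valset_split(config["datasplit"])
print(config["datasplit"])  # [0, 1000, 0]
```

Configs that already use the valset, or that include training data, pass through untouched, so the helper could be applied to every example config without checking each one first.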