Label Smoothing
The generalization and learning speed of a multi-class neural network can often be improved significantly by using soft targets that are a weighted average of the hard targets and the uniform distribution over labels. Smoothing the labels in this way prevents the network from becoming over-confident, and label smoothing has been used in many state-of-the-art models for tasks such as image classification, language translation and speech recognition.
For more information see here.
Technically, this should be implemented with a flag that sets the level of smoothing and specifies which labels it applies to.
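As a minimal sketch of the idea above (the function and parameter names here are illustrative, not from any particular library): with smoothing level `epsilon`, each hard one-hot target is blended with the uniform distribution over the classes.

```python
import numpy as np

def smooth_labels(one_hot, epsilon=0.1):
    """Blend hard one-hot targets with the uniform distribution.

    epsilon is the smoothing level: 0.0 keeps the hard targets
    unchanged; larger values move probability mass toward uniform.
    """
    num_classes = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / num_classes

# Example: three classes, hard target is class 1.
hard = np.array([0.0, 1.0, 0.0])
soft = smooth_labels(hard, epsilon=0.1)
print(soft)  # the smoothed target still sums to 1
```

A flag selecting which labels to smooth could then be a boolean mask passed alongside `epsilon`, applying the blend only to the selected targets.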
Edited by Steffen Korn