Stable Softmax
This change makes the softmax implementation more numerically stable by using the shift invariance of the softmax function: softmax(x) = softmax(x - c) for any constant c. This example illustrates that the current implementation can produce NaN when the inputs are large: https://godbolt.org/z/j94xe6xbn
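For reviewers who want to see the idea without opening the link, here is a minimal sketch of the usual way shift invariance is applied: subtract the maximum input before exponentiating, so that the largest exponent is exp(0) = 1. The function names `softmax_naive` and `softmax_stable` are hypothetical and used only for illustration. They are not the names in this codebase, and the actual diff may differ in details.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative only: direct evaluation of exp(x_i) / sum_j exp(x_j).
// For large inputs std::exp overflows to +inf, and inf / inf gives NaN.
std::vector<float> softmax_naive(const std::vector<float>& x) {
    std::vector<float> out(x.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = std::exp(x[i]);
        sum += out[i];
    }
    for (float& v : out) {
        v /= sum;
    }
    return out;
}

// Illustrative only: shift every input by the maximum before exponentiating.
// By shift invariance the result is mathematically unchanged, but every
// exponent is now <= 0, so std::exp cannot overflow and the sum is >= 1.
std::vector<float> softmax_stable(const std::vector<float>& x) {
    std::vector<float> out(x.size());
    if (x.empty()) {
        return out;
    }
    const float max_val = *std::max_element(x.begin(), x.end());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = std::exp(x[i] - max_val);
        sum += out[i];
    }
    for (float& v : out) {
        v /= sum;
    }
    return out;
}
```

With single-precision inputs such as {1000, 1000}, the naive version computes inf / inf, while the shifted version computes exp(0) / 2 for each element. The sketch assumes finite inputs; handling of infinities or NaN in the input is out of scope here.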
Edited by Vukan Jevtic