GN2v01 - tweaks for improved performance and training times
- A few tweaks for the ONNX export
- Move to pre-LN style instead of NormFormer -> should improve throughput
- Switch to ReLU -> should improve throughput
- Switch to a wider, shallower architecture -> should improve throughput
- Fix for the learning rate scheduler (LRS), and warm up more quickly
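
For illustration of the last item, here is a minimal sketch of a learning rate schedule with a configurable warmup length. It is not the scheduler used in this MR: the function name `lr_at_step`, its parameters, and the cosine decay after warmup are all assumptions chosen for the example. It only shows how shortening the warmup gets the learning rate to its peak earlier.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, peak_lr, final_lr=0.0):
    """Hypothetical schedule: linear warmup to peak_lr, then cosine decay to final_lr."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp: reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine decay from peak_lr (progress = 0) to final_lr (progress = 1).
    return final_lr + 0.5 * (peak_lr - final_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Compare a short and a long warmup at the same early training step.
    for warmup in (10, 100):
        print(warmup, lr_at_step(50, total_steps=1000, warmup_steps=warmup, peak_lr=1e-3))
```

With a shorter warmup the model trains at or near the peak learning rate for more of the run, which is one way a quicker warmup can shorten training time.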
Additionally, performance is improved, as observed by @npond. Many thanks for the tests.
Edited by Samuel Van Stroud