Fix NaNs due to input masking
- Restore input masking, which was broken in !161 (merged)
- Add a check that model inputs are finite after masking
- Add a check for non-finite norm params
- Modify the test inputs so that this bug would be caught if it happens again
- Also fix a bug in the dataloader that caused more labels to be loaded than necessary
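
The masking and finiteness checks listed above can be illustrated with a minimal sketch. It uses NumPy rather than the project's own code, and the names `mask_inputs`, `check_finite`, `pad_mask` and `fill_value` are hypothetical, chosen only for illustration. It assumes padded slots are flagged by a boolean mask and may hold NaN before masking.

```python
import numpy as np


def mask_inputs(x, pad_mask, fill_value=0.0):
    """Overwrite padded entries of x with fill_value (hypothetical helper).

    x:        array of shape (batch, n_items, n_features)
    pad_mask: boolean array of shape (batch, n_items), True where padded
    """
    # np.where selects fill_value for padded slots; multiplying by the mask
    # would not work here, because NaN * 0 is still NaN.
    return np.where(pad_mask[..., None], fill_value, x)


def check_finite(name, values):
    """Raise ValueError if values contains any NaN or inf (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    n_bad = int((~np.isfinite(values)).sum())
    if n_bad:
        raise ValueError(f"{name} contains {n_bad} non-finite value(s)")


# Test-style input: the padded slot is deliberately NaN, so a missing
# masking step makes the check fail instead of passing silently.
x = np.array([[[1.0, 2.0], [np.nan, np.nan]]])
pad_mask = np.array([[False, True]])

masked = mask_inputs(x, pad_mask)
check_finite("masked inputs", masked)      # passes: NaNs were overwritten
check_finite("norm params", [0.5, 1.25])   # illustrative values only
```

Filling padded test inputs with NaN rather than zeros is one way to make a regression visible: calling `check_finite` on the unmasked `x` raises a `ValueError`.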
Edited by Samuel Van Stroud