Support for feature-wise transformations
Implement an alternative mechanism for model conditioning through feature-wise transformations, as described in https://distill.pub/2018/feature-wise-transformations/. This extends the current parameterisation implementation, which only supports the standard concatenation-based approach, giving the user more flexibility and possibly better performance.
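For reference, the two conditioning schemes can be written side by side. The symbols here are notation chosen for this issue rather than names from the codebase: x is a feature vector, z is the conditioning (parameter) vector, and γ and β are learned functions of z.

```latex
% Concatenation-based conditioning: z only contributes a conditional bias.
h = W\,[x;\,z] + b = W_x x + \underbrace{W_z z + b}_{\text{bias depending on } z}

% Feature-wise transformation (FiLM): z scales and shifts each feature.
\operatorname{FiLM}(x \mid z) = \gamma(z) \odot x + \beta(z)
```

Setting γ(z) = 1 leaves only the conditional shift, which is the effect concatenation followed by a linear layer has, so the feature-wise form is the more general of the two.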
- Basic implementation with the option to add feature-wise transformations to the inputs / global track representations
- Implement the option to apply feature-wise transformations to each layer in the encoder, and investigate its performance
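A rough, dependency-free sketch of what the per-layer option could look like. Everything here is hypothetical and not taken from the codebase: the names `init_encoder`, `encode`, and `film_layers`, the use of plain linear layers with ReLU as a stand-in encoder, and the choice to predict the scale as `1 + delta` so that a zero generator output leaves the features unchanged.

```python
import random


def linear(weights, bias, vec):
    """Affine map weights @ vec + bias, with weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def film(features, gamma, beta):
    """Feature-wise affine transformation: gamma * x + beta, per feature."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def init_linear(rng, n_out, n_in):
    """Small random weights and zero bias (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def init_encoder(rng, in_dim, hidden_dim, cond_dim, n_layers):
    """Each layer has a main linear map plus gamma/beta generators fed by cond."""
    layers = []
    for i in range(n_layers):
        layers.append({
            "main": init_linear(rng, hidden_dim, in_dim if i == 0 else hidden_dim),
            "gamma": init_linear(rng, hidden_dim, cond_dim),
            "beta": init_linear(rng, hidden_dim, cond_dim),
        })
    return layers


def encode(layers, x, cond, film_layers=None):
    """Run the encoder; film_layers is a set of layer indices to modulate.

    None means every layer is modulated; an empty set disables FiLM.
    """
    h = x
    for i, layer in enumerate(layers):
        h = linear(*layer["main"], h)
        if film_layers is None or i in film_layers:
            # Assumption: predict the scale as 1 + delta (identity at zero output).
            gamma = [1.0 + d for d in linear(*layer["gamma"], cond)]
            beta = linear(*layer["beta"], cond)
            h = film(h, gamma, beta)
        h = relu(h)
    return h


if __name__ == "__main__":
    rng = random.Random(0)
    encoder = init_encoder(rng, in_dim=3, hidden_dim=4, cond_dim=2, n_layers=2)
    features = [0.1, -0.2, 0.3]
    print(encode(encoder, features, cond=[1.0, -1.0]))                   # all layers
    print(encode(encoder, features, cond=[1.0, -1.0], film_layers={0}))  # first layer only
```

Passing `film_layers={0}` roughly corresponds to the first task (modulating only the earliest representation), while the default covers the per-layer variant, which would make it easy to compare the two when investigating performance.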