Improve cross-attention and neutral input handling
Main changes:
- fixes to the cross-attention (CA) architecture
- configs for training GN2 with neutral and charged flows, using either self-attention (SA) or CA
- merge_dict, which merges several inputs after initialisation so that the auxiliary tasks can run on all of them simultaneously
- changes to cross-attention that make it more ONNX-friendly
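
To illustrate the merge_dict idea, here is a minimal sketch in plain Python. It assumes merge_dict maps a merged stream name to the ordered list of input names it absorbs, and that every input has the same embedding width after initialisation. The function name `merge_inputs`, the stream names (`charged`, `neutral`, `flow`) and the list-of-lists representation are hypothetical and chosen only for illustration; the actual schema and tensor handling in this change may differ.

```python
# Hypothetical sketch of merging inputs after initialisation.
# The real merge_dict schema and data types may differ from this.


def merge_inputs(embeddings, merge_dict):
    """Concatenate per-input constituent sequences into merged streams.

    embeddings: maps an input name to a list of per-constituent feature
        vectors (assumed to share one embedding width after initialisation).
    merge_dict: maps a merged stream name to the ordered input names
        that should be concatenated into it.
    """
    merged = dict(embeddings)  # keep the original streams available
    for merged_name, input_names in merge_dict.items():
        # Merging only makes sense if all vectors have the same width.
        widths = {len(vec) for name in input_names for vec in embeddings[name]}
        if len(widths) > 1:
            raise ValueError(
                f"cannot merge {input_names}: embedding widths differ ({sorted(widths)})"
            )
        # Concatenate along the constituent (sequence) axis, preserving order.
        merged[merged_name] = [
            vec for name in input_names for vec in embeddings[name]
        ]
    return merged


# Illustrative inputs: two charged constituents and one neutral constituent.
embeddings = {
    "charged": [[0.1, 0.2], [0.3, 0.4]],
    "neutral": [[0.5, 0.6]],
}
merge_dict = {"flow": ["charged", "neutral"]}

merged = merge_inputs(embeddings, merge_dict)
```

With a single merged stream, an auxiliary task can be attached once to `flow` instead of separately to each input.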
Edited by Ivan Oleksiyuk