Nicole Hartman authored
I think this attention model is wrong because it computes a different attention weight for each feature - pushing before editing
51613cc1
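For context, a minimal sketch of the concern in the commit message, not the code from this commit. It assumes attention is used to pool a small set of items, and every name and number below (`items`, `scores_per_feature`, `scores_shared`) is made up for illustration. The first variant takes a separate softmax over items for each feature, which is the suspected problem; the second uses one scalar score per item, so all features share the same weights.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical input: 3 items with 2 features each.
items = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
n_items, n_features = 3, 2

# Suspected pattern: one score per (item, feature), so the softmax over
# items is computed separately for every feature column.
scores_per_feature = [[0.5, 2.0], [1.5, 0.0], [0.0, 1.0]]  # made-up values
attn_per_feature = [
    softmax([row[f] for row in scores_per_feature]) for f in range(n_features)
]  # attn_per_feature[f][i] is the weight of item i for feature f

# Alternative: one scalar score per item, shared by all features.
scores_shared = [0.5, 1.5, 0.0]  # made-up values
attn_shared = softmax(scores_shared)  # attn_shared[i] is the weight of item i

# Weighted sum over items under each scheme.
pooled_per_feature = [
    sum(attn_per_feature[f][i] * items[i][f] for i in range(n_items))
    for f in range(n_features)
]
pooled_shared = [
    sum(attn_shared[i] * items[i][f] for i in range(n_items))
    for f in range(n_features)
]
```

In the first variant each feature ends up with its own distribution over items, so there is no single answer to "how much does item i matter". Whether that is actually a bug depends on what the model is meant to do.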