Is sequence information leaking into the structure tokens? #129
OliviaViessmann asked this question in Q&A (unanswered).
Hi,
I would like to discuss and understand how much sequence information the ESM-3 structure tokens contain.
From the Appendix I understand that an inverse folding loss was added to the overall structure auto-reconstruction loss.
This inverse folding loss predicts a sequence from the structure tokens, and computing the loss against the ground-truth sequence therefore encodes sequence information into the learned structure tokens.
How is this not harmful for downstream sequence generation tasks? In other words, is this not a form of leakage?
Are the sequence-structure pairs used for decoder training excluded from the downstream partial-masking training?
From the appendix:
"Finally, an inverse folding token prediction loss (i.e., a crossentropy loss between predicted sequence and ground truth sequence) is an auxiliary loss used to encourage the learned representations to contain information pertinent to sequence related tasks. [...]
Inverse Folding Loss: Pass final layer representations of the decoder through a regression head to produce logits z. Using ground truth residues as labels y, compute cross-entropy for the classification task of predicting residues from final layer representations."
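For concreteness, here is a minimal sketch of how I read that description, assuming a PyTorch setup; all names, dimensions, and the loss weight are hypothetical and not taken from the ESM-3 code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed sizes, for illustration only (not the actual ESM-3 values).
NUM_RESIDUE_TYPES = 20   # amino acid vocabulary size
D_MODEL = 1024           # decoder hidden size

# Regression head mapping decoder final-layer representations to residue logits z.
inverse_folding_head = nn.Linear(D_MODEL, NUM_RESIDUE_TYPES)

def inverse_folding_loss(decoder_hidden: torch.Tensor,
                         true_residues: torch.Tensor) -> torch.Tensor:
    """decoder_hidden: (batch, seq_len, D_MODEL) final-layer decoder representations.
    true_residues:  (batch, seq_len) ground-truth residue indices y."""
    logits = inverse_folding_head(decoder_hidden)          # (batch, seq_len, NUM_RESIDUE_TYPES)
    # Per-residue classification: predict the residue identity from the decoded structure.
    return F.cross_entropy(logits.transpose(1, 2), true_residues)

# During structure tokenizer/decoder training this auxiliary term would be added
# to the structure reconstruction loss, e.g.:
# total_loss = reconstruction_loss + lambda_if * inverse_folding_loss(hidden, residues)
```

If this reading is right, the gradient from this term flows back into the structure tokens, which is exactly why I am asking whether sequence information ends up encoded in them.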