In the paper, you state that the loss function in pre-training stage 2 is computed as $L = L_{MLM} + L_{SCL} + 10 \cdot L_{VAT}$, but in the code you have:

```python
loss = adv_loss + 10 * loss
```

Which version did you actually use?
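To make the discrepancy concrete, here is a minimal sketch of the two weightings. It assumes that `adv_loss` in the code corresponds to $L_{VAT}$ and that the right-hand `loss` holds $L_{MLM} + L_{SCL}$; this mapping is my reading and is not confirmed by the repository. The function names `paper_loss` and `code_loss` are hypothetical and exist only for this illustration.

```python
def paper_loss(l_mlm: float, l_scl: float, l_vat: float, weight: float = 10.0) -> float:
    """Weighting as written in the paper: the VAT term is scaled by `weight`."""
    return l_mlm + l_scl + weight * l_vat


def code_loss(l_mlm: float, l_scl: float, l_vat: float, weight: float = 10.0) -> float:
    """Weighting as it appears in the code, under the assumed variable mapping."""
    loss = l_mlm + l_scl   # assumption: `loss` is the MLM + SCL sum
    adv_loss = l_vat       # assumption: `adv_loss` is the VAT term
    return adv_loss + weight * loss


if __name__ == "__main__":
    # Made-up loss values, chosen only to show that the two forms differ.
    print(paper_loss(1.0, 2.0, 0.5))
    print(code_loss(1.0, 2.0, 0.5))
```

If that mapping is right, the paper scales the adversarial term by 10 while the code scales the MLM and SCL terms by 10, so the two objectives weight the components very differently.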