Hi @sayakpaul @kashif ,
I wanted to analyze the margin loss function in depth. What I ended up doing was investigating the function with respect to the loser MSE loss. Please see the video below:
Recording.2026-01-25.221626.mp4
The problem is that the gradient for the loser stays the same (or changes only very subtly) when the winner changes. My suspicion is that the log-sigmoid function doesn't stop or reduce the loser's gradient by taking both the winner and the loser values into account. Could you please provide feedback on this?
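To make this observation easy to check numerically, here is a minimal sketch. It *assumes* the margin term has the common DPO-style form `-logsigmoid(-beta * (mse_winner - mse_loser))`; the `beta` scale and the exact form used in the actual training code are assumptions, not taken from this thread. The closed-form gradient with respect to the loser MSE is `-beta * sigmoid(beta * (mse_winner - mse_loser))`, so one can print it for several winner values and see how much it actually moves:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

beta = 1.0  # assumed scale factor; the real value depends on the training config

def margin_loss(mse_winner, mse_loser):
    # Assumed DPO-style margin term: -log(sigmoid(-beta * (mse_winner - mse_loser)))
    z = -beta * (mse_winner - mse_loser)
    return -math.log(sigmoid(z))

def grad_wrt_loser(mse_winner, mse_loser):
    # d(loss)/d(mse_loser) = -beta * sigmoid(beta * (mse_winner - mse_loser))
    # Note it depends on BOTH terms, but only through their difference, and it
    # saturates once the winner/loser gap is large.
    return -beta * sigmoid(beta * (mse_winner - mse_loser))

if __name__ == "__main__":
    # Fix the loser MSE at 1.0 and sweep the winner MSE:
    for w in (0.1, 0.5, 1.0, 2.0):
        print(f"winner mse={w:.1f} -> d(loss)/d(loser mse)={grad_wrt_loser(w, 1.0):+.4f}")
```

If the gap `mse_winner - mse_loser` sits in the saturated tail of the sigmoid, moving the winner produces only a subtle change in the loser's gradient, which would match the behavior seen in the video.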
So the winner will always be minimized by the main loss, and the losers will always be maximized, up to a point. I don't see anything special happening for the outliers. This means I could achieve the same desired result with a basic loss function like this:
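The original snippet is not reproduced above, so the following is only a hypothetical reconstruction of what such a "basic" loss could look like given the described behavior (minimize the winner MSE; raise the loser MSE only until a cap, after which its gradient is cut to zero). The `margin` value and the `min()`-based clamp are illustrative assumptions:

```python
def basic_loss(mse_winner, mse_loser, margin=1.0):
    # Hypothetical sketch, not the author's code:
    # - the winner MSE is minimized directly;
    # - the loser MSE is pushed up (negative sign), but min() freezes its
    #   contribution (zero gradient) once it exceeds the assumed margin.
    return mse_winner - min(mse_loser, margin)
```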
Did I miss something?