Add label pair functionality to contrastive loss function #1278
jackculpepper wants to merge 9 commits into BVLC:dev from
Conversation
Nice generalization -- could you add another model to the siamese network example that makes use of this new feature?
The code you linked is correct; using
Heh.. yeah. Thanks. Forgot my brain this morning.
changes include code to generate shared png-file-based storage
current setup should give roughly the same performance as before
seems to achieve approximately the same performance
could 'shuffle' in image_data_layer to generate extra train pairs
Here's a comparison of MNIST "siamese" test error between the dev branch and this branch with shuffle either off or on. When shuffle is on, each epoch has a different same/not-same pairing. Note that the test set is identical for "shuffle" and "no shuffle" but different for "dev" (a different random pairing). This may be why "no shuffle" does worse than "dev".
@jackculpepper thanks for this PR. I've been trying to use the contrastive loss on the imagenet dataset with no success. It seems like the loss is stuck in a local minimum straight from the beginning. Have you tried using it with an "imagenet"-scale network?
@amiralush I haven't run it on imagenet yet, no. However, I have seen the problem you are describing. Instead of using a single contrastive loss at the top, you can put contrastive loss layers in the middle, too. Have you tried that? Incidentally, for anyone else following this thread, I merged the label-pair idea into the hinge loss and accuracy layers, too. I'd be happy to submit PRs for those if anyone's interested.
@jackculpepper, all your changes are essential for implementing a full-blown network using two kinds of supervised labels, such as multi-class classification and pair-wise similarity [1]. Please share them too. Thank you very much!
[1] Y. Sun, X. Wang, and X. Tang. Deep Learning Face Representation by Joint Identification-Verification. Technical report, arXiv:1406.4773, 2014.
To be cross-platform, boost::filesystem::create_directories is a better option.
@jackculpepper a PR would be very helpful.
add enum to params so that choice of label type is more explicit
refactor label computation code into a private function
add binary function generator and "caffe_cpu_same()" functions to math
@futurely Thank you for reviewing my code. All of your suggestions were good. I made the changes necessary to address them in the previous two commits. Note that using boost to create directories requires linking against boost_filesystem, so I had to add that to the Makefile.
@amiralush The changes I am making in this branch will be easily transferred over to the accuracy and hinge loss layers, so it's probably better to wait. However, here are two PRs off dev from about a week ago:
The changes look good to me. The other two PRs are also very useful. Thanks!
The "same" label can be computed using existing layers. Here is an example of how to use

This PR adds label-pair functionality to the contrastive loss function.
Now, in addition to passing a single 1/0 label that marks a pair as similar or dissimilar, you can pass in two class labels, and similarity will be computed on the fly.
I do this by keeping another buffer, similar_, which is computed at feed-forward time. This is useful, for example, if you want to combine a contrastive loss with a soft-max loss, or if you want to use the same data loaders for both siamese and N-way classification networks.
The way I'm doing it does add a slight bit of overhead, but I think it's negligible.
If you hook up only one label, the original 1/0 behavior is preserved.
I also added a check to make sure there are either 3 or 4 bottom blobs.