
Add label pair functionality to contrastive loss function #1278

Closed
jackculpepper wants to merge 9 commits into BVLC:dev from jackculpepper:contrastive_loss_from_label_pair

Conversation

@jackculpepper
Contributor

This PR adds functionality to the contrastive loss function.

Now, in addition to being able to pass a 1/0 label that specifies similar or dissimilar pairs, you can pass in two labels, and similarity will be calculated on the fly.

I do this by keeping another buffer, similar_, which is computed at feed-forward time.

This is useful, for example, if you want to combine a contrastive loss with a softmax loss, or if you want to use the same data loaders for both siamese and N-way classification networks.

The way I'm doing it adds a small amount of overhead, but I think it's negligible.

If you hook up only one label, the 1/0 behavior still happens.

I also added a check to make sure there are either 3 or 4 bottom blobs.

@shelhamer
Member

Nice generalization -- could you add another model to the siamese network example that makes use of this new feature?

@jackculpepper
Contributor Author

These lines:

here, here, here, and here

...should they be .gpu_data() instead of .cpu_data()?

No.

(Removed my earlier comments to prevent someone from reading this and getting confused.)

@jeffdonahue
Contributor

The code you linked is correct; using gpu_data instead would cause a segfault. gpu_data pointers can only be read/written inside a GPU kernel; all of those examples (or at least the first two I checked) read or write the data immediately from host code and therefore need to use cpu_data.

@jackculpepper
Contributor Author

Heh..yeah. Thanks. Forgot my brain this morning.

changes include code to generate shared png file based storage
current setup should give roughly the same performance as before
seems to achieve approximately the same performance
could 'shuffle' in image_data_layer to generate extra train pairs
@jackculpepper
Contributor Author

Here's a comparison of the MNIST "siamese" test error between the dev branch and this branch with shuffle either off or on. When shuffle is on, each epoch gets a different same/not-same pairing.

[figure: MNIST siamese test-error curves for "dev", "no shuffle", and "shuffle"]

Note that the test set is identical for "shuffle" and "no shuffle" but different for "dev" (different random pairing). This may be why "no shuffle" does worse than "dev".

@amiralush

@jackculpepper thanks for this PR. I've been trying to use the contrastive loss on the ImageNet dataset with no success. The loss seems to get stuck in a local minimum right from the beginning. Have you tried using it with an ImageNet-scale network?

@jackculpepper
Contributor Author

@amiralush I haven't run it on imagenet yet, no. However, I have seen the problem you are describing. Instead of using a single contrastive loss at the top, you can put contrastive loss layers in the middle, too. Have you tried that?
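For concreteness, a mid-network contrastive loss might look like the fragment below. This is a hypothetical sketch in the old prototxt syntax, with blob names borrowed from the MNIST siamese example (ip1/ip1_p as the intermediate features, sim as the similarity label) and an illustrative margin:

```
layers {
  name: "loss_mid"
  type: CONTRASTIVE_LOSS
  contrastive_loss_param {
    margin: 1.0
  }
  bottom: "ip1"
  bottom: "ip1_p"
  bottom: "sim"
  top: "loss_mid"
}
```

The intermediate loss gives the lower layers a more direct gradient signal, in the same spirit as deep supervision.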

Incidentally, for anyone else following this thread, I merged the label pair func idea into the hinge loss and accuracy layers, too. I'd be happy to submit PRs for those if anyone's interested.

@futurely

@jackculpepper, all your changes are essential to implement a full-blown network using two kinds of supervised labels such as multi-class classification and pair-wise similarity [1]. Please share them too. Thank you very much!

[1] Y. Sun, X. Wang, and X. Tang. Deep Learning Face Representation by Joint Identification-Verification. Technical report, arXiv:1406.4773, 2014.


To be cross platform, boost::filesystem::create_directories is a better option.
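boost::filesystem::create_directories creates every missing component of a nested path portably. A minimal sketch of the pattern is below; it uses the C++17 standard equivalent, std::filesystem::create_directories, purely so the snippet needs no extra link flags (the boost call behaves the same way but requires linking boost_filesystem):

```cpp
#include <filesystem>
#include <string>

// Portable nested-directory creation: makes every missing component of
// "path" and returns true only if at least one directory was created.
// boost::filesystem::create_directories has the same semantics.
bool MakeNestedDir(const std::string& path) {
  return std::filesystem::create_directories(path);
}
```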

@amiralush

@jackculpepper a PR would be very helpful.
Concerning your suggestion to use more than a single loss along the net: using the ImageNet architecture, I tried adding an additional contrastive loss on fc6, which caused the second loss (on fc8) to output NaN after a couple of iterations, while the first loss seems to be going along OK. I don't fully understand it yet.

add enum to params so that choice of label type is more explicit
refactor label computation code into a private function
add binary function generator and "caffe_cpu_same()" functions to math
@jackculpepper
Contributor Author

@futurely Thank you for reviewing my code. All of your suggestions were good. I made the changes necessary to address them in the previous two commits. Note that using boost to create directories requires linking against boost_filesystem, so I had to add that to the Makefile.

@amiralush The changes I am making in this branch will be easily transferred over to accuracy and hinge loss layers, so it's probably better to wait. However, here are two PRs off dev from about a week ago:

accuracy
hinge

@kloudkl
Contributor

kloudkl commented Oct 23, 2014

The changes look good to me. The other two PRs are also very useful. Thanks!

@jackculpepper
Contributor Author

The "same" label can be computed using existing layers. Here is an example of how to use THRESHOLD and ELTWISE to compute the same/not same label from a pair of k-way labels:

layers {
  name: "diff_label_ab"
  type: ELTWISE
  eltwise_param {
    coeff: 1
    coeff: -1
  }
  bottom: "label_a"
  bottom: "label_b"
  top: "diff_label_ab"
}

layers {
  name: "diff_label_ba"
  type: ELTWISE
  eltwise_param {
    coeff: 1
    coeff: -1
  }
  bottom: "label_b"
  bottom: "label_a"
  top: "diff_label_ba"
}

layers {
  name: "diff_label_ab_thresh"
  type: THRESHOLD
  bottom: "diff_label_ab"
  top: "diff_label_ab_thresh"
}

layers {
  name: "diff_label_ba_thresh"
  type: THRESHOLD
  bottom: "diff_label_ba"
  top: "diff_label_ba_thresh"
}

layers {
  name: "label_same"
  type: ELTWISE
  bottom: "diff_label_ab_thresh"
  bottom: "diff_label_ba_thresh"
  top: "label_same"
}
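As a sanity check, the construction above can be modeled on scalars, assuming Caffe's defaults (THRESHOLD fires for inputs greater than 0, and an ELTWISE with no operation set defaults to SUM):

```cpp
// Scalar model of the THRESHOLD + ELTWISE graph above, under the
// assumed defaults: threshold = 0 and SUM as the eltwise operation.
float Threshold(float x) { return x > 0.0f ? 1.0f : 0.0f; }

float PairLabel(float label_a, float label_b) {
  // ELTWISE layers with coeffs 1 and -1 compute signed differences.
  float diff_ab = 1.0f * label_a + (-1.0f) * label_b;
  float diff_ba = 1.0f * label_b + (-1.0f) * label_a;
  // Final ELTWISE (SUM) of the two thresholded differences.
  return Threshold(diff_ab) + Threshold(diff_ba);
}
```

Under those defaults the top blob is 0 when the two labels match and 1 when they differ, so if the loss expects 1 for similar pairs, the sense would need flipping (e.g. a POWER layer computing 1 - x with shift: 1, scale: -1).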
