Conversation
|
@jackculpepper I have tried this layer in a siamese network. It seems to work absolutely fine if I have one such layer in the network (one on each side of the siamese net), but if I try to add another locally connected layer the network stops learning. Am I missing something? I have tried different combinations: inserting a ReLU or a regular convolution layer between the locally connected layers, varying the learning rate. The result is the same: accuracy drops, the network does not learn, and I see very small values at the end of the network. |
|
@okn2020 Thanks for trying! Can you share your prototxt files? I just committed an example prototxt for mnist with two local layers chained together. It gets 98.53%. |
|
The network is almost the same as in the paper you quoted. I am not sure about the fillers and the first MVN layer; maybe you could suggest the right numbers or correct layers?

name: "pt_train"
layers {
  name: "dropout7"
  type: DROPOUT
  bottom: "features7"
  top: "features7"
  dropout_param { dropout_ratio: 0.5 }
}
layers {
  name: "dropout7_p"
  type: DROPOUT
  bottom: "features7_p"
  top: "features7_p"
  dropout_param { dropout_ratio: 0.5 }
}
layers { |
|
dropout was commented out |
|
@jackculpepper see prototxt above |
|
@jackculpepper "but if I replace l5 or l6 or both" - I meant l4 or l5 or both |
|
@jackculpepper solver:

test_initialization: false
debug_info: true
net: "train.prototxt" |
I also added another contrastive loss after the first local layer, because it seems to plateau otherwise.
|
Thanks. Have you tried MNIST? I've had the problem you describe before. I am not 100% sure, but I don't think it's a bug in the local layer. I think it's a plateau. I just committed a version with two local layers and two contrastive losses. One contrastive loss is at the top. The other is after the first local layer, to try and pull those weights into a good regime. This kind of thing is described in [1]. It descends, but it does take a while. See the snippets from the log below. [1] Going deeper with convolutions. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. http://arxiv.org/abs/1409.4842 |
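The auxiliary-loss trick described above (a second contrastive loss hung off the first local layer, GoogLeNet-style) can be sketched in a few lines. This is a toy numpy illustration, not the PR's Caffe implementation; the names `margin` and `aux_weight` are illustrative assumptions.

```python
import numpy as np

def contrastive_loss(f1, f2, y, margin=1.0):
    """Contrastive loss for one pair of feature vectors.
    y == 1 for similar pairs, y == 0 for dissimilar pairs."""
    d = np.linalg.norm(f1 - f2)
    return y * d ** 2 + (1 - y) * max(0.0, margin - d) ** 2

def total_loss(top1, top2, mid1, mid2, y, aux_weight=0.3):
    # Main loss at the top of the network, plus a down-weighted auxiliary
    # loss after the first local layer to pull those weights into a good
    # regime and help the optimizer off the plateau.
    return contrastive_loss(top1, top2, y) + \
        aux_weight * contrastive_loss(mid1, mid2, y)
```

During training, gradients from the auxiliary loss reach the early local layer directly instead of flowing through the whole (initially uninformative) upper network.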
|
@jackculpepper No, I have not tried MNIST, currently trying to get as much as I can out of my data. I will try your approach with two losses, thank you! |
|
Yeah, I have used the same approach with k-way softmax. I hang k-way softmax losses off each of the local layers. I'll check in an example of this on the MNIST siamese network in my #1278, which is exactly the point of the changes I made there. Let's continue this conversation over there. |
|
@jackculpepper ok, just one more question regarding your latest commit of the siamese example with two losses - shouldn't the IP layers with 500 outputs be used as features? I think the IP with 10 outputs was used for the softmax in MNIST (?), and the IP with 2 outputs is not needed (?). |
|
@shelhamer Could you take a look? I think this is pretty close to being ready to merge. I've integrated these examples to run: |
|
The locally connected layer proposed here is a generalization of the convolution layer. In the original implementation, the kernels all share the same set of parameters. At the other extreme, each kernel has its own independent parameters. In the middle ground, parameters are shared by the kernels in a local region [3]. Using these layers together carefully, a near-perfect 99.15% face verification accuracy is achieved on the classic LFW dataset, showing that the new method outperforms DeepFace. Therefore, the ConvolutionLayer should be extended to support all of these cases in a configurable way. [3] Y. Sun, X. Wang, and X. Tang. Deep Learning Face Representation by Joint Identification-Verification. Technical report, arXiv:1406.4773, 2014. |
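The three weight-sharing regimes described above can be made concrete with a toy 1-D numpy sketch (stride 1, "valid" windows of size `k`). This is only an illustration of the parameter-sharing idea, not the PR's C++ code; the function names and the `tile` parameter are assumptions.

```python
import numpy as np

def conv1d(x, w):
    """Convolution: one kernel w (shape [k]) shared at every position."""
    k = len(w)
    return np.array([x[i:i + k] @ w for i in range(len(x) - k + 1)])

def local1d(x, W):
    """Locally connected: an independent kernel W[i] per output position."""
    k = W.shape[1]
    return np.array([x[i:i + k] @ W[i] for i in range(len(x) - k + 1)])

def tiled1d(x, W, tile):
    """Middle ground [3]: one kernel shared within each region of `tile`
    consecutive output positions."""
    k = W.shape[1]
    return np.array([x[i:i + k] @ W[i // tile] for i in range(len(x) - k + 1)])
```

Note the parameter counts: convolution needs `k` weights, the fully local layer needs `k` weights per output position, and the tiled variant interpolates between the two, which is why a single configurable ConvolutionLayer could cover all three.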
|
How do I set the weights so they are totally unshared? |
|
@jackculpepper I ran into a similar problem, maybe because I have too many categories. In DeepFace: Closing the Gap to Human-Level Performance in Face Verification, there are many categories. Please help me! |
|
Hi, will this PR be merged? |
|
The speed seems a little slower than the Facebook version. In my experiment, jackculpepper's version takes ~700 ms to extract features, but the Facebook paper says it takes only ~180 ms using SIMD and caches. Why? |
|
Does one need to update Caffe to use layers like "LOCAL_WEIGHTED_CONVOLUTION"? And is this layer the same as the locally connected layers L4, L5, and L6 used by DeepFace? Thanks in advance!! |
Locally connected layer

Conflicts:
	src/caffe/layer_factory.cpp
	src/caffe/proto/caffe.proto
|
@jackculpepper thank you for your project on the locally connected layer. I also found it and want to develop it as described in [1].
But from this example I only understand the difference between conv layers and locally connected layers. My train.prototxt is below (the layer bodies were cut off): name: "FaceNet" layers { ... |
|
@jackculpepper @okn2020 @futurely @tjusxh @rmanor Has anybody run into the speed issue? I built a model with 2 conv layers and 2 local layers on a K80 GPU, but the training speed is very slow, only 50 epochs in 3 minutes; it seems a bit weird... Here is my prototxt for CIFAR-10 |
|
Hi, what's the status on this issue? I'm interested in merging this into the master branch to use some of the features there. I'm happy to help with the merge, but I'm not familiar with Caffe's internals. -Brandon. |
|
I want to use the locally connected layers. Can anyone help with a merge ? -Swami |
|
Closing since the dev branch is deprecated. Please send PRs to master. |
|
@jackculpepper I went through your code and merged it into Caffe (2015.05 version). I want to know how your local layer differs from the convolution layer in the code. Are the parameters of caffe_cpu_gemm different? |
|
@jackculpepper: I am unable to build this PR. Could you check on this? |
|
I got it resolved, I used the build below and it worked. |
|
@jackculpepper [1] http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6247968 |
|
It seems that this implementation does not support the local-conv manner you mentioned, and this local convolution may be what you need. @tanfei2007 |
|
In Caffe, can we stack SplitLayer, CropLayer, ConvolutionLayer, FlattenLayer, and ConcatLayer to produce a locally connected convolution layer? |
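The crop-then-convolve-then-concat idea in the question above can be checked with a toy 1-D numpy sketch: each crop gets its own unshared filter, and concatenating the per-crop outputs reproduces a locally connected layer. This is an illustration of the composition, not actual Caffe layers; the function name is hypothetical.

```python
import numpy as np

def local_via_crops(x, kernels):
    """Emulate a 1-D locally connected layer: kernels[i] is applied only
    to the window starting at position i."""
    k = len(kernels[0])
    outputs = []
    for i, w in enumerate(kernels):
        crop = x[i:i + k]           # CropLayer: cut out one window
        outputs.append(crop @ w)    # ConvolutionLayer on that crop alone
    return np.concatenate([np.atleast_1d(o) for o in outputs])  # ConcatLayer
```

Functionally this matches an unshared-weight layer, but it spawns one crop and one convolution per output position, so in practice it would be far slower than a dedicated locally connected layer.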
This PR implements a locally connected layer, as described in [1], for example.
It's similar to convolution, but there is no weight sharing.
Ye Lu and I worked on this together. Working code is mostly his. Bugs are mine.
We wrote this a while back, before the convolution layer supported non-square filters and strides. I can add support for that if there is interest.
Looking forward to reading your comments.
[1] Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf. DeepFace: Closing the Gap to Human-Level Performance in Face Verification. CVPR 2014. pdf
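To make "similar to convolution, but there is no weight sharing" concrete, here is a minimal numpy sketch of such a layer's forward pass, assuming a single input channel, stride 1, square k x k filters, and no padding. This is a toy illustration, not the PR's C++ implementation.

```python
import numpy as np

def local_forward(x, W, b):
    """x: [H, W_in] input; W: [H_out, W_out, k, k] filters, one independent
    filter per output location (unlike convolution, where a single [k, k]
    filter would be slid over the whole input); b: [H_out, W_out] biases."""
    h_out, w_out, k, _ = W.shape
    y = np.empty((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            # Each output pixel uses its own filter W[i, j] and bias b[i, j].
            y[i, j] = np.sum(x[i:i + k, j:j + k] * W[i, j]) + b[i, j]
    return y
```

The extra capacity suits inputs with a fixed spatial layout, such as the aligned faces in [1], where different image regions have genuinely different statistics.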