# High quality, fast, modular reference implementation of SSD in PyTorch 1.0
Great thanks to https://github.com/lufficc/SSD for the contribution.

This repository implements [SSD (Single Shot MultiBox Detector)](https://arxiv.org/abs/1512.02325). The implementation is heavily influenced by the projects [ssd.pytorch](https://github.com/amdegroot/ssd.pytorch), [pytorch-ssd](https://github.com/qfgaohao/pytorch-ssd) and [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark). This repository aims to be a code base for research built on SSD.
## Highlights
- PyTorch 1.0
- GPU/CPU NMS
- Multi-GPU training and inference
- Modular
- Visualization (TensorBoard support)
- CPU support for inference
## Installation
### Requirements
1. Python3
1. PyTorch 1.0
1. yacs
1. GCC >= 4.9
1. OpenCV
### Build
```bash
# build nms
cd ext
python build.py build_ext develop
```
## Train
### Setting Up Datasets
#### Pascal VOC
For Pascal VOC dataset, make the folder structure like this:
```
VOC_ROOT
|__ VOC2007
|_ JPEGImages
|_ Annotations
|_ ImageSets
|_ SegmentationClass
|__ VOC2012
|_ JPEGImages
|_ Annotations
|_ ImageSets
|_ SegmentationClass
|__ ...
```
`VOC_ROOT` defaults to the `datasets` folder in the current project; you can either symlink your data into `datasets` or `export VOC_ROOT="/path/to/voc_root"`.
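For example, either approach can be set up like this (`/data/VOCdevkit` is a hypothetical path; replace it with wherever your VOC data actually lives):

```shell
# Option A: point the code at an existing VOC root via the environment variable
export VOC_ROOT="/data/VOCdevkit"
echo "$VOC_ROOT"

# Option B (alternative): symlink the data into the project instead,
# so the default `datasets` location resolves to it:
#   ln -s /data/VOCdevkit datasets
```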
#### COCO
For COCO dataset, make the folder structure like this:
```
COCO_ROOT
|__ annotations
|_ instances_valminusminival2014.json
|_ instances_minival2014.json
|_ instances_train2014.json
|_ instances_val2014.json
|_ ...
|__ train2014
|_ <im-1-name>.jpg
|_ ...
|_ <im-N-name>.jpg
|__ val2014
|_ <im-1-name>.jpg
|_ ...
|_ <im-N-name>.jpg
|__ ...
```
`COCO_ROOT` defaults to the `datasets` folder in the current project; you can either symlink your data into `datasets` or `export COCO_ROOT="/path/to/coco_root"`.
### Single GPU training
```bash
# for example, train SSD300:
python train_ssd.py --config-file configs/ssd300_voc0712.yaml --vgg vgg16_reducedfc.pth
```
### Multi-GPU training
```bash
# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train_ssd.py --config-file configs/ssd300_voc0712.yaml --vgg vgg16_reducedfc.pth
```
The provided configuration files assume training on a single GPU. When changing the number of GPUs, the hyper-parameters (lr, max_iter, ...) should also be adjusted following the linear scaling rule from this paper: [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677).
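As a sketch of that rule: scale the learning rate linearly with the number of GPUs (i.e. with the effective batch size) and shrink the iteration count correspondingly. The base values below are hypothetical, not taken from the provided configs:

```shell
# Hypothetical single-GPU hyper-parameters (adjust to your config)
BASE_LR=0.001
BASE_MAX_ITER=120000
NGPUS=4

# lr scales up linearly with the number of GPUs (effective batch size)
SCALED_LR=$(awk -v lr="$BASE_LR" -v n="$NGPUS" 'BEGIN { printf "%g", lr * n }')
# fewer iterations are needed to see the same number of images
SCALED_MAX_ITER=$(( BASE_MAX_ITER / NGPUS ))

echo "lr: $SCALED_LR, max_iter: $SCALED_MAX_ITER"
```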
The pre-trained VGG weights can be downloaded here: https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth.
## Evaluate
### Single GPU evaluating
```bash
# for example, evaluate SSD300:
python eval_ssd.py --config-file configs/ssd300_voc0712.yaml --weights /path/to/trained_ssd300_weights.pth
```
### Multi-GPU evaluating
```bash
# for example, evaluate SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS eval_ssd.py --config-file configs/ssd300_voc0712.yaml --weights /path/to/trained_ssd300_weights.pth
```
## Demo
Predicting images in a folder is simple:
```bash
python demo.py --config-file configs/ssd300_voc0712.yaml --weights path/to/trained/weights.pth --images_dir demo
```
Then the predicted images with boxes, scores and label names will be saved to the `demo/result` folder.
Currently, I provide weights trained as follows:
| | Weights |
| :-----: | :----------: |
| SSD300* | [ssd300_voc0712_mAP77.83.pth(100 MB)](https://github.com/lufficc/SSD/releases/download/v1.0/ssd300_voc0712_mAP77.83.pth) |
| SSD512* | [ssd512_voc0712_mAP80.25.pth(104 MB)](https://github.com/lufficc/SSD/releases/download/v1.0/ssd512_voc0712_mAP80.25.pth) |
## Performance
### Original Paper:
| | VOC2007 test |
| :-----: | :----------: |
| SSD300* | 77.2 |
| SSD512* | 79.8 |
### Our Implementation:
| | VOC2007 test |
| :-----: | :----------: |
| SSD300* | 77.8 |
| SSD512* | 80.2 |
### Details:
<table>
<thead>
<tr>
<th></th>
<th>VOC2007 test</th>
</tr>
</thead>
<tbody>
<tr>
<td>SSD300*</td>
<td><pre><code>mAP: 0.7783
aeroplane : 0.8252
bicycle : 0.8445
bird : 0.7597
boat : 0.7102
bottle : 0.5275
bus : 0.8643
car : 0.8660
cat : 0.8741
chair : 0.6179
cow : 0.8279
diningtable : 0.7862
dog : 0.8519
horse : 0.8630
motorbike : 0.8515
person : 0.8024
pottedplant : 0.5079
sheep : 0.7685
sofa : 0.7926
train : 0.8704
tvmonitor : 0.7554</code></pre></td>
</tr>
<tr>
<td>SSD512*</td>
<td><pre><code>mAP: 0.8025
aeroplane : 0.8582
bicycle : 0.8710
bird : 0.8192
boat : 0.7410
bottle : 0.5894
bus : 0.8755
car : 0.8856
cat : 0.8926
chair : 0.6589
cow : 0.8634
diningtable : 0.7676
dog : 0.8707
horse : 0.8806
motorbike : 0.8512
person : 0.8316
pottedplant : 0.5238
sheep : 0.8191
sofa : 0.7915
train : 0.8735
tvmonitor : 0.7866</code></pre></td>
</tr>
</tbody></table>