Average Precision and Average Recall metrics reported by COCOeval seem to be incorrect #672

Description

@tinybike

I'm not sure if this is the right place to report issues with https://github.com/ppwwyyxx/cocoapi -- that repo doesn't have its own Issues tab, so I'm opening an issue here instead.

I'm confused by how pycocotools calculates the average precision and average recall metrics reported in the summary. I'm not sure whether this is actually a bug or whether I'm just fundamentally misunderstanding how the calculations are done under the hood. So I wrote a minimal test case: two ground-truth bboxes and two predicted bboxes that overlap them perfectly, passed into COCOeval:

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Two ground-truth boxes and two predictions that overlap them exactly,
# given as [x1, y1, x2, y2] corner coordinates.
actual_boxes = [[50, 50, 150, 150], [200, 200, 300, 300]]
predicted_boxes = [[50, 50, 150, 150], [200, 200, 300, 300]]
scores = [1.0, 1.0]

coco_actual = COCO()
coco_predicted = COCO()

# Build annotation dicts, converting each box to COCO's [x, y, w, h] format.
actual_annotations_list = []
predicted_annotations_list = []
for id, box in enumerate(actual_boxes):
    actual_annotations_list.append({
        "id": id,
        "image_id": 1,
        "category_id": 1,
        "bbox": [box[0], box[1], box[2] - box[0], box[3] - box[1]],
        "area": (box[2] - box[0]) * (box[3] - box[1]),
        "iscrowd": 0,
    })
for id, box in enumerate(predicted_boxes):
    predicted_annotations_list.append({
        "id": id,
        "image_id": 1,
        "category_id": 1,
        "bbox": [box[0], box[1], box[2] - box[0], box[3] - box[1]],
        "area": (box[2] - box[0]) * (box[3] - box[1]),
        "iscrowd": 0,
        "score": scores[id],
    })

# Register the ground truth and the detections with two in-memory COCO objects.
coco_actual.dataset = {
    "images": [{"id": 1}],
    "annotations": actual_annotations_list,
    "categories": [{"id": 1, "name": "object"}],
}
coco_actual.createIndex()
coco_predicted.dataset = {
    "images": [{"id": 1}],
    "annotations": predicted_annotations_list,
    "categories": [{"id": 1, "name": "object"}],
}
coco_predicted.createIndex()

# Run the standard bbox evaluation.
coco_eval = COCOeval(coco_actual, coco_predicted, iouType="bbox")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()

Here is the output:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.252
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.252
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.252
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.252
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.500
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.500
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.500
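
If I'm reading cocoeval.py correctly, these twelve numbers are also stored in coco_eval.stats after summarize() runs, in the same order as the printed lines, so the two values in question can be pulled out directly:

# coco_eval.stats is a 12-element array, in the same order as the summary above.
print(coco_eval.stats[0])  # AP @ IoU=0.50:0.95, area=all, maxDets=100 -> ~0.252
print(coco_eval.stats[8])  # AR @ IoU=0.50:0.95, area=all, maxDets=100 -> 0.5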

Both boxes are 100x100 (area 10000), so I believe they fall into the "large" area range, and the summary reports AP=0.252 and AR=0.500. These numbers don't make sense to me: the predictions are identical to the ground truth, so I'd expect both average precision and average recall to be 1.0. Am I misunderstanding something, or is there a bug in how these metrics are calculated?
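
For completeness, here's a quick sanity check (my own snippet, separate from the repro above) that the IoUs really are 1.0 according to pycocotools' own routine -- as far as I can tell this is the same maskUtils.iou call that COCOeval.computeIoU uses for bboxes:

from pycocotools import mask as maskUtils

# Same boxes as above, converted from [x1, y1, x2, y2] to COCO's [x, y, w, h].
gt = [[50, 50, 100, 100], [200, 200, 100, 100]]
dt = [[50, 50, 100, 100], [200, 200, 100, 100]]

# iou(dt, gt, iscrowd) returns a len(dt) x len(gt) matrix of IoUs.
ious = maskUtils.iou(dt, gt, [0] * len(gt))
print(ious)  # diagonal entries are 1.0, i.e. each prediction matches its ground truth exactly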
