Problem:
In summarize, precision is calculated as mean_s = np.mean(s[s>-1]), where the [s>-1] mask is supposed to filter out precision values at invalid recall threshold levels. However, in accumulate the per-threshold precision array is initialized with zeros, and the entries for recall thresholds that are never reached are not updated afterwards. They therefore remain 0 (not -1) and are not filtered out when the average is taken in summarize.
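To see the arithmetic in isolation: COCO averages precision over 101 recall thresholds (0:0.01:1). With perfect precision up to a recall of 2/3, 67 of those thresholds end up with precision 1.0 and the remaining 34 keep their initial value of 0, which survives the s > -1 mask. A minimal NumPy sketch with illustrative values (not the pycocotools source):

import numpy as np

# 101 recall thresholds: 67 are reached (precision 1.0), 34 are never
# reached and keep their initial value of 0 under the current behaviour.
s = np.concatenate([np.ones(67), np.zeros(34)])

mean_s = np.mean(s[s > -1])  # the mask drops nothing, since no entry is -1
print(mean_s)                # ~0.6634, i.e. the 0.663 reported below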
Example: three ground truth bboxes and two predicted bboxes, each of which matches one of the ground truth bboxes perfectly. In this case we would expect the average precision to be 1.0.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Boxes are given as [x1, y1, x2, y2]; COCO annotations use [x, y, w, h].
actual_boxes = [[50, 50, 150, 150], [200, 200, 300, 300], [350, 350, 450, 450]]
predicted_boxes = [[50, 50, 150, 150], [200, 200, 300, 300]]
scores = [1.0, 1.0]

coco_actual = COCO()
coco_predicted = COCO()

actual_annotations_list = []
predicted_annotations_list = []
for id, box in enumerate(actual_boxes):
    actual_annotations_list.append({
        "id": id + 1,
        "image_id": 1,
        "category_id": 1,
        "bbox": [box[0], box[1], box[2] - box[0], box[3] - box[1]],
        "area": (box[2] - box[0]) * (box[3] - box[1]),
        "iscrowd": 0,
    })
for id, box in enumerate(predicted_boxes):
    predicted_annotations_list.append({
        "id": id + 1,
        "image_id": 1,
        "category_id": 1,
        "bbox": [box[0], box[1], box[2] - box[0], box[3] - box[1]],
        "area": (box[2] - box[0]) * (box[3] - box[1]),
        "iscrowd": 0,
        "score": scores[id],
    })

coco_actual.dataset = {
    "images": [{"id": 1}],
    "annotations": actual_annotations_list,
    "categories": [{"id": 1, "name": "object"}],
}
coco_actual.createIndex()

coco_predicted.dataset = {
    "images": [{"id": 1}],
    "annotations": predicted_annotations_list,
    "categories": [{"id": 1, "name": "object"}],
}
coco_predicted.createIndex()

coco_eval = COCOeval(coco_actual, coco_predicted, iouType="bbox")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()
However, the output, shown in full below, reports an average precision of 0.663 instead:
creating index...
index created!
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.00s).
Accumulating evaluation results...
DONE (t=0.00s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.663
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.663
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.663
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.663
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.333
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.667
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.667
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.667
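The raw precision array stored by accumulate() makes the cause visible. The following diagnostic snippet (assuming the variables from the reproduction script above are still in scope) prints the precision slice for IoU=0.50, area=all, maxDets=100; the thresholds beyond the achieved recall of 2/3 hold 0 rather than -1, so they are not masked out by s > -1:

import numpy as np

# coco_eval.eval['precision'] has shape [T, R, K, A, M]:
# IoU thresholds x recall thresholds x categories x area ranges x maxDets.
prec = coco_eval.eval["precision"][0, :, 0, 0, -1]  # IoU=0.50, area=all, maxDets=100

print(np.unique(prec))           # expected: [0. 1.] -- no -1 sentinels anywhere
print(np.mean(prec[prec > -1]))  # ~0.663, because the trailing zeros are kept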
Solution:
Initialize q as a NumPy array filled with -1 here, so that precision values at recall threshold levels that are never reached keep the -1 sentinel and are filtered out by s[s>-1]:
q = np.ones((R,)) * (-1)
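A simplified sketch of how the fix plays out for the example above (not the actual accumulate code, just the same fill-then-mask pattern): precision is only written at recall thresholds that are actually reached, the rest keep the -1 sentinel, and the masked mean becomes 1.0 as expected.

import numpy as np

R = 101
recThrs = np.linspace(0.0, 1.0, R)  # COCO's 101 recall thresholds
rc = np.array([1/3, 2/3])           # cumulative recall of the two perfect hits
pr = np.array([1.0, 1.0])           # precision at those recall points

q = np.ones((R,)) * (-1)            # proposed fix, instead of np.zeros((R,))
inds = np.searchsorted(rc, recThrs, side="left")
for ri, pi in enumerate(inds):
    if pi < len(pr):                # thresholds past the max recall stay at -1
        q[ri] = pr[pi]

s = q
print(np.mean(s[s > -1]))           # 1.0, matching the expected average precision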