Possible missing count in "full" denominators #36


I am looking into [this block of eval_omni.py](https://github.com/ritzz-ai/GUI-R1/blob/d43eea2f1c52356077034ff2524642fc05e06c6a/guir1/eval/eval_omni.py#L85). The `if not k.endswith("full")` filtering looks like it causes a problem: if a category (e.g. an action type or input_text) is missing entirely (predicted as null, or the field is absent altogether), full_step / full_type do not get the corresponding count added, so the calculated hit rate comes out higher than the actual rate of correct predictions.

```
    for key in [k for k in score_dict.keys() if not k.endswith("full")]:
        if key.endswith("grounding"):
            full_step_hit+=score_dict[key]
            full_step+=score_dict[key+'_full']
            full_gr_hit+=score_dict[key]
            full_gr+=score_dict[key+'_full']
        elif key.endswith("text"):
            full_step_hit+=score_dict[key]
            full_step+=score_dict[key+'_full']
        else:
            full_type_hit+=score_dict[key]
            full_type+=score_dict[key+'_full']
        logger.info(f"Type {key} Length {score_dict[key+'_full']} : {(score_dict[key] / score_dict[key+'_full'])}")
```
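To illustrate the effect I am worried about, here is a minimal sketch. It is not the code from eval_omni.py: the sample layout and the `tally` and `type_hit_rate` helpers are hypothetical, and I am assuming that a sample with a missing prediction never reaches the `_full` counter.

```python
from collections import defaultdict

# Hypothetical samples: ground-truth action type vs. predicted type.
# None stands for a prediction that is null or lacks the field.
samples = [
    {"gt_type": "click", "pred_type": "click"},
    {"gt_type": "click", "pred_type": None},
    {"gt_type": "type", "pred_type": "type"},
    {"gt_type": "type", "pred_type": None},
]


def tally(samples, count_missing):
    """Build a score dict with '<key>' hits and '<key>_full' totals."""
    score = defaultdict(int)
    for s in samples:
        if s["pred_type"] is None and not count_missing:
            # Assumed behaviour: the sample is dropped from the hit
            # count and from the denominator alike.
            continue
        key = s["gt_type"]
        score[key + "_full"] += 1
        # Adding 0 still creates the hit key, so it is always present.
        score[key] += int(s["pred_type"] == s["gt_type"])
    return score


def type_hit_rate(score):
    """Aggregate the same way as the loop quoted above."""
    hit = total = 0
    for key in [k for k in score if not k.endswith("_full")]:
        hit += score[key]
        total += score[key + "_full"]
    return hit / total if total else 0.0


print(type_hit_rate(tally(samples, count_missing=False)))
print(type_hit_rate(tally(samples, count_missing=True)))
```

In this toy data two of the four predictions are correct, so the second call reports the real rate while the first one reports a perfect score because the missing predictions never enter the denominator. Counting the denominator from the ground truth rather than from the parsed prediction would avoid the inflation, if that matches how the score dict is filled.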
