Hi, thank you for releasing this great dataset!
I've been working with the dataset and noticed a discrepancy regarding the number of problems that contain test cases. According to the paper, approximately 32.5k problems include test cases. However, when I processed the dataset, I found the following:
- Total number of problems: 47,136
- Problems with test cases (my count): 26,955
- Problems with test cases (as stated in the paper): ~32,500
There is a gap of roughly 5.5k between my count and the number reported in the paper. I wanted to confirm whether:
- I might be using an incorrect method to identify problems with test cases, or
- There is a specific version/subset of the dataset I should be using, or
- The definition of "having test cases" differs from what I assumed.
Could you please clarify how the 32.5k figure was calculated? Any guidance on the correct way to filter for problems with test cases would be greatly appreciated.
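To make the question concrete, here is a minimal sketch of how the count can shift depending on what "having test cases" means. The field name `test_cases`, the JSON-string encoding, and the helper names are assumptions for illustration only, not the dataset's confirmed schema. A strict definition counts only entries that parse and contain at least one case, while a loose definition counts any entry whose field is present at all.

```python
import json
from collections import Counter


def classify_tests(value):
    """Label what a (hypothetical) test-case field holds."""
    if value is None:
        return "missing"
    if isinstance(value, str):
        text = value.strip()
        if not text:
            return "empty"
        try:
            value = json.loads(text)
        except json.JSONDecodeError:
            return "unparseable"
        if value is None:  # the literal JSON string "null"
            return "missing"
    if isinstance(value, (list, dict)):
        if len(value) == 0:
            return "empty"
        # A dict such as {"inputs": [], "outputs": []} holds no actual cases.
        if isinstance(value, dict) and all(not v for v in value.values()):
            return "empty"
        return "non_empty"
    return "other"


def count_by_definition(problems, field="test_cases"):
    """Count problems with test cases under a strict and a loose definition."""
    labels = Counter(classify_tests(p.get(field)) for p in problems)
    strict = labels["non_empty"]
    loose = sum(n for label, n in labels.items() if label != "missing")
    return {"strict": strict, "loose": loose, "labels": dict(labels)}
```

If the two definitions give noticeably different totals on the real data, that alone might explain part of the gap; the per-label breakdown shows which category the difference comes from.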
Thanks in advance!