Thanks for open-sourcing this security benchmarking dataset! I have a few important questions about the dataset's details, as I would like to include your benchmark in my ongoing research project.
(1) Which versions of the real-world repositories do the vulnerable files come from? Under the cases folder, each data point contains only the "input" files that were annotated to cover the vulnerability. Since these open-source GitHub repositories are continuously updated, could you share the commit/version of each repository so I can obtain the complete source code?
(2) Meanwhile, I saw that you have a metadata file named data/vader.csv, which contains the Repository column. This column mixes GitHub repo URLs (e.g. https://github.com/heli-toon/LBSHS-LMS) with links to specific files (e.g. https://github.com/kishanrajput23/Jarvis-Desktop-Voice-Assistant/blob/main/Jarvis/jarvis.py). Could you clean it up?
(3) What does the "before_and_after" column in data/vader_languages_before_after.csv stand for? Also, the "Case" column in this file starts at 2 and does not match your cases folder.
(4) There seem to be data mismatches within the cases folder. Could you please verify and validate all the shared files?
- For instance, case1's input and patch can be mapped to [test_plugin.py], which corresponds to case_2 in data/vader.csv (line 3 in 52e29e1):
  "The recursive input validation functions validd() and valid() in the code call themselves repeatedly whenever the user inputs an invalid book code. This unchecked recursion can cause the call stack to grow indefinitely if the user keeps entering invalid input, eventually exhausting the stack memory and causing a stack overflow error or a RecursionError in Python."
  However, case_1_tests.txt is relevant to the Library Management System.
- Similarly, the inputs for case_2 are mapped to case_3 in data/vader.csv, and the test.txt does not account for the SQL injection vulnerability.
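
To illustrate the cleanup I have in mind for question (2), here is a minimal sketch that splits each Repository entry into a repository URL plus an optional ref and file path. The function name `split_github_url` is my own and hypothetical, and the sketch assumes file links always have the form `/owner/repo/blob/<ref>/<path>` with a ref that contains no slashes, which may not hold for every row.

```python
from urllib.parse import urlparse


def split_github_url(url):
    """Split a GitHub URL into (repo_url, ref, file_path).

    ref and file_path are None when the URL points at the repository root.
    Assumption: the ref (branch, tag, or commit) contains no "/" characters.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {url}")
    owner, repo = parts[0], parts[1]
    repo_url = f"https://github.com/{owner}/{repo}"
    # File links look like /owner/repo/blob/<ref>/<path...>
    if len(parts) >= 5 and parts[2] == "blob":
        return repo_url, parts[3], "/".join(parts[4:])
    return repo_url, None, None


# The two example entries from the Repository column:
print(split_github_url("https://github.com/heli-toon/LBSHS-LMS"))
print(split_github_url(
    "https://github.com/kishanrajput23/Jarvis-Desktop-Voice-Assistant/blob/main/Jarvis/jarvis.py"
))
```

Storing the repository URL, the ref, and the file path in separate columns would also make it straightforward to record the exact commit asked about in question (1).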