Steps to reproduce
When preparing the file assigned to FILE_projects_list in config.ini, do NOT end it with an empty line (i.e. leave no trailing newline after the last path):
path/to/project1.zip
path/to/project2.zip
Run python tokenizer.py zip, then check the log file. It shows:
[INFO] (MainThread) Starting zip project <1, path/to/project1.zip> (process 0)
...
[INFO] (MainThread) Starting zip project <2, path/to/project2.zi> (process 0)
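The path of the last project loses its final character (project2.zi instead of project2.zip), so the project is not found.
This may be caused by proj_paths.append(line[:-1]) in tokenizers/file-level/tokenizer.py, which drops the last character of every line whether or not it is a newline.
I recommend using line.strip() instead of line[:-1].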
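The difference can be reproduced in isolation. The snippet below is only a minimal sketch, not the actual code from tokenizer.py: the function names read_paths_slice and read_paths_strip are hypothetical, and the in-memory io.StringIO stands in for the projects list file, which is assumed to have no trailing newline.

```python
import io

def read_paths_slice(handle):
    # Mirrors the reported pattern: drop the last character of every line,
    # assuming (incorrectly for the final line) that it is always a newline.
    return [line[:-1] for line in handle]

def read_paths_strip(handle):
    # Suggested alternative: strip surrounding whitespace and skip blank lines.
    return [line.strip() for line in handle if line.strip()]

# Two project paths, with NO newline after the last one.
listing = "path/to/project1.zip\npath/to/project2.zip"

print(read_paths_slice(io.StringIO(listing)))
# ['path/to/project1.zip', 'path/to/project2.zi']
print(read_paths_strip(io.StringIO(listing)))
# ['path/to/project1.zip', 'path/to/project2.zip']
```

Note that line.strip() also removes leading and trailing spaces; if paths may legitimately contain those, line.rstrip("\n") would be a narrower fix.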