I am exploring machine translation for Sumerian and trying to parse the ATF files with existing parsers rather than writing my own: pyoracc, cdli/atf2tei, and the parser.py that was added to this repo in an earlier pull request. None of them works correctly on the corpus, and all of them throw errors. Is something wrong with the corpus? If not, how can I fix it without manually digging out every problem?
After fixing a lot of "?" marks at the end of @ broken or other signifiers, most of the remaining problems are empty entries. Is there any way to fix this?
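To make the question concrete, here is a minimal sketch of the kind of automated pre-cleaning pass this would need, run before handing the files to any parser. It rests on assumptions that are not confirmed by the corpus documentation: that every entry starts with a header line beginning with "&", that an "empty entry" is one with no numbered transliteration line (such as "1. ..."), and that a trailing "?" on an "@" line can be dropped safely. The names split_entries, clean_entry, has_text and clean_corpus are hypothetical and do not come from pyoracc, cdli/atf2tei, or parser.py.

```python
import re

# Assumption: a transliteration line starts with a numeric label
# such as "1.", "2'." or "1.a." followed by at least one token.
TEXT_LINE = re.compile(r"^\d\S*\.\s+\S")


def split_entries(text):
    """Split a corpus dump into entries; each entry starts at an '&' header line.

    Lines that appear before the first header are ignored.
    """
    entries, current = [], None
    for line in text.splitlines():
        if line.startswith("&"):
            current = [line]
            entries.append(current)
        elif current is not None:
            current.append(line)
    return entries


def clean_entry(lines):
    """Strip trailing '?' marks from '@' lines; leave all other lines untouched."""
    cleaned = []
    for line in lines:
        if line.startswith("@"):
            line = line.rstrip().rstrip("?").rstrip()
        cleaned.append(line)
    return cleaned


def has_text(lines):
    """Return True if the entry contains at least one transliteration line."""
    return any(TEXT_LINE.match(line) for line in lines)


def clean_corpus(text):
    """Return (kept_entries, dropped_headers).

    kept_entries is a list of entries, each a list of cleaned lines.
    dropped_headers lists the '&' header of every entry that had no text,
    so the skipped entries can be reviewed later instead of vanishing silently.
    """
    kept, dropped = [], []
    for entry in split_entries(text):
        if has_text(entry):
            kept.append(clean_entry(entry))
        else:
            dropped.append(entry[0])
    return kept, dropped
```

Returning the dropped headers separately keeps a record of what was skipped, which matters if some "empty" entries turn out to be valid records of tablets that simply have no transliteration yet.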