[kaz] Kazakh "small" data is mis-sized #2
Open
Description
Hello from the future @jkodner05 and team. I noticed that the file part1/suprise_languages/kaz_small.train is the same size (7,000 exemplars) as kaz_large.train in that directory.
- I assume this is in error. Can you confirm?
- What is the best way to evaluate/compare to prior results in the lower-resource ("small") setting given this finding? Should I just use kaz_small.train as training data in the lower-resource setting, even though it's not "small" (700 examples) like, say, hye_small.train?
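
For anyone who wants to reproduce the size check, here is a minimal sketch. It assumes one example per non-empty line and that the files follow the `<lang>_<setting>.train` naming seen above. The helper names `count_examples` and `report_sizes` are my own, not part of the repository, and files that are not present are simply skipped.

```python
from pathlib import Path


def count_examples(path):
    """Count non-empty lines, assuming one training example per line."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def report_sizes(data_dir, langs=("kaz", "hye")):
    """Map each existing <lang>_<setting>.train file name to its example count."""
    sizes = {}
    for lang in langs:
        for setting in ("small", "large"):
            path = Path(data_dir) / f"{lang}_{setting}.train"
            if path.exists():  # skip files that are not in this directory
                sizes[path.name] = count_examples(path)
    return sizes


if __name__ == "__main__":
    # Directory name copied as it appears in the issue text.
    for name, n in report_sizes("part1/suprise_languages").items():
        print(f"{name}: {n}")
```

If kaz_small.train and kaz_large.train report the same count while hye_small.train reports a much smaller one, that matches what I am describing.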