How can we evaluate the model's performance on the subset of the EventGPT dataset?