Thanks for your excellent work!
I noticed that the dataset released on Hugging Face mainly contains the final training annotations (e.g., reasoning traces and tool-augmented sequences). I would like to better understand the data construction pipeline, especially its intermediate stages.
Specifically, I am wondering:
- Are the video clip-level captions used during dataset construction available?
- Is there access to the ground-truth temporal annotations (i.e., start/end timestamps of relevant segments)?