coreference-context-scope

Data and code for the paper "Coreference as an indicator of context scope in multimodal narrative" (GEM2 @ ACL 2025).

Environment

The scripts require only a few basic packages, which are listed in environment.txt.

Data

The subset of the VWP dataset used in this research is available as data/vwp-gem2-subset.csv.

Model-generated stories are available under data/model-generated-stories. Each .parquet file also contains the corresponding human-generated story.

The input files for running the LinkAppend coreference system are available under data/link-append/in. We used the implementation of the system available at https://github.com/ianporada/coref-reeval. The outputs of the LinkAppend runs are available under data/link-append/out.

Prompts are available under data/prompts. Data on which character appears in which image, for all stories, is available under data/visual-continuity in the form of several .csv files.

In all data files, the story_id column links the extracted stories to the original stories from VWP.
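As a minimal sketch of how story_id can be used as a join key, the snippet below links rows from two tables with the standard library only. The in-memory CSV strings stand in for the real files, and every column other than story_id (here text and n_characters) is a hypothetical name chosen for illustration, not the actual schema of the data files.

```python
import csv
import io


def index_by_story_id(rows):
    """Map each story_id to its row (assumes one row per story_id)."""
    return {row["story_id"]: row for row in rows}


def link_stories(original_rows, extracted_rows):
    """Pair every extracted row with the original row sharing its story_id."""
    originals = index_by_story_id(original_rows)
    linked = []
    for row in extracted_rows:
        original = originals.get(row["story_id"])
        if original is not None:  # skip rows with no matching original story
            linked.append({"extracted": row, "original": original})
    return linked


# Toy stand-ins for the real files; "text" and "n_characters" are
# hypothetical columns, only "story_id" is taken from the README.
original_csv = "story_id,text\n1,A story.\n2,Another story.\n"
extracted_csv = "story_id,n_characters\n2,3\n3,1\n"

original_rows = list(csv.DictReader(io.StringIO(original_csv)))
extracted_rows = list(csv.DictReader(io.StringIO(extracted_csv)))
linked = link_stories(original_rows, extracted_rows)
print(len(linked))
```

To work with the actual data, replace the io.StringIO objects with open file handles, for example for data/vwp-gem2-subset.csv. Note that csv.DictReader reads story_id as a string, so both sides of the join should be read the same way.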

General statistics + character metrics

To compute general descriptive statistics for both machine- and human-generated texts, along with the quantitative metrics, run

python main.py --results-path ../data/link-append/out/ --output-path ../results/metrics/ --character-stories ../data/character_stories.json

Multimodal character continuity metric

To compute the MCC metric, run

python mcc_metric.py --output-path ../results/mcc

Other

To compute the correlation between the character change metric and MCC (Table 4 in the paper), run

python correlate_metrics.py --output-path ../results/correlation --character-stories ../data/character_stories.json
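This README does not state which correlation coefficient correlate_metrics.py reports. Purely as an illustration of correlating two per-story score lists, the sketch below computes Pearson's r with the standard library. The choice of Pearson, the function name pearson_r, and the toy values are all assumptions and are not taken from the repository or the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-story scores, made up for this example.
character_change = [0.2, 0.5, 0.1, 0.9]
mcc = [0.3, 0.6, 0.2, 0.8]
print(round(pearson_r(character_change, mcc), 3))
```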

Citation

If you find our data useful, please cite

Nikolai Ilinykh, Shalom Lappin, Asad B. Sayeed, and Sharid Loáiciga. 2025. Coreference as an indicator of context scope in multimodal narrative. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), pages 789–807, Vienna, Austria and virtual meeting. Association for Computational Linguistics.

The poster presented at GEM2 can be found here.