Adding MediaSum dataset by AnnaWegmann · Pull Request #305 · CornellNLP/ConvoKit

AnnaWegmann · 2025-09-08T08:01:54Z

Description

This adds the mediasum.rst file for documentation and the convert_mediasum-corpus.ipynb notebook, which contains the script used to convert the MediaSum dataset to a ConvoKit corpus object. The zipped dataset, which still needs to be added to your servers, is here: https://drive.google.com/file/d/1cCaSuVUKN0B3s-GxnWg1gtWwOLNF66n0/view?usp=sharing
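
For readers unfamiliar with this kind of conversion, here is a minimal sketch of the general idea, not the code from the notebook. It assumes a hypothetical transcript record with an `id` plus parallel `speaker` and `utt` lists, and flattens it into dicts whose keys mirror the fields of a ConvoKit `Utterance` (`id`, `speaker`, `conversation_id`, `reply_to`, `text`). The function name, the utterance id scheme, and the linear reply chain are all illustrative assumptions.

```python
from typing import Any, Dict, List


def record_to_utterances(record: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten one transcript record into utterance dicts.

    Assumed (hypothetical) input layout:
        {"id": str, "speaker": [str, ...], "utt": [str, ...]}
    The output keys mirror the fields of a ConvoKit Utterance.
    """
    conv_id = str(record["id"])
    speakers = record["speaker"]
    texts = record["utt"]
    if len(speakers) != len(texts):
        raise ValueError(f"speaker/utt length mismatch in transcript {conv_id}")

    utterances = []
    previous_id = None  # the first turn replies to nothing
    for index, (speaker, text) in enumerate(zip(speakers, texts)):
        utt_id = f"{conv_id}-{index}"  # illustrative id scheme, not the notebook's
        utterances.append({
            "id": utt_id,
            "speaker": speaker.strip(),
            "conversation_id": conv_id,  # sketch choice: reuse the transcript id
            "reply_to": previous_id,     # assume each turn replies to the previous one
            "text": text,
        })
        previous_id = utt_id
    return utterances
```

Each resulting dict could then be turned into a ConvoKit `Utterance` and the full list passed to `Corpus(utterances=...)`; see the notebook for the conversion that was actually used.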

Motivation and Context

Adds a dataset; see the .rst file for details. It is based on https://aclanthology.org/2021.naacl-main.474.pdf and https://aclanthology.org/2024.emnlp-main.52/

How has this been tested?

See convert_mediasum-corpus.ipynb for the creation and testing outputs.

Other information

The corpus still needs to be added to your servers: https://drive.google.com/file/d/1cCaSuVUKN0B3s-GxnWg1gtWwOLNF66n0/view?usp=sharing

seanzhangkx8 · 2025-10-27T21:11:16Z

Hi Anna, thank you so much for your contribution to ConvoKit. It looks great. I will just add some configuration to support downloading the corpus from ConvoKit directly. After that I will merge the PR into our main branch. Thanks again for your work!
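
Once the download configuration is in place, loading the corpus would presumably follow ConvoKit's usual `download` pattern. The sketch below assumes the corpus is registered under the name `mediasum-corpus`, inferred from the commit message "add mediasum-corpus download config"; the helper name `load_mediasum` is hypothetical.

```python
# Assumed corpus name, taken from the commit message in this PR.
CORPUS_NAME = "mediasum-corpus"


def load_mediasum(name: str = CORPUS_NAME):
    """Download the corpus if needed and load it.

    Requires the convokit package and network access.
    """
    # Imported lazily so this module can be loaded without convokit installed.
    from convokit import Corpus, download

    return Corpus(filename=download(name))
```

Calling `corpus = load_mediasum()` followed by `corpus.print_summary_stats()` would then print the number of speakers, utterances, and conversations, assuming the name above matches the final configuration.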

AnnaWegmann and others added 2 commits September 8, 2025 09:51

Adding MediaSum documentation

9e09204

add mediasum creation jupyter notebook

2d2d7cd

cristiandnm added the dataset label ("Use this tag when providing a new dataset for inclusion in ConvoKit.") Sep 11, 2025

cristiandnm assigned seanzhangkx8 Sep 13, 2025

seanzhangkx8 added 2 commits October 27, 2025 17:24

add mediasum-corpus download config

e996abe

add documentation link

00bcbe5

seanzhangkx8 merged commit b243967 into CornellNLP:master Feb 3, 2026
4 checks passed

seanzhangkx8 merged 4 commits into CornellNLP:master from AnnaWegmann:patch-1
