This repository was archived by the owner on Jan 25, 2026. It is now read-only.
SexTok: Multi-modal dataset for distinguishing sex education from sexually suggestive content on TikTok | ACL 2023 Findings

beingenfa/SexTok

SexTok: Multi-Modal Dataset for Sex Education vs. Suggestive Content Classification

Paper @ ACL Findings 2023 | 🏆 Leaderboard | Data | Workshop on Online Harms and Abuse | Video


Overview

It’s not Sexually Suggestive; It’s Educative | Separating Sex Education from Suggestive Content on TikTok videos (George & Surdeanu, Findings 2023)

SexTok is a multi-modal dataset of 1,000 TikTok videos built to address the challenge of distinguishing sexually suggestive content from sex education videos. The dataset includes three class labels, Sexually Suggestive (20%), Sex Education (20%), and Others (60%); audio transcriptions produced with OpenAI Whisper; and gender expression annotations for bias evaluation.

Example

Leaderboard

Performance on the SexTok test set, sorted by accuracy:

| Model | Accuracy | Macro F1 | Source |
| --- | --- | --- | --- |
| 🏆 Consensus-Aware Balance Learning (Zhou et al.) | 86% | 84% | Zhou et al. |
| SlowFast | 80% | 76% | Zhou et al. |
| ResNet | 77% | 67% | Zhou et al. |
| TimeSformer | 75% | 68% | Zhou et al. |
| Uniformer | 74% | 68% | Zhou et al. |
| VideoMAE | 70% | 61% | George et al. |
| BERT (Transcription) | 68% | 64% | George et al. |
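
Because the classes are imbalanced (60% of videos are labeled Others), accuracy and Macro F1 can diverge sharply, which is why the leaderboard reports both. The sketch below is illustrative only and is not the evaluation script used in the paper. It assumes a hypothetical 10-video split that mirrors the overall 20/20/60 label distribution (the real test split may differ), and the helper names `accuracy` and `macro_f1` are made up for this example. It scores a trivial baseline that always predicts Others.

```python
# Illustrative only: not the official SexTok evaluation code.
LABELS = ["Sexually Suggestive", "Sex Education", "Others"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1; a class with no TP/FP/FN scores 0."""
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(labels)


# Hypothetical split mirroring the 20% / 20% / 60% class distribution (assumption).
y_true = ["Sexually Suggestive"] * 2 + ["Sex Education"] * 2 + ["Others"] * 6
# Trivial baseline: always predict the majority class.
y_pred = ["Others"] * 10

majority_accuracy = accuracy(y_true, y_pred)
majority_macro_f1 = macro_f1(y_true, y_pred)
print(f"accuracy={majority_accuracy:.2f} macro_f1={majority_macro_f1:.2f}")
# prints "accuracy=0.60 macro_f1=0.25"
```

Under these assumptions the majority-class baseline reaches 60% accuracy but only 25% Macro F1, since it never identifies either of the two minority classes.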

📧 If you’ve used SexTok in your work and would like to be added to the list above, please email us. Contact information is below.

Citation

@inproceedings{george-surdeanu-2023-sexually,
   title = "It{'}s not Sexually Suggestive; It{'}s Educative | Separating Sex Education from Suggestive Content on {T}ik{T}ok videos",
   author = "George, Enfa  and
     Surdeanu, Mihai",
   editor = "Rogers, Anna  and
     Boyd-Graber, Jordan  and
     Okazaki, Naoaki",
   booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
   month = jul,
   year = "2023",
   address = "Toronto, Canada",
   publisher = "Association for Computational Linguistics",
   url = "https://aclanthology.org/2023.findings-acl.365/",
   doi = "10.18653/v1/2023.findings-acl.365",
   pages = "5904--5915",
}

Contact

enfafane <\a> gmail.com