This repository contains code, data, and experiment pipelines for a multitask benchmark suite targeting six distinct cognitive tasks, along with supplementary data and code for auxiliary or exploratory experiments.
Each of the following six tasks has its own top-level folder:
- `imageCaptioning` → Task 1: Image Captioning
- `wordAssociation` → Task 2: Word Association
- `conversation` → Task 3: Conversation
- `colorDetection` → Task 4: Color Detection
- `objectDetection` → Task 5: Object Detection
- `attentionprediction` → Task 6: Attention Prediction

Each task folder contains:
- Computational or MTurk experiment folders (non-`Plot`): Implementation of specific experiments
- `Plot` folder: Code for result compilation, statistical analysis, and figure plotting

Note: In `Plot` folders, Jupyter notebooks follow this execution order:
`TaskN_PreCompileData.ipynb` → `TaskN_Run1_*.ipynb` → `TaskN_Run2_*.ipynb` → ...
Always start with `TaskN_PreCompileData.ipynb`, then run the `RunX_Y` notebooks in order.

The repository also includes two supplementary folders:
- `MiscellaneousData` contains:
  - Data for simple parsing tasks (e.g., text stimulus parsing, result parsing)
  - Example inputs/outputs from computational runs
- `MiscellaneousCode` contains scripts for:
  - Machine agent experiments (e.g., caption generation using LLaVa and Flamingo, and ChatGPT-generated scanpaths)
  - Zero-shot machine judge experiments (e.g., ChatGPT as a zero-shot judge across the six tasks)
  - Generating image captions with new captioning models (e.g., LLaVa, Flamingo)
  - Using ChatGPT to generate captions and visual scanpaths
  - Deploying ChatGPT as a judge across all six tasks in zero-shot settings
  - Parsing text stimuli
  - Extracting computational result tables

To reproduce the analysis for a task:
- Navigate to the task of interest (`imageCaptioning`, `conversation`, etc.)
- Enter the `Plot` subfolder
- Run the notebooks in order, starting from `TaskN_PreCompileData.ipynb`
- Proceed sequentially with the `TaskN_RunX_Y.ipynb` notebooks, as they build upon each other

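
The notebook order described above can also be automated. The following is a minimal sketch and not part of the repository. It assumes Jupyter with `nbconvert` is installed, and that the run number can be read from the `_RunX_` part of each file name. The helper names (`notebook_sort_key`, `ordered_notebooks`, `run_all`) are hypothetical. Sorting notebooks that share a run number alphabetically is also an assumption.

```python
import re
import subprocess
from pathlib import Path


def notebook_sort_key(name: str):
    """Sort key: PreCompileData first, then RunX notebooks by run number X."""
    if "PreCompileData" in name:
        return (0, 0, name)
    match = re.search(r"_Run(\d+)_", name)
    if match:
        # Compare run numbers as integers so that Run10 sorts after Run2.
        return (1, int(match.group(1)), name)
    # Assumption: notebooks matching neither pattern are placed last.
    return (2, 0, name)


def ordered_notebooks(plot_dir: Path) -> list:
    """Return the notebooks in plot_dir in the documented execution order."""
    return sorted(plot_dir.glob("*.ipynb"), key=lambda p: notebook_sort_key(p.name))


def run_all(plot_dir: Path) -> None:
    """Execute each notebook in place, stopping at the first failure."""
    for notebook in ordered_notebooks(plot_dir):
        subprocess.run(
            ["jupyter", "nbconvert", "--to", "notebook",
             "--execute", "--inplace", notebook.name],
            cwd=plot_dir,   # run from inside the Plot folder
            check=True,     # raise if a notebook fails, so later ones do not run
        )
```

For example, calling `run_all(Path("imageCaptioning") / "Plot")` from the repository root would execute that task's notebooks in sequence. Stopping at the first failure matters because later notebooks build on the outputs of earlier ones.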
For any questions, please reach out to the maintainers or contributors of this repository.