Authors: Ryuto Yashiro, Masataka Sawayama, Ayumu Yamashita, & Kaoru Amano
This repository provides the code for the paper "Probing the content of semantic representations in body-selective regions".
To install the latest development version:
git clone https://github.com/amano-k-lab/interpret_semrep.git
cd interpret_semrep
pip install -e .
We provide version information for each library used in our analyses in pyproject.toml, but you can use different versions depending on your environment.
Before running any scripts, you need to tell the code where your data lives by setting three environment variables: NSD_ROOT, DATA_ROOT, and RESULTS_ROOT.
Copy the provided template to create your own local configuration file:
cp .env.example .env
Then open .env and fill in your own paths:
NSD_ROOT=/path/to/NSD
DATA_ROOT=/path/to/data
RESULTS_ROOT=/path/to/results
- NSD_ROOT: Root directory of the Natural Scenes Dataset (the folder containing nsddata, nsddata_betas, and related NSD files).
- DATA_ROOT: Root directory for project input resources that are not part of the NSD release (e.g., bev_exp behavioral data and derived feature inputs).
- RESULTS_ROOT: Root directory where outputs generated by scripts (correlation maps, model outputs, figures, and summary files) are saved.
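As a minimal sketch of how these three settings could be read and checked before running a script, the snippet below parses a plain KEY=VALUE .env file using only the standard library. The helper names `load_env_file` and `require_roots` are hypothetical and not part of this repository, which may load its configuration differently.

```python
import os
from pathlib import Path

REQUIRED_VARS = ("NSD_ROOT", "DATA_ROOT", "RESULTS_ROOT")


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_roots(env_file=".env"):
    """Return the three root paths, preferring variables already set in the shell."""
    file_values = load_env_file(env_file)
    roots, missing = {}, []
    for name in REQUIRED_VARS:
        value = os.environ.get(name) or file_values.get(name)
        if value:
            roots[name] = Path(value).expanduser()
        else:
            missing.append(name)
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return roots
```

Calling `require_roots()` from the repository root would then return a dictionary of `Path` objects, or raise an error naming any variable that is still unset.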
Several resources are required to run scripts in this repository.
All analyses in our paper center on this large-scale fMRI dataset (Allen et al., 2021). You can request access to the NSD via the form available here.
After you download the NSD in your local environment, please set NSD_ROOT in your .env file accordingly.
We use Grounded-Segment-Anything (Ren et al., 2024) to detect and segment people and faces in the NSD images. You need to install it from its GitHub repository.
We publicly provide the data from our behavioral experiment (implied motion judgment task), including the indices of the NSD images used in the experiment and text files containing behavioral responses, available at figshare.
We recommend downloading the data to DATA_ROOT/bev_exp on your local machine. You can set DATA_ROOT in your .env file.
A set of Freesurfer surface files and an .svg file containing ROI information are required to visualize flattened cortical surface maps using pycortex.
We used surface files provided by a previous study (Wang et al, 2023; repository, files).
We make the .svg files publicly available at figshare. We recommend downloading them and placing them in /usr/local/share/pycortex/, following the official pycortex documentation.
The scripts folder contains code to run all analyses. The following subsections list the scripts required for each analysis step. When multiple scripts are provided, they should be run sequentially in the order shown.
| Script | Description |
|---|---|
| scripts/save_betas.py | Builds ROI masks and extracts z-scored NSD surface betas for target regions. |
| scripts/save_embeddings.py | Extracts NSD image captions and encodes them into sentence embeddings using all-mpnet-base-v2. |
| scripts/train_encmodel.py | Trains subject-specific encoding models mapping caption embeddings to brain betas for a given subject and ROI, and evaluates prediction accuracy on a held-out test set. |
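
To illustrate the general idea of an encoding model, here is a minimal sketch on synthetic data. It assumes a closed-form ridge regression and scores each vertex by the Pearson correlation between predicted and measured responses on held-out images. The regression type, the regularization strength, the train/test split, and the function names are all assumptions made for illustration; see scripts/train_encmodel.py for what the repository actually does.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge weights W minimizing ||XW - Y||^2 + alpha * ||W||^2."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def columnwise_pearson(A, B):
    """Pearson r between matching columns of A and B (one value per vertex)."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    return (A * B).sum(axis=0) / (np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0))


# Synthetic stand-ins: real caption embeddings from all-mpnet-base-v2 have 768 dimensions.
rng = np.random.default_rng(0)
n_images, n_dims, n_vertices = 600, 32, 50
X = rng.standard_normal((n_images, n_dims))          # "embeddings"
W_true = rng.standard_normal((n_dims, n_vertices))
Y = X @ W_true + rng.standard_normal((n_images, n_vertices))  # "betas"

n_train = 500                                        # remaining images are held out
W = fit_ridge(X[:n_train], Y[:n_train], alpha=10.0)
accuracy = columnwise_pearson(X[n_train:] @ W, Y[n_train:])
print(accuracy.mean())
```

In practice the regularization strength would usually be chosen by cross-validation on the training set rather than fixed in advance.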
| Script | Description |
|---|---|
| scripts/perform_co_occur_analysis.py | Constructs caption co-occurrence matrices from NSD images ranked by fMRI response, performs NMF decomposition with BIC-based component selection, and visualizes the resulting semantic components. |
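
The sketch below shows one way NMF with BIC-based rank selection can be set up, using a synthetic non-negative matrix in place of a real co-occurrence matrix. It assumes Lee-Seung multiplicative updates for the Frobenius objective and a Gaussian-error form of the BIC that counts every entry of both factors as a free parameter. Both choices, and all function names, are assumptions for illustration and may differ from scripts/perform_co_occur_analysis.py.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-10):
    """Lee-Seung multiplicative updates minimizing the Frobenius error ||V - WH||."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def bic(V, W, H):
    """Gaussian-error BIC: n * ln(RSS / n) + k * ln(n), with k = entries in W and H."""
    n = V.size
    rss = np.sum((V - W @ H) ** 2)
    k = W.size + H.size
    return n * np.log(rss / n) + k * np.log(n)


# Synthetic symmetric, non-negative matrix built from three latent components.
rng = np.random.default_rng(3)
latent = rng.random((40, 3))
noise = rng.random((40, 40))
V = latent @ latent.T + 0.01 * (noise + noise.T)

# Fit each candidate rank and keep the one with the lowest BIC.
scores = {}
for rank in range(1, 7):
    W, H = nmf(V, rank)
    scores[rank] = bic(V, W, H)
best_rank = min(scores, key=scores.get)
print(best_rank)
```

Because multiplicative updates can converge to different local minima, running several random initializations per rank and keeping the best fit is a common precaution.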
| Script | Description |
|---|---|
| scripts/grounded_sam_nsd.py | Generates semantic segmentation masks for persons and faces in NSD images using Grounding DINO and SAM, saving mask data and metadata for downstream use. |
| scripts/perform_corr_analysis.py | Computes vertex-wise correlations between NSD brain responses and image/behavioral features, runs permutation-based significance tests, and reports FDR-corrected p-values and ratios of significant vertices within each region. |
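
The correlation analysis described above can be sketched on synthetic data as follows. This example assumes Pearson correlation, a two-sided permutation test that shuffles the feature across images, and Benjamini-Hochberg FDR correction. The number of permutations, the shuffling scheme, and the function names are assumptions for illustration, not a description of scripts/perform_corr_analysis.py.

```python
import numpy as np


def vertexwise_corr(feature, betas):
    """Pearson r between one feature (n_images,) and each vertex column of betas."""
    f = feature - feature.mean()
    b = betas - betas.mean(axis=0)
    return (f @ b) / (np.linalg.norm(f) * np.linalg.norm(b, axis=0))


def permutation_pvalues(feature, betas, n_perm=1000, seed=0):
    """Two-sided p-values obtained by shuffling the feature across images."""
    rng = np.random.default_rng(seed)
    observed = np.abs(vertexwise_corr(feature, betas))
    exceed = np.zeros(betas.shape[1])
    for _ in range(n_perm):
        null = np.abs(vertexwise_corr(rng.permutation(feature), betas))
        exceed += null >= observed
    # Add-one correction keeps p-values strictly above zero.
    return (exceed + 1) / (n_perm + 1)


def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p-values."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest p-value downwards.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adjusted, 0, 1)
    return out


# Synthetic data: 200 images, 40 vertices, the first 10 driven by the feature.
rng = np.random.default_rng(1)
feature = rng.standard_normal(200)
betas = rng.standard_normal((200, 40))
betas[:, :10] += 0.8 * feature[:, None]

p_fdr = fdr_bh(permutation_pvalues(feature, betas, n_perm=500))
ratio_significant = np.mean(p_fdr < 0.05)
print(ratio_significant)
```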
| Script | Description |
|---|---|
| scripts/perform_vp_analysis.py | Decomposes NSD brain response variance into unique and shared components across three features (implied motion, number of people and body size) using variance partitioning, with permutation-based significance testing. |
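
As a rough illustration of variance partitioning, the sketch below fits ordinary least-squares models on synthetic data for a single vertex. The unique contribution of a feature is taken as the drop in R² when that feature is removed from the full model, and everything the full model explains beyond the sum of unique parts is lumped together as shared variance. The use of in-sample OLS, the single combined shared term, and the function names are assumptions for illustration. The repository script may use a finer decomposition and a different estimator, and the permutation testing is omitted here.

```python
import numpy as np


def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)


def partition_variance(features, y):
    """features: dict mapping name -> (n_images,) array. Returns full, unique, shared R^2."""
    names = list(features)

    def fit(subset):
        return r_squared(np.column_stack([features[n] for n in subset]), y)

    full = fit(names)
    # Unique part of a feature: what is lost when only that feature is dropped.
    unique = {n: full - fit([m for m in names if m != n]) for n in names}
    shared = full - sum(unique.values())
    return {"full": full, "unique": unique, "shared": shared}


# Synthetic features: number of people is correlated with implied motion.
rng = np.random.default_rng(2)
n = 500
motion = rng.standard_normal(n)
n_people = 0.5 * motion + rng.standard_normal(n)
body_size = rng.standard_normal(n)
response = 1.0 * motion + 0.5 * n_people + rng.standard_normal(n)

parts = partition_variance(
    {"implied_motion": motion, "n_people": n_people, "body_size": body_size},
    response,
)
print(parts)
```

In this toy example body size does not drive the response, so its unique contribution should be close to zero, while the correlation between implied motion and number of people shows up as shared variance.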
| Script | Description |
|---|---|
| scripts/make_figures.py | Generates publication figures including cortical surface maps of correlations and variance partitioning, behavioral rating plots, and inter-feature correlation matrices. |
This project is licensed under the MIT License.