amano-k-lab/interpret_semrep

Probing the content of semantic representations in body-selective regions

Authors: Ryuto Yashiro, Masataka Sawayama, Ayumu Yamashita, & Kaoru Amano

This repository provides the code for the paper "Probing the content of semantic representations in body-selective regions".

Installation

To install the latest development version:

git clone https://github.com/amano-k-lab/interpret_semrep.git
cd interpret_semrep
pip install -e .

We provide version information for each library used in our analyses in pyproject.toml, but you can use different versions depending on your environment.

Configuration

Before running any scripts, you need to tell the code where your data lives by setting three environment variables: NSD_ROOT, DATA_ROOT, and RESULTS_ROOT.

Copy the provided template to create your own local configuration file:

cp .env.example .env

Then open .env and fill in your own paths:

NSD_ROOT=/path/to/NSD
DATA_ROOT=/path/to/data
RESULTS_ROOT=/path/to/results
  • NSD_ROOT: Root directory of the Natural Scenes Dataset (the folder containing nsddata, nsddata_betas, and related NSD files).
  • DATA_ROOT: Root directory for project input resources that are not part of the NSD release (e.g., bev_exp behavioral data and derived feature inputs).
  • RESULTS_ROOT: Root directory where outputs generated by scripts (correlation maps, model outputs, figures, and summary files) are saved.
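As an illustration of how a script might pick up these three variables, here is a minimal standard-library sketch. The helper names parse_env_text and load_roots are hypothetical and not part of this repository, which may load .env in a different way (for example through a dedicated package).

```python
import os
from pathlib import Path

REQUIRED_VARS = ("NSD_ROOT", "DATA_ROOT", "RESULTS_ROOT")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_roots(env_file=".env", environ=None):
    """Return the three root directories as Path objects.

    Values already set in the process environment take precedence over
    values read from the .env file (an assumption of this sketch).
    """
    environ = os.environ if environ is None else environ
    file_values = {}
    path = Path(env_file)
    if path.is_file():
        file_values = parse_env_text(path.read_text())

    roots, missing = {}, []
    for name in REQUIRED_VARS:
        value = environ.get(name) or file_values.get(name)
        if value:
            roots[name] = Path(value)
        else:
            missing.append(name)
    if missing:
        raise RuntimeError("Missing configuration: " + ", ".join(missing))
    return roots
```

Calling load_roots() from the repository root would then return a dictionary such as roots["NSD_ROOT"], and fail early with a clear message if any of the three paths is unset.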

Downloading necessary resources

Several resources are required to run scripts in this repository.

Natural Scenes Dataset (NSD)

All analyses in our paper center on this large-scale fMRI dataset (Allen et al., 2021). You can request access to the NSD via the form available here.

After you download the NSD to your local environment, set NSD_ROOT in your .env file accordingly.

Grounded Segment Anything

We use Grounded-Segment-Anything (Ren et al., 2024) to detect and segment people and faces in the NSD images. Install it from its GitHub repository.

Behavioral data

We publicly provide the data from our behavioral experiment (implied motion judgment task), including the indices of the NSD images used in the experiment and text files containing behavioral responses, available at figshare.

We recommend downloading the data to DATA_ROOT/bev_exp on your local machine. You can set DATA_ROOT in your .env file.

Freesurfer surface data

A set of Freesurfer surface files and an .svg file containing ROI information are required to visualize flattened cortical surface maps using pycortex.

We used surface files provided by a previous study (Wang et al., 2023; repository, files).

We make the .svg files publicly available at figshare. We recommend downloading them and placing them in /usr/local/share/pycortex/, following the official pycortex documentation.

Reproducing paper results

The scripts folder contains code to run all analyses. The following subsections list the scripts required for each analysis step. When multiple scripts are provided, they should be run sequentially in the order shown.

Training encoding models

Script | Description
--- | ---
scripts/save_betas.py | Builds ROI masks and extracts z-scored NSD surface betas for target regions.
scripts/save_embeddings.py | Extracts NSD image captions and encodes them into sentence embeddings using all-mpnet-base-v2.
scripts/train_encmodel.py | Trains subject-specific encoding models mapping caption embeddings to brain betas for a given subject and ROI, and evaluates prediction accuracy on a held-out test set.
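To make the idea of an encoding model concrete, the sketch below fits a linear map from embeddings to per-vertex responses and scores it by Pearson correlation on held-out images. It is an illustration on synthetic data, not the code in scripts/train_encmodel.py: the use of ridge regression, the value of alpha, the split sizes, and the function names fit_ridge and columnwise_pearson are all assumptions.

```python
import numpy as np


def fit_ridge(X_train, Y_train, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y."""
    n_features = X_train.shape[1]
    gram = X_train.T @ X_train + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X_train.T @ Y_train)


def columnwise_pearson(A, B):
    """Pearson r between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom


# Synthetic stand-ins for caption embeddings and z-scored betas.
rng = np.random.default_rng(0)
n_images, n_dims, n_vertices = 500, 32, 10
embeddings = rng.standard_normal((n_images, n_dims))
true_weights = rng.standard_normal((n_dims, n_vertices))
betas = embeddings @ true_weights + rng.standard_normal((n_images, n_vertices))

# Hold out the last 100 images as a test set.
X_train, X_test = embeddings[:400], embeddings[400:]
Y_train, Y_test = betas[:400], betas[400:]

weights = fit_ridge(X_train, Y_train, alpha=10.0)
accuracy = columnwise_pearson(X_test @ weights, Y_test)
print(accuracy.round(2))  # one held-out correlation per vertex
```

With real data, embeddings would hold one sentence embedding per image and betas one column per surface vertex in the chosen ROI.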

Co-occurrence analysis

Script | Description
--- | ---
scripts/perform_co_occur_analysis.py | Constructs caption co-occurrence matrices from NSD images ranked by fMRI response, performs NMF decomposition with BIC-based component selection, and visualizes the resulting semantic components.
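The two core steps, building a word co-occurrence matrix from captions and factorizing it with NMF, can be sketched as follows. This is a toy illustration rather than the repository's implementation: the captions are invented, the whitespace tokenization and the multiplicative-update solver are assumptions, the number of components is fixed by hand, and BIC-based selection is omitted.

```python
import numpy as np


def cooccurrence_matrix(captions):
    """Count how often two distinct words appear in the same caption."""
    vocab = sorted({w for c in captions for w in c.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for caption in captions:
        ids = sorted({index[w] for w in caption.lower().split()})
        for a in ids:
            for b in ids:
                if a != b:
                    counts[a, b] += 1
    return vocab, counts


def nmf(V, n_components, n_iter=500, seed=0, eps=1e-9):
    """Factor V ~ W @ H with non-negative W and H (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_components)) + eps
    H = rng.random((n_components, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Invented captions standing in for those of the top-ranked NSD images.
captions = [
    "man riding horse",
    "woman riding horse",
    "man throwing ball",
    "child throwing ball",
]
vocab, counts = cooccurrence_matrix(captions)
W, H = nmf(counts, n_components=2)

# Show the highest-loading words for each component.
for k in range(H.shape[0]):
    top = np.argsort(H[k])[::-1][:3]
    print(k, [vocab[i] for i in top])
```

Each row of H can be read as one semantic component, with large entries marking the words that tend to appear together.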

Correlation analysis

Script | Description
--- | ---
scripts/grounded_sam_nsd.py | Generates semantic segmentation masks for persons and faces in NSD images using Grounding DINO and SAM, saving mask data and metadata for downstream use.
scripts/perform_corr_analysis.py | Computes vertex-wise correlations between NSD brain responses and image/behavioral features, runs permutation-based significance tests, and reports FDR-corrected p-values and ratios of significant vertices within each region.
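The statistical pipeline described for scripts/perform_corr_analysis.py can be illustrated on synthetic data as below. This is a sketch under assumptions, not the repository's code: the permutation scheme (shuffling the feature across images), the two-sided test, the Benjamini-Hochberg procedure as the FDR method, the 0.05 threshold, and all function names are choices made for this example.

```python
import numpy as np


def vertexwise_corr(feature, betas):
    """Pearson r between one feature (n_images,) and each vertex column."""
    f = feature - feature.mean()
    b = betas - betas.mean(axis=0)
    return (f @ b) / (np.linalg.norm(f) * np.linalg.norm(b, axis=0))


def permutation_pvalues(feature, betas, n_perm=1000, seed=0):
    """Two-sided p-values obtained by shuffling the feature across images."""
    rng = np.random.default_rng(seed)
    observed = np.abs(vertexwise_corr(feature, betas))
    exceed = np.zeros(betas.shape[1])
    for _ in range(n_perm):
        null = np.abs(vertexwise_corr(rng.permutation(feature), betas))
        exceed += null >= observed
    # Add one to numerator and denominator so p is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p-values."""
    n = len(pvals)
    order = np.argsort(pvals)
    scaled = pvals[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adjusted, 0, 1)
    return out


# Synthetic region: 20 vertices, only the first five carry signal.
rng = np.random.default_rng(1)
n_images, n_vertices = 200, 20
feature = rng.standard_normal(n_images)
betas = rng.standard_normal((n_images, n_vertices))
betas[:, :5] += feature[:, None]

p_values = permutation_pvalues(feature, betas)
q_values = fdr_bh(p_values)
significant_ratio = float(np.mean(q_values < 0.05))
print(f"ratio of significant vertices: {significant_ratio:.2f}")
```

In this toy setup, feature plays the role of an image or behavioral feature (for example a per-image rating) and betas the responses of the vertices in one region.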

Variance partitioning

Script | Description
--- | ---
scripts/perform_vp_analysis.py | Decomposes NSD brain response variance into unique and shared components across three features (implied motion, number of people and body size) using variance partitioning, with permutation-based significance testing.
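As a rough illustration of variance partitioning with three features, the sketch below computes the variance uniquely explained by each feature as the drop in R^2 when that feature is left out of the full model, and lumps everything else into a single shared term. It is a simplified example on synthetic data: the in-sample ordinary least-squares fit, the single shared term, the variable names, and the simulated effect sizes are assumptions, and the permutation test is omitted.

```python
import numpy as np


def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return 1.0 - residual.var() / y.var()


def partition_variance(features, y):
    """Unique variance per feature plus the total shared remainder."""
    names = list(features)
    full = r_squared(np.column_stack([features[n] for n in names]), y)
    parts = {}
    for name in names:
        others = np.column_stack([features[n] for n in names if n != name])
        # Unique contribution: what the full model loses without this feature.
        parts["unique_" + name] = full - r_squared(others, y)
    parts["shared"] = full - sum(parts.values())
    parts["full"] = full
    return parts


# Synthetic data for one vertex; body size is correlated with people count.
rng = np.random.default_rng(0)
n_images = 500
n_people = rng.standard_normal(n_images)
features = {
    "implied_motion": rng.standard_normal(n_images),
    "n_people": n_people,
    "body_size": 0.8 * n_people + 0.6 * rng.standard_normal(n_images),
}
response = (
    features["implied_motion"]
    + features["n_people"]
    + 0.5 * rng.standard_normal(n_images)
)

parts = partition_variance(features, response)
for key, value in parts.items():
    print(f"{key}: {value:.3f}")
```

Because body size and number of people are correlated here, part of the explained variance cannot be attributed to either one alone and ends up in the shared term.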

Visualization

Script | Description
--- | ---
scripts/make_figures.py | Generates publication figures including cortical surface maps of correlations and variance partitioning, behavioral rating plots, and inter-feature correlation matrices.

Citation

License

This project is licensed under the MIT License.
