186 changes: 186 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,186 @@
# MultiBench AI Coding Agent Instructions

## Project Overview

MultiBench is a standardized toolkit for multimodal deep learning research. It provides modular implementations of 20+ fusion methods across 15 datasets and 10 modalities (vision, audio, text, time-series, tabular, and more). The architecture separates concerns into **encoders** (unimodal processing), **fusion** (combining modalities), **heads** (task prediction), and **training structures** (optimization strategies).

## Critical Working Directory Requirement

**All scripts must run from the repository root**. The codebase uses `sys.path.append(os.getcwd())` everywhere. Always execute commands from `/home/bagustris/github/multibench` or the workspace root.
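
For example (a sketch assuming a clone at `~/github/multibench`; the `multimedia` example category follows the `examples/{category}/` convention):

```bash
cd ~/github/multibench   # repository root; adjust to your clone path
python examples/multimedia/avmnist_simple_late_fusion.py
```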

## Architecture Patterns

### Standard Module Pipeline

Every multimodal experiment follows this pattern:

```python
# 1. Add repository to path (REQUIRED at top of every script)
import sys, os
sys.path.append(os.getcwd())
import torch  # used in step 5 to load the saved model

# 2. Get dataloaders (train, valid, test)
from datasets.{dataset}.get_data import get_dataloader  # e.g. datasets.avmnist.get_data
traindata, validdata, testdata = get_dataloader(data_path)

# 3. Build components
encoders = [EncoderModal1().cuda(), EncoderModal2().cuda()] # List order matches modality order
fusion = Concat().cuda() # Or TensorFusion, MVAE, etc.
head = MLP(fusion_dim, hidden, output_classes).cuda()

# 4. Train using training structure
from training_structures.Supervised_Learning import train, test
train(encoders, fusion, head, traindata, validdata, epochs=20)

# 5. Test saved model
model = torch.load('best.pt').cuda()
test(model, testdata)
```

### Module Locations

- **Encoders**: `unimodals/common_models.py` - LeNet, MLP, GRU, LSTM, Transformer, ResNet, etc.
- **Fusions**: `fusions/common_fusions.py` - Concat, TensorFusion, NLGate, LowRankTensorFusion
- **Training**: `training_structures/Supervised_Learning.py` - Main training loop with automatic complexity tracking
- **Objectives**: `objective_functions/` - Custom losses beyond CrossEntropy/MSE (MFM, MVAE, CCA, contrastive)
- **Dataloaders**: `datasets/{dataset}/get_data.py` - Always named `get_dataloader()`, returns `(train, valid, test)` tuple
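
In practice, an experiment's import block mirrors these locations:

```python
# imports matching the module locations above
from unimodals.common_models import GRU, MLP                      # encoders
from fusions.common_fusions import Concat                         # fusion
from training_structures.Supervised_Learning import train, test   # training structure
from datasets.avmnist.get_data import get_dataloader              # dataloader
```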

### Dimension Matching Rules

1. **Encoder outputs** must match **fusion input expectations**
- `Concat()` expects list of tensors → outputs `sum(all_dims)`
- `TensorFusion()` with 2 modals of dims [d1, d2] → outputs `(d1+1)*(d2+1)`
2. **Fusion output** must match **head input**
- Example: `encoders=[LeNet(1,6,3), LeNet(1,6,5)]` → outputs `[48, 192]`
   - `Concat()` → 240-dim → `head=MLP(240, 100, 10)` (verified in the sketch after this list)
3. **Always pass `.cuda()` modules to training** - no CPU support in examples
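
As a quick sanity check of rule 2 (a minimal sketch using stand-in tensors instead of real encoder outputs; `Concat`'s list-in/concatenate-out behavior is as stated in rule 1):

```python
import torch
from fusions.common_fusions import Concat

batch = 8
out1 = torch.randn(batch, 48)     # stand-in for LeNet(1,6,3) output
out2 = torch.randn(batch, 192)    # stand-in for LeNet(1,6,5) output

fused = Concat()([out1, out2])    # concatenates along the feature dimension
assert fused.shape == (batch, 240)   # so the head is MLP(240, 100, 10)

# TensorFusion on the same dims would give (48+1) * (192+1) = 9457 features
```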

## Key Conventions

### Task Types
Specify via `task` parameter (default: `"classification"`):
- `"classification"` - CrossEntropyLoss, accuracy metrics
- `"regression"` - MSELoss
- `"multilabel"` - BCEWithLogitsLoss
- `"posneg-classification"` - Special affect dataset handling

### Custom Objectives

When using complex architectures (MFM, MVAE), provide additional modules and args:

```python
from objective_functions.objectives_for_supervised_learning import MFM_objective
from objective_functions.recon import sigmloss1dcentercrop

objective = MFM_objective(2.0, [sigmloss1dcentercrop(28,34), sigmloss1dcentercrop(112,130)], [1.0, 1.0])

train(encoders, fusion, head, traindata, validdata, 25,
      additional_optimizing_modules=decoders + intermediates,  # modules to optimize beyond encoders/fusion/head
      objective=objective,
      objective_args_dict={'decoders': decoders, 'intermediates': intermediates})  # extra args for the objective
```

The training structure automatically passes `pred`, `truth`, `args` to objectives. Custom objectives receive:
- `args['model']` - The full MMDL wrapper
- `args['reps']` - Encoder outputs before fusion
- Plus any keys from `objective_args_dict`
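
A custom objective can therefore read the pre-fusion encoder outputs directly. A hedged sketch (the norm penalty is illustrative, not an existing MultiBench objective):

```python
import torch

def reps_regularized_objective(weight):
    def actualfunc(pred, truth, args):
        base = torch.nn.CrossEntropyLoss()(pred, truth)
        # args['reps']: list of per-modality encoder outputs (before fusion)
        reg = sum(r.pow(2).mean() for r in args['reps'])
        return base + weight * reg
    return actualfunc
```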

### Robustness Evaluation

Robustness tests are integrated into standard `test()`. Pass `test_dataloaders_all` as dict:

```python
test_robust = {
    'clean': [test_clean_dataloader],
    'noisy_audio': [test_noise1, test_noise2, ...],  # multiple noise levels
    'missing_vision': [test_missing_dataloader]
}
test(model, test_robust, dataset='avmnist', no_robust=False)
```

Auto-generates relative/effective robustness plots in the working directory.

## Dataset Specifics

### Data Organization
- Download instructions per dataset in `datasets/{dataset}/README.md`
- Most datasets need manual download (restricted/large files)
- **MIMIC**: Restricted access - email yiweilyu@umich.edu with credentials
- **AV-MNIST**: Download tar from Google Drive, untar, pass directory path
- **Gentle Push**: Auto-downloads to `datasets/gentle_push/cache/` on first run

### Affect Datasets (MOSI, MOSEI, MUStARD, UR-FUNNY)

Special handling for variable-length sequences:

```python
# Packed sequences (default - recommended)
traindata, validdata, testdata = get_dataloader(pkl_path, data_type='mosi')
train(encoders, fusion, head, traindata, validdata, epochs, is_packed=True)

# Fixed-length padding (alternative)
traindata, validdata, testdata = get_dataloader(pkl_path, data_type='mosi', max_pad=True, max_seq_len=50)
train(encoders, fusion, head, traindata, validdata, epochs, is_packed=False)
```

## Testing & Development

### Running Tests
```bash
# From repository root only
pytest tests/ # Unit tests for modules
```

Tests use fixtures from `tests/common.py`. Some mock data paths are hardcoded to `/home/arav/MultiBench/MultiBench/`; these can be ignored when running the tests.

### Tracking Complexity

Automatic by default (`track_complexity=True` in train). Prints:
- Training: peak memory, total params, runtime
- Testing: total params, runtime

Disable with `track_complexity=False`. Requires `memory-profiler` package.
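
For example, a quick run without the profiling overhead:

```python
train(encoders, fusion, head, traindata, validdata, epochs=20, track_complexity=False)
```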

### Model Checkpointing

- `train()` auto-saves the model from the best validation epoch to `best.pt` (override with `save='custom.pt'`)
- Contains full `MMDL` wrapper with encoders, fusion, head
- Load with `model = torch.load('best.pt').cuda()`
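
A short sketch (the checkpoint name `avmnist_late.pt` is made up for illustration):

```python
train(encoders, fusion, head, traindata, validdata, epochs=20, save='avmnist_late.pt')
model = torch.load('avmnist_late.pt').cuda()   # full MMDL wrapper: encoders + fusion + head
test(model, testdata)
```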

## Common Gotchas

1. **Import errors**: Forgot `sys.path.append(os.getcwd())` at script top
2. **Dimension mismatch**: Encoder outputs don't sum to head input (check fusion output size)
3. **Wrong directory**: Running from subdirectory instead of repo root
4. **Missing `.cuda()`**: Models not moved to GPU before training
5. **Dataloader order**: Modalities must match encoder list order exactly
6. **Custom objectives**: Forgot `additional_optimizing_modules` for decoders/intermediates

## Adding New Components

### New Dataset
1. Create `datasets/{name}/get_data.py` with `get_dataloader(path, ...)` returning `(train, valid, test)` — see the skeleton after this list
2. Follow existing patterns (see `datasets/avmnist/get_data.py`)
3. Add example in `examples/{category}/{dataset}_{method}.py`
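
A minimal skeleton for step 1 (hedged: `_load_splits` is a hypothetical helper standing in for dataset-specific parsing):

```python
import sys, os
sys.path.append(os.getcwd())

from torch.utils.data import DataLoader

def get_dataloader(path, batch_size=40, num_workers=1):
    # _load_splits is hypothetical: parse files under `path` into three Datasets
    train_set, valid_set, test_set = _load_splits(path)
    return (
        DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=num_workers),
        DataLoader(valid_set, batch_size=batch_size, shuffle=False, num_workers=num_workers),
        DataLoader(test_set, batch_size=batch_size, shuffle=False, num_workers=num_workers),
    )
```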

### New Fusion/Encoder
1. Add class to `fusions/common_fusions.py` or `unimodals/common_models.py`
2. Inherit from `nn.Module`, document input/output shapes in the docstring (see the sketch after this list)
3. Test dimension flow: encoder outputs → fusion → head input
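
A hedged sketch following these steps (the weighted-sum fusion is illustrative, not an existing MultiBench module):

```python
import torch
from torch import nn

class WeightedSum(nn.Module):
    """Fuse same-dimension modality features with learned softmax weights.

    Input: list of tensors, each of shape (batch, d). Output: (batch, d).
    """

    def __init__(self, n_modalities):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_modalities))

    def forward(self, modalities):
        weights = torch.softmax(self.logits, dim=0)
        return sum(w * m for w, m in zip(weights, modalities))
```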

### New Objective
Return a closure that accepts `(pred, truth, args)`:

```python
import torch

def custom_objective(weight):
    def actualfunc(pred, truth, args):
        base_loss = torch.nn.CrossEntropyLoss()(pred, truth)
        custom_term = compute_custom(args['model'])  # placeholder helper; access modules via the args dict
        return base_loss + weight * custom_term
    return actualfunc
```
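
Pass the closure to `train()` via the `objective` argument (the weight 0.1 is arbitrary):

```python
train(encoders, fusion, head, traindata, validdata, 20, objective=custom_objective(0.1))
```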

## File Naming
- Example scripts: `{dataset}_{method}.py` (e.g., `mimic_baseline.py`, `avmnist_simple_late_fusion.py`)
- Saved models: `best.pt` (default), or custom via `save` parameter
- Always use `# noqa` after imports that depend on `sys.path.append(os.getcwd())`
48 changes: 48 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,48 @@
name: Deploy Sphinx Docs

on:
  push:
    branches: [main]
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          pip install sphinx sphinx-rtd-theme
          pip install -r requirements.txt

      - name: Build Sphinx docs
        run: |
          cd sphinx
          make html

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: sphinx/build/html

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
28 changes: 14 additions & 14 deletions .github/workflows/workflow.yml
@@ -8,28 +8,28 @@ jobs:
        os: [ubuntu-latest, macos-latest]
    env:
      OS: ${{ matrix.os }}
-     PYTHON: '3.7'
+     PYTHON: '3.10'
    steps:
    - uses: actions/checkout@master
    - name: Setup Python
      uses: actions/setup-python@master
      with:
-       python-version: 3.7
+       python-version: '3.10'
> Copilot AI (Dec 22, 2025): The workflow pins Python 3.10 but should specify a patch version for better reproducibility (e.g., '3.10.x' or '3.10.12'). Also consider testing against multiple Python versions if the library supports them, or document that only 3.10 is supported.
    - name: Generate coverage report
      run: |
        pip install pytest
        pip install pytest-cov
        pip install -r requirements.txt
        python -m pytest -s --cov-report xml --cov=utils --cov=unimodals/ --cov=training_structures/ --cov=robustness --cov=fusions/ --cov=objective_functions tests/test_*.py
-   - name: Upload coverage to Codecov
-     uses: codecov/codecov-action@v2
-     with:
-       token: ${{ secrets.CODECOV_TOKEN }}
-       directory: ./coverage/reports/
-       env_vars: OS,PYTHON
-       fail_ci_if_error: true
-       files: ./coverage.xml,./coverage2.xml
-       flags: unittests
-       name: codecov-umbrella
-       path_to_write_report: ./coverage/codecov_report.txt
-       verbose: true
+   # - name: Upload coverage to Codecov
+   #   uses: codecov/codecov-action@v2
+   #   with:
+   #     token: ${{ secrets.CODECOV_TOKEN }}
+   #     directory: ./coverage/reports/
+   #     env_vars: OS,PYTHON
+   #     fail_ci_if_error: true
+   #     files: ./coverage.xml,./coverage2.xml
+   #     flags: unittests
+   #     name: codecov-umbrella
+   #     path_to_write_report: ./coverage/codecov_report.txt
+   #     verbose: true
> Copilot AI (Dec 22, 2025), on lines +24 to +35: Commenting out the codecov upload rather than removing it entirely creates clutter. If codecov is no longer needed, remove the commented code. If it's temporarily disabled, add a comment explaining why and when it will be re-enabled.
89 changes: 89 additions & 0 deletions datasets/affect/GLOVE_README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
# GloVe Loader - TorchText Replacement

## Overview

This module provides a replacement for the deprecated `torchtext.vocab.GloVe` functionality. It was created to address compatibility issues between torchtext and newer PyTorch versions.

## Why This Replacement?

The original MultiBench codebase used `torchtext` for loading GloVe word embeddings. However:

1. **torchtext is deprecated** - The library has undergone significant API changes and older versions are incompatible with modern PyTorch
2. **Installation issues** - torchtext can be difficult to install and has binary compatibility issues
3. **Minimal usage** - MultiBench only needs GloVe embeddings, not the full torchtext library

## Features

- **Automatic fallback**: The code automatically detects if torchtext is unavailable and uses this replacement
- **Compatible API**: Provides the same interface as the old `torchtext.vocab.GloVe` for backward compatibility
- **Caching**: Downloads and caches embeddings to `~/.cache/glove/` for reuse
- **Multiple corpora**: Supports different GloVe corpora (840B, 6B, 42B, twitter.27B)

## Usage

The replacement is automatically used when torchtext is not available. No code changes are needed in most cases.

### Direct Usage

```python
from datasets.affect.glove_loader import GloVe

# Load GloVe embeddings (840B corpus, 300 dimensions)
vec = GloVe(name='840B', dim=300)

# Get embeddings for tokens
tokens = ['hello', 'world']
embeddings = vec.get_vecs_by_tokens(tokens, lower_case_backup=True)
# Returns: torch.Tensor of shape (2, 300)
```

### Via Compatibility Layer

```python
from datasets.affect import glove_loader

# Use the vocab interface (compatible with old torchtext API)
vec = glove_loader.vocab.GloVe(name='840B', dim=300)
```

## Supported GloVe Corpora

- **840B**: 840 billion tokens, 300d vectors (default, ~2GB download)
- **6B**: 6 billion tokens, 300d vectors (~800MB download)
- **42B**: 42 billion tokens, 300d vectors (~1.5GB download)
- **twitter.27B**: 27 billion tokens (Twitter), 200d vectors (~1.4GB download)

## Implementation Details

The loader:
1. Downloads GloVe embeddings from Stanford NLP servers on first use
2. Caches them locally to avoid re-downloading
3. Loads embeddings into memory as a PyTorch tensor
4. Provides fast lookup for word vectors
5. Returns zero vectors for unknown words
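
A hedged usage sketch of steps 4–5, using only the public API shown earlier (the token is deliberately nonsense):

```python
from datasets.affect.glove_loader import GloVe

vec = GloVe(name='6B', dim=300)                  # smaller corpus (sizes listed above)
emb = vec.get_vecs_by_tokens(['qzxjv_notaword'])
assert emb.shape == (1, 300)
assert emb.abs().sum() == 0                      # step 5: unknown words map to a zero vector
```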

## Files Modified

The following files have been updated to use this replacement:
- `datasets/affect/get_data.py`
- `datasets/affect/get_raw_data.py`
- `deprecated/dataloaders/affect/get_data_robust.py`

## Network Requirements

The first time you use GloVe embeddings, you'll need internet access to download them. After that, they're cached locally and no network is needed.

## Troubleshooting

**Problem**: Download fails with network error
- **Solution**: Check internet connectivity. The Stanford NLP servers may be temporarily unavailable.

**Problem**: Out of memory when loading embeddings
- **Solution**: The 840B corpus requires ~6GB RAM. Use a smaller corpus like 6B if memory is limited.

**Problem**: Wrong embedding dimensions
- **Solution**: Ensure the `dim` parameter matches the corpus (e.g., 6B supports 50, 100, 200, 300; 840B only supports 300)

## License

This code is part of MultiBench and follows the same license. GloVe embeddings are provided by Stanford NLP under their own license terms.
8 changes: 7 additions & 1 deletion datasets/affect/get_data.py
@@ -12,8 +12,14 @@
 sys.path.append(os.getcwd())
 
 import torch
-import torchtext as text
 from collections import defaultdict
+try:
+    import torchtext as text
+    TORCHTEXT_AVAILABLE = True
+except (ImportError, OSError):
+    # torchtext not available or incompatible, use our replacement
+    from . import glove_loader as text
+    TORCHTEXT_AVAILABLE = False
 from torch.nn.utils.rnn import pad_sequence
 from torch.utils.data import DataLoader, Dataset
