Spatio-Temporal Landmark Detection via Selective Fine-Tuning of Echocardiography Foundation Models (NLDL 2026)

This repository accompanies the paper:

Preetraj Bhoodoo, Sarina Thomas, Elisabeth Wetzer, Anne Solberg, Guy Ben-Yosef.
Spatio-Temporal Landmark Detection via Selective Fine-Tuning of Echocardiography Foundation Models.
Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), PMLR 307, 2026.

Summary

We investigate whether modern video-based echocardiography foundation models can be adapted to precise spatio-temporal landmark detection, i.e. localizing left-ventricle (LV) contour landmarks at end-diastole (ED) and end-systole (ES), without extensive fine-tuning. We evaluate two strong encoders (EchoPrime and PanEcho) on EchoNet-Dynamic and compare:

Encoder regimes: frozen vs. selective unfreezing vs. full fine-tuning
Decoder heads: MLP vs. graph-based (GCN) decoding
Baselines: ResNet-18 (2D/3D), ViT-Base, MViTv2-Small

A key finding is that selectively unfreezing only the last few blocks can recover most of the performance of full fine-tuning, especially when paired with a GCN head and augmentation.
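As a rough illustration of what selective unfreezing means in practice, the sketch below freezes every parameter of an encoder and then re-enables gradients for its last k blocks only. It assumes PyTorch and uses a toy stack of linear layers as a stand-in for the encoder; the helper name unfreeze_last_blocks and the toy encoder are hypothetical and are not taken from this repository or from EchoPrime/PanEcho.

```python
import torch.nn as nn


def unfreeze_last_blocks(encoder_blocks: nn.Sequential, k: int) -> None:
    """Freeze all parameters, then make only the last k blocks trainable."""
    # Start from a fully frozen encoder.
    for param in encoder_blocks.parameters():
        param.requires_grad = False
    # Guard k == 0 explicitly: blocks[-0:] would select every block.
    if k > 0:
        for block in list(encoder_blocks.children())[-k:]:
            for param in block.parameters():
                param.requires_grad = True


# Toy stand-in for an encoder: six blocks with 8*8 + 8 = 72 parameters each.
encoder = nn.Sequential(*[nn.Linear(8, 8) for _ in range(6)])
unfreeze_last_blocks(encoder, k=2)

trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
total = sum(p.numel() for p in encoder.parameters())
print(f"trainable parameters: {trainable} / {total}")
```

Only parameters with requires_grad set to True receive gradients, so an optimizer would typically be built from that filtered subset plus the decoder head.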

Main architecture

Below is the high-level pipeline (encoder + landmark decoder head) used in the paper.

Figure: Overview of the experimental setup: a 16-frame clip sampled between ED and ES → foundation-model (FM) encoder → MLP or GCN head → ED/ES landmarks.
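One plausible way to pick the 16 frames of a clip is to space them evenly between the ED and ES frame indices, keeping both endpoints. The sketch below shows that idea only; the function name sample_clip_indices and the even-spacing rule are assumptions, and the sampling used in the repository's data loader may differ.

```python
from typing import List


def sample_clip_indices(ed: int, es: int, num_frames: int = 16) -> List[int]:
    """Return num_frames indices evenly spaced from ed to es, endpoints included."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    step = (es - ed) / (num_frames - 1)
    # Rounding may repeat indices when the ED-ES interval is shorter than the clip.
    return [int(round(ed + i * step)) for i in range(num_frames)]


indices = sample_clip_indices(ed=0, es=30)
print(indices)
```

The same function also handles annotations where ES precedes ED, in which case the indices simply run in descending order.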

Data

Experiments use EchoNet-Dynamic (apical-4-chamber echocardiography videos with ED/ES LV contour annotations).

1. Download EchoNet-Dynamic

Request access and download from the official source:
https://echonet.github.io/dynamic/

After downloading, your dataset directory should contain:

/path/to/echonet-dynamic/
    FileList.csv            # per-video metadata (split, EF, ESV, EDV)
    VolumeTracings.csv      # LV contour trace annotations per frame
    Videos/
        0X1A0A263B22CCD966.avi
        ...
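
Before preprocessing, it can help to confirm that the download matches the layout above. The following is a small sketch using only the standard library; the helper name missing_entries is hypothetical and is not part of this repository.

```python
from pathlib import Path
from typing import List


def missing_entries(root: str) -> List[str]:
    """List the expected EchoNet-Dynamic entries that are absent under root."""
    root_path = Path(root)
    expected_files = ["FileList.csv", "VolumeTracings.csv"]
    missing = [name for name in expected_files if not (root_path / name).is_file()]
    videos = root_path / "Videos"
    if not videos.is_dir():
        missing.append("Videos/")
    elif not any(videos.glob("*.avi")):
        # The folder exists but holds no .avi files.
        missing.append("Videos/*.avi")
    return missing
```

Calling missing_entries("/path/to/echonet-dynamic") returns an empty list when the two CSV files and at least one .avi video are present.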

2. Preprocess the dataset

Run the preprocessing script from the repo root. This extracts ED/ES frames, resizes them to 112×112, converts the contour tracings to 40 keypoints, and saves per-cycle .npy/.npz files used by the data loader:

python data/preprocess_echonet.py \
    --input_dir  /path/to/echonet-dynamic \
    --output_dir /path/to/echonet-dynamic/preprocessed \
    --save_files data/files/filenames
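
To give an idea of what converting a contour tracing to a fixed number of keypoints can involve, the sketch below resamples an ordered, open polyline to 40 points equally spaced by arc length. This is an illustrative assumption, not the conversion implemented in data/preprocess_echonet.py; the function name resample_polyline is hypothetical, and the input is assumed to be already ordered along the contour.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample_polyline(points: List[Point], n: int = 40) -> List[Point]:
    """Return n points equally spaced by arc length along an open polyline."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at each input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out: List[Point] = []
    seg = 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        x0, y0 = points[seg]
        x1, y1 = points[seg + 1]
        # Linear interpolation inside the segment.
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out
```

For example, resample_polyline([(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)], n=5) places points at the two ends, at the corner, and at the midpoint of each side.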

Output structure after preprocessing:

preprocessed/
  40/
    frames/                     # individual ED/ES frames as PNG (112x112x3)
      <ID>_<frame>.png
    annotations/                # per-frame keypoints and masks
      <ID>_<frame>.npz          # keys: 'kpts' (40,2), 'mask' (112,112), 'ef'
    cycle/
      frames/                   # full video clips as numpy arrays
        <ID>.npy                # shape: (3, num_frames, H, W), uint8
      annotations/              # per-cycle annotations
        <ID>.npz                # keys: 'kpts' (2,40,2), 'fnum', 'ef', 'vol1', 'vol2'
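
The per-cycle files can be inspected with NumPy. The sketch below loads one clip and its annotation and checks the shapes listed above; it assumes NumPy is installed, the helper name load_cycle is hypothetical, and the meaning of the first axis of 'kpts' (presumably the two annotated frames) is an assumption rather than something this README states.

```python
import numpy as np


def load_cycle(frames_path, ann_path, num_kpts: int = 40):
    """Load one preprocessed cycle and validate the documented shapes."""
    clip = np.load(frames_path)  # expected shape: (3, num_frames, H, W), uint8
    with np.load(ann_path) as ann:
        kpts = ann["kpts"]  # expected shape: (2, num_kpts, 2)
    if clip.ndim != 4 or clip.shape[0] != 3 or clip.dtype != np.uint8:
        raise ValueError(f"unexpected clip array: {clip.shape}, {clip.dtype}")
    if kpts.shape != (2, num_kpts, 2):
        raise ValueError(f"unexpected keypoint array: {kpts.shape}")
    return clip, kpts
```

A call such as load_cycle(".../cycle/frames/<ID>.npy", ".../cycle/annotations/<ID>.npz") then returns the clip and its keypoints, or raises ValueError if the file does not match the expected layout.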

The script also writes filename list .txt files to data/files/filenames/cyclic/, which are used by the data loaders to define train/val/test splits.

3. (Optional) Create data subsets

To reproduce experiments with reduced training data (0.5%, 1%, 2%, 10%, 25%, 50%), generate subset filename lists:

python data/dataset_split.py --base-dir data/files/filenames/cyclic --seed 42

This creates files such as echonet_cycle_train_10_filenames.txt alongside the full split files. A subset list is selected by setting dataset: EchoNet_10 (and likewise for the other percentages) in the config.
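
The idea behind a seeded subset can be sketched as follows: draw a fixed percentage of the training filenames with a seeded random generator so that the same list is produced on every run. This is a minimal illustration, not the logic of data/dataset_split.py, which may differ (for example in rounding or in whether smaller subsets are nested inside larger ones); the function name sample_subset is hypothetical.

```python
import random
from typing import List


def sample_subset(filenames: List[str], percent: float, seed: int = 42) -> List[str]:
    """Return a reproducible random subset holding about percent% of filenames."""
    if not filenames:
        return []
    # Keep at least one item so very small percentages are never empty.
    k = max(1, round(len(filenames) * percent / 100))
    rng = random.Random(seed)
    return sorted(rng.sample(filenames, k))


train_ids = [f"video_{i:03d}" for i in range(200)]  # placeholder IDs, not real data
subset_10 = sample_subset(train_ids, percent=10)
print(len(subset_10))
```

Using a dedicated random.Random instance keeps the draw independent of any other use of the global random state.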

Usage

Set dataset_folder in your config YAML to the root of your EchoNet-Dynamic download (e.g. /path/to/echonet-dynamic). The data loader expects the preprocessed/ subfolder to exist at that path.

Train:

python tools/train_landmarks.py --cfg configs/resnet18_echonet.yaml

Eval:

python tools/eval_landmarks.py --model_checkpoint /path/to/experiment_folder

License

The paper is open-access under CC BY 4.0.
Code licensing will be specified upon release.

Citation

If you use this work, please cite:

@inproceedings{bhoodoo2026echovlmlandmarks,
  title     = {Spatio-Temporal Landmark Detection via Selective Fine-Tuning of Echocardiography Foundation Models},
  author    = {Bhoodoo, Preetraj and Thomas, Sarina and Wetzer, Elisabeth and Solberg, Anne and Ben-Yosef, Guy},
  booktitle = {Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)},
  series    = {Proceedings of Machine Learning Research (PMLR)},
  volume    = {307},
  year      = {2026}
}
