SA-IQA

Official release for “Beyond Pixels: Benchmarking and Reward-Based Assessing Framework for Visual Spatial Aesthetics.”

Project documentation is being updated.

SA-IQA evaluates the spatial aesthetics of interior images along four dimensions:

distortion
harmony
layout
lighting

This repository contains the SA-IQA code. The released model bundle and SA-BENCH dataset are hosted on Hugging Face:

Model bundle: AliKunYu/SA-IQA-model
Dataset: AliKunYu/SA-BENCH

Repository Structure

SA-IQA/
├── LICENSE
├── README.md
├── requirements.txt
├── tools/
│   ├── convert_csv_to_jsonl.py
│   ├── evaluate_correlation.py
│   ├── infer.py
│   ├── local_progress.py
│   ├── prompt_configs.py
│   └── train_sft.sh
├── SA-BENCH/                 # Downloaded from AliKunYu/SA-BENCH
│   ├── LICENSE
│   ├── README.md
│   ├── annotations/
│   └── images/
└── SA-IQA-model/             # Downloaded from AliKunYu/SA-IQA-model
    ├── LICENSE
    ├── README.md
    ├── Ovis2.5-9B/
    └── sa-iqa-prompt4/

Installation

conda create -n sa-iqa python=3.10 -y
conda activate sa-iqa
pip install -r requirements.txt

requirements.txt contains inference, scoring, evaluation, and data-conversion dependencies. It installs torch==2.5.1+cu124, the matching torchvision/torchaudio wheels, flash-attn==2.7.4.post1, and pins ms-swift to the 3.x API because the current inference code uses swift.llm.

Training with tools/train_sft.sh additionally requires a CUDA-compatible training stack such as DeepSpeed.

Download Model and Dataset

Install the Hugging Face Hub CLI if it is not already available:

pip install -U huggingface_hub

From the repository root, download the model bundle and dataset into the expected local directory names:

hf download AliKunYu/SA-IQA-model \
  --repo-type model \
  --local-dir SA-IQA-model

hf download AliKunYu/SA-BENCH \
  --repo-type dataset \
  --local-dir SA-BENCH

The local layout should be:

SA-IQA/
├── SA-IQA-model/
│   ├── Ovis2.5-9B/
│   └── sa-iqa-prompt4/
└── SA-BENCH/
    ├── annotations/
    └── images/

Keep these directory names if you want to use the default script arguments. tools/infer.py loads ./SA-IQA-model/sa-iqa-prompt4 by default for prompt 4, and tools/train_sft.sh uses ./SA-IQA-model/Ovis2.5-9B as the default base model.

If you download the files to another location, pass explicit paths through --model_path, --data_root, or --model.

Inference

Run inference with the released sa-iqa-prompt4 model:

python tools/infer.py --prompt_version 4 --mode all --dimension lighting

This command uses:

model: ./SA-IQA-model/sa-iqa-prompt4
data: ./SA-BENCH
test split: ./SA-BENCH/annotations/lighting_3k_test_prompt4.jsonl
output: ./results/lighting_prompt4_sa-iqa-prompt4.jsonl and ./results/lighting_prompt4_sa-iqa-prompt4.csv

Run all four dimensions:

python tools/infer.py --prompt_version 4 --mode all --dimension distortion
python tools/infer.py --prompt_version 4 --mode all --dimension harmony
python tools/infer.py --prompt_version 4 --mode all --dimension layout
python tools/infer.py --prompt_version 4 --mode all --dimension lighting

By default, outputs are written under results/.

For a quick smoke test, reduce the inference batch size if GPU memory is limited:

python tools/infer.py --prompt_version 4 --mode all --dimension lighting --batch_size 1 --max_batch_size 1

Step-By-Step Modes

Inference only:

python tools/infer.py --prompt_version 4 --mode infer --dimension lighting

Score conversion only:

python tools/infer.py --prompt_version 4 --mode score --dimension lighting

Per-dimension evaluation only:

python tools/infer.py --prompt_version 4 --mode eval --dimension lighting

Overall evaluation across all four dimensions:

python tools/evaluate_correlation.py --prompt_version 4

Training

Use the unified training entrypoint:

bash tools/train_sft.sh --prompt_version 4

The default training base model path is ./SA-IQA-model/Ovis2.5-9B, so download AliKunYu/SA-IQA-model as the full bundle before training. The default output path for prompt 4 is ./SA-IQA-model/sa-iqa-prompt4.

Prompt1, prompt2, and prompt3 are available through --prompt_version for comparison and ablation. Prompt4 is the recommended setting and corresponds to the released final model.

Data Conversion

tools/convert_csv_to_jsonl.py converts annotation CSV files into prompt-based JSONL files for training or evaluation.

Licenses

Code in this repository root and tools/ is licensed under the Apache License 2.0. See LICENSE.
SA-BENCH/ is licensed under the Apache License 2.0. See SA-BENCH/LICENSE.
SA-IQA-model/ is licensed under the Apache License 2.0. See SA-IQA-model/LICENSE.
SA-IQA-model/Ovis2.5-9B/ is the bundled base model copy and retains its original Apache License 2.0 license and notice files.

The repository-level CITATION.cff license field applies to the code release. Dataset and model usage are governed by their own license files.

Citation

If you find this project useful, please cite:

@inproceedings{gao2025beyond,
  title={Beyond Pixels: Benchmarking and Reward-Based Assessing Framework for Visual Spatial Aesthetics},
  author={Gao, Yuan and Song, Jin and Fei, Yiyun and Li, Gongzhe and Yang, Ruigao},
  booktitle={CVPR 2025 Workshop},
  year={2025}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SA-IQA

Repository Structure

Installation

Download Model and Dataset

Inference

Step-By-Step Modes

Training

Data Conversion

Licenses

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
SA-BENCH		SA-BENCH
SA-IQA-model		SA-IQA-model
.gitignore		.gitignore
CITATION.cff		CITATION.cff
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

SA-IQA

Repository Structure

Installation

Download Model and Dataset

Inference

Step-By-Step Modes

Training

Data Conversion

Licenses

Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages