Sage Image Search Benchmarks

This repository contains the code and configuration for creating image retrieval benchmarks for Sage Image Search using the imsearch_benchmaker framework. It also contains other datasets that we use to benchmark text-to-image retrieval systems in various scientific domains.

Overview

This repository provides tools and pipelines for creating standardized benchmark datasets that evaluate text-to-image retrieval systems across scientific domains. Each benchmark follows the same pipeline architecture, which automates dataset creation from raw image collection through publication on Hugging Face.

Datasets

Dataset	Domain	Description	Final Dataset	Code
FireBench	Fire Science 🔥	A benchmark dataset for evaluating text-to-image retrieval systems in the domain of fire science.	FireBench on Hugging Face	FireBenchMaker
CommonObjectsBench	General Objects & Scenes 🌍	A benchmark dataset for evaluating text-to-image retrieval systems on general objects and common scenes.	CommonObjectsBench on Hugging Face	CommonObjectsBenchMaker
CloudBench	Nephology (Atmospheric Science) 🌥	A benchmark dataset for evaluating text-to-image retrieval systems in the domain of atmospheric science, specifically focused on clouds.	CloudBench on Hugging Face	CloudBenchMaker
Inquire	Biology 🌿	A benchmark dataset for evaluating text-to-image retrieval systems in the domain of biology.	INQUIRE-Benchmark-small on Hugging Face	Inquire
SageBench	Sage Continuum 🌲	A benchmark dataset for evaluating text-to-image retrieval systems on Sage Continuum sensor images when queries reference Sage metadata (vsn, zone, host, job, plugin, camera, project, address).	SageBench on Hugging Face	SageBenchMaker

Framework

All benchmarks in this repository use the imsearch_benchmaker framework, which provides:

Automated pipeline execution (preprocessing → annotation → query planning → judging → postprocessing)
Integration with adapters for vision annotation and query generation (OpenAI, Google, etc.)
Adapters for similarity scoring (apple/DFN5B-CLIP-ViT-H-14-378)
Hugging Face dataset preparation and upload
Exploratory data analysis tools

For detailed instructions, see the individual benchmark README files.
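
To make the similarity-scoring step more concrete, here is a minimal sketch of how a text query embedding can be compared against image embeddings with cosine similarity, which is the usual way CLIP-style models are used for retrieval. It is not the imsearch_benchmaker adapter API: the function names (`cosine_similarity`, `rank_images`) and the toy three-dimensional vectors are assumptions made for illustration, and a real model such as apple/DFN5B-CLIP-ViT-H-14-378 would produce much larger embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_images(query_embedding, image_embeddings):
    """Return image ids sorted from most to least similar to the query."""
    scores = {
        image_id: cosine_similarity(query_embedding, embedding)
        for image_id, embedding in image_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings, made up for illustration only.
toy_images = {
    "smoke.jpg": [0.9, 0.1, 0.0],
    "cloud.jpg": [0.1, 0.9, 0.0],
    "tree.jpg": [0.0, 0.1, 0.9],
}
toy_query = [1.0, 0.2, 0.0]
ranking = rank_images(toy_query, toy_images)
```

In this toy setup the query vector points mostly along the first axis, so the image whose embedding does the same is ranked first.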

Repository Structure

imsearch_benchmarks/
├── README.md                 # This file
├── docker/                   # Docker config to run the pipeline in a container
├── FireBenchMaker/          # FireBench benchmark
│   ├── README.md            # FireBench documentation
│   ├── config.toml          # Pipeline configuration
│   ├── dataset_card.md      # Dataset card for Hugging Face
│   ├── requirements.txt     # Python dependencies
│   ├── tools/               # Data collection scripts
│   │   ├── get_figlib.py
│   │   ├── get_sage.py
│   │   └── get_wildfire.py
│   └── ...
└── ...

Contributing

To add a new benchmark:

Create a new directory for your benchmark
Set up config.toml following the imsearch_benchmaker configuration format
Add data collection tools if needed
Create a README.md documenting your benchmark
If needed, add a new adapter for your benchmark

imsearch_benchmarks + imsearch_eval + imsearch_benchmaker

imsearch_benchmarks can be used together with imsearch_eval: the benchmarks in this repository give imsearch_eval a set of datasets against which to evaluate the performance of an image search system. To create a new benchmark, use the imsearch_benchmaker framework.
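
As a rough illustration of what evaluating a retrieval system against a benchmark involves, the sketch below computes recall@k from ranked results and relevance labels. This is a generic metric written for this example; it is an assumption that something like it is useful here, and it is not taken from imsearch_eval, whose actual metrics and API are described in its own documentation. The function names, query ids, and image ids are all hypothetical.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant images that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(results, ground_truth, k):
    """Average recall@k over every query listed in ground_truth."""
    scores = [
        recall_at_k(results.get(query_id, []), relevant, k)
        for query_id, relevant in ground_truth.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, made up for illustration: ranked image ids per query and the
# image ids labeled relevant for each query.
toy_results = {"q1": ["a", "b", "c"], "q2": ["d", "e", "f"]}
toy_ground_truth = {"q1": ["a", "c"], "q2": ["f", "g"]}

recall_top2 = mean_recall_at_k(toy_results, toy_ground_truth, k=2)
recall_top3 = mean_recall_at_k(toy_results, toy_ground_truth, k=3)
```

A query with no returned results contributes a recall of zero, so missing queries lower the average instead of being skipped.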
