ifc-bench 🏗️💡

This repository is superseded by V2 of IFC-Bench, which is hosted on Hugging Face: https://huggingface.co/datasets/sylvainhellin/ifc-bench. The move was made because the Hugging Face free tier has much more generous rate limits for hosting large datasets. V2 contains all the questions and models from V1 (this archived repository) plus many more: 21 BIM projects with 37 IFC models across architectural, structural, MEP and speciality disciplines, and 1,027 question-answer pairs covering diverse BIM information retrieval tasks. I therefore recommend using V2 from now on.

A benchmark dataset for evaluating BIM (Building Information Modeling) comprehension and reasoning capabilities in AI systems. Provides curated IFC models with question-answer pairs for testing BIM-related AI implementations.

Dataset snapshot:

	question	answer	ifc_model	project
0	What is the total gross floor area of the buil...	The total gross floor area of the building is ...	arc	duplex
1	What is the height of the ceiling in room A203?	The height of the ceiling in room A203 is 2.58 m	arc	duplex
2	Give me the name of all the rooms in the build...	The list of all the rooms in the building is: ...	arc	duplex
3	How many windows are there on the north facade?	I cannot calculate the number of window on th...	arc	duplex
4	What is the width of the door 1hOSvn6df7F8_7Gc...	The width of the door is 1.25 m	arc	duplex

Features

Versioned datasets: Currently at V1 with 2 BIM projects and 105 QA pairs
Diverse question types:
- Spatial reasoning
- Element properties
- System relationships
- Construction sequencing
Rich contextual data:
- Original IFC files
- Model snapshots
- Architectural descriptions
- License documentation
Machine-readable format: CSV dataset with clear column structure

Dataset Structure

ifc-bench/
├── projects/                  # Directory for all projects
│   ├── duplex/                # First project
│   │   ├── arc.ifc            # Architecture model
│   │   ├── mep.ifc            # MEP model
│   │   ├── license.txt        # Project license
│   │   ├── model_card.csv     # Project metadata
│   │   └── snapshot.png       # Visual snapshot
│   └── dental_clinic/         # Second project
│       ├── arc.ifc            # Architecture model
│       ├── str.ifc            # Structural model
│       ├── mep.ifc            # MEP model
│       └── ...                # Other project files
├── questions/                  # Question-answer pairs
│   └── ifc-bench-v1.csv       # Primary dataset
└── docs/                      # Supplementary materials
    └── CONTRIBUTING.md        # Contribution guidelines
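
As a quick way to check a local clone against the layout above, the following sketch lists the IFC files found under each project folder. It assumes only the `projects/<project>/<model>.ifc` convention shown in the tree; the function name `list_ifc_models` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_ifc_models(root="."):
    """Map each project folder name to the sorted stems of its .ifc files.

    Assumes the layout <root>/projects/<project>/<model>.ifc.
    Returns an empty dict if <root>/projects does not exist.
    """
    projects_dir = Path(root) / "projects"
    models = {}
    if not projects_dir.is_dir():
        return models
    for project_dir in sorted(p for p in projects_dir.iterdir() if p.is_dir()):
        # Path.stem drops the ".ifc" suffix, e.g. "arc.ifc" -> "arc"
        models[project_dir.name] = sorted(f.stem for f in project_dir.glob("*.ifc"))
    return models
```

Calling `list_ifc_models()` from the repository root should return one entry per project folder, with the model stems matching the values used in the `ifc_model` column.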

Models Overview

🏠 Duplex Model

Disciplines: Architectural, MEP
License: CC-BY-4.0
Complexity: Simple
Source: buildingSMART Sample Files

🏥 Dental Clinic

Disciplines: Architectural, Structural, MEP
License: CC-BY-4.0
Complexity: Intermediate
Source: buildingSMART Sample Files

Getting Started

Prerequisites

Python 3.8+
pandas (for data analysis)
ifcopenshell (optional, for working with IFC files)

Install requirements:

pip install pandas ifcopenshell

Quick Start

git clone https://github.com/sylvainHellin/ifc-bench.git
cd ifc-bench

Using the Dataset

import pandas as pd

# Load dataset
df = pd.read_csv('questions/ifc-bench-v1.csv')

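# Explore questions by project (the project name is stored in the 'project'
# column; 'ifc_model' holds the discipline model, e.g. 'arc')
duplex_questions = df[df['project'] == 'duplex']
print(f"Duplex project has {len(duplex_questions)} questions")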

# Sample question format
sample_q = df.iloc[0]
print(f"""
Question: {sample_q.question}
Answer: {sample_q.answer}
Model: {sample_q.ifc_model}
Project: {sample_q.project}
""")
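
To go from a question row to the IFC file it refers to, a sketch along these lines may help. It assumes that the `project` and `ifc_model` values map directly onto `projects/<project>/<ifc_model>.ifc`, as the dataset structure suggests; the helper names `ifc_path_for` and `count_elements` are hypothetical. `ifcopenshell` is imported lazily because it is an optional dependency.

```python
from pathlib import Path


def ifc_path_for(project, ifc_model, root="."):
    """Build the expected path of an IFC file from the two dataset columns.

    Assumption: files follow projects/<project>/<ifc_model>.ifc.
    """
    return Path(root) / "projects" / project / f"{ifc_model}.ifc"


def count_elements(path, ifc_class="IfcDoor"):
    """Open an IFC file and count the entities of one IFC class."""
    import ifcopenshell  # optional dependency, only needed here

    model = ifcopenshell.open(str(path))
    return len(model.by_type(ifc_class))
```

For example, `ifc_path_for(sample_q.project, sample_q.ifc_model)` gives the file to pass to `count_elements` for the first row of the dataset.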

Dataset Columns

Column	Description	Example
`question`	Natural language question	"What is the total gross floor area of the building?"
`answer`	Ground truth answer	"The total gross floor area of the building is 354.67 sqm"
`ifc_model`	IFC model identifier within the project (discipline file name without extension)	"arc"
`project`	Project the model belongs to	"duplex"
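
The column layout above can be checked programmatically before running an evaluation. The sketch below uses the standard-library `csv` module rather than pandas so that it runs without extra dependencies; `EXPECTED_COLUMNS` and `check_columns` are hypothetical names, and extra columns (such as an index column) are tolerated.

```python
import csv

# Column names as documented in the table above
EXPECTED_COLUMNS = ["question", "answer", "ifc_model", "project"]


def check_columns(csv_file):
    """Return (missing_columns, incomplete_row_indices) for an open CSV file.

    A row counts as incomplete if any expected column that is present in
    the header is empty or whitespace-only. Row indices are 0-based and
    exclude the header line.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    present = [c for c in EXPECTED_COLUMNS if c in header]
    incomplete = [
        i
        for i, row in enumerate(reader)
        if any(not (row.get(c) or "").strip() for c in present)
    ]
    return missing, incomplete
```

A typical call would open `questions/ifc-bench-v1.csv` with `newline=""` and `encoding="utf-8"` and pass the file object to `check_columns`; two empty lists indicate that every documented column is present and filled.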

Dataset Integrity

Verify dataset integrity using SHA-256 checksum:

shasum -a 256 questions/ifc-bench-v1.csv
# Expected output: f67a48770d74b6e0ff0868c923c3e1d976110350b2c439564d7ceccc16a46f35
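
On systems without `shasum`, the same check can be done from Python. This is a minimal sketch using `hashlib`; the expected digest is the one quoted above, while the names `sha256_of` and `verify` are hypothetical.

```python
import hashlib

# Digest quoted in this README for questions/ifc-bench-v1.csv
EXPECTED_SHA256 = "f67a48770d74b6e0ff0868c923c3e1d976110350b2c439564d7ceccc16a46f35"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

Running `verify("questions/ifc-bench-v1.csv")` from the repository root should return `True` for an unmodified copy of the dataset.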

Contributing

We welcome contributions through:

🆕 New IFC models (with permissive licensing)
➕ Additional QA pairs for existing models
✏️ Documentation improvements
🐛 Error corrections in existing answers

Please see our Contribution Guidelines (docs/CONTRIBUTING.md) for details.

License

Dataset: Licensed under CC BY 4.0
Models: Inherit their original licenses (see individual model folders)

Citation

If you use this dataset in research, please cite:

@misc{ifc-bench,
  title = {{ifc-bench}: {BIM} Comprehension \& Reasoning Benchmark Dataset},
  author = {Sylvain Hellin},
  year = {2024},
  url = {https://github.com/sylvainHellin/ifc-bench},
  note = {Version 1.0}
}

Acknowledgments

Special thanks to:

buildingSMART International for providing sample files
The openBIM community for quality assurance
Early adopters for feedback and validation

📌 Maintainer: Sylvain Hellin | 📧 Contact: sylvain.hellin@tum.de | 🐛 Issue Tracker: GitHub Issues
