MARVEL-FX3D

Text-to-3D Textured Mesh Generation

Project Page · Paper · arXiv · Dataset Explorer

Sankalp Sinha* · Mohammad Sadil Khan* · Muhammad Usama · Shino Sam · Didier Stricker · Sk Aziz Ali · Muhammad Zeshan Afzal

* Equal contribution

CVPR 2025

Overview

MARVEL-FX3D is the text-to-3D generation component of the MARVEL-40M+ paper. The paper also introduces MARVEL-40M+, a dataset of 40 million multi-level text annotations for more than 8.9 million 3D assets. The dataset and its annotation pipeline are described in the main paper.


📦 MARVEL-40M+ Dataset

MARVEL-FX3D is trained on MARVEL-40M+, the largest 3D captioning dataset to date.

| Property | Value |
|---|---|
| Total Annotations | 40 million |
| 3D Assets | 8.9 million+ |
| Source Datasets | 7 major 3D repositories |
| Annotation Levels | Detailed (150–200 words) → Tags (10–20 words) |
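
To make the two ends of the annotation hierarchy concrete, here is a minimal sketch that checks a caption against the word ranges listed in the table above. `LEVEL_WORD_RANGES` and `fits_level` are hypothetical names chosen for illustration and are not part of the dataset's tooling. Only the two ranges stated in the table are encoded; any intermediate levels are left out because this README does not give their ranges.

```python
# Word-count ranges copied from the table above.
# The dictionary and function names are illustrative assumptions.
LEVEL_WORD_RANGES = {
    "detailed": (150, 200),
    "tags": (10, 20),
}


def fits_level(caption: str, level: str) -> bool:
    """Return True if the caption's whitespace-separated word count
    falls inside the range listed for the given level."""
    low, high = LEVEL_WORD_RANGES[level]
    n_words = len(caption.split())
    return low <= n_words <= high


if __name__ == "__main__":
    tags = ("motorcycle, black leather seat, dual exhaust pipes, "
            "chrome, two wheels, cruiser")
    print(fits_level(tags, "tags"))
```

A check like this can be useful as a quick sanity filter when sampling captions, though a plain word count says nothing about caption quality.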

The multi-stage annotation pipeline combines open-source multi-view VLMs and LLMs with human metadata from source datasets, reducing hallucinations and improving domain-specific accuracy.

🔗 Dataset on Hugging Face: MARVEL-40M+


🛠️ Installation

# Clone the repository
git clone https://github.com/SadilKhan/MARVEL-FX3D.git
cd MARVEL-FX3D

# Create a conda environment (recommended)
conda create -n marvel python=3.10
conda activate marvel

# Install dependencies
pip install -r requirements.txt

Requirements: Python ≥ 3.10, PyTorch ≥ 2.0, CUDA ≥ 11.8 (recommended)
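
The requirements above can be verified before running anything heavy. The following is a minimal sketch, not a script shipped with this repository: `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical helpers. It assumes that `torch.version.cuda` (the CUDA version PyTorch was built against, or `None` for CPU-only builds) is an acceptable proxy for the CUDA requirement.

```python
import re
import sys


def version_tuple(version: str) -> tuple:
    """Parse the leading numeric part of a version string,
    e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Compare a version string against a minimum, padding with zeros
    so that '2' and (2, 0) compare as equal."""
    parsed = version_tuple(version)
    width = max(len(parsed), len(minimum))
    parsed += (0,) * (width - len(parsed))
    minimum += (0,) * (width - len(minimum))
    return parsed >= minimum


def check_environment() -> dict:
    """Report which of the listed requirements the current environment meets."""
    report = {"python": sys.version_info[:2] >= (3, 10)}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so neither check can pass.
        report["torch"] = False
        report["cuda"] = False
        return report
    report["torch"] = meets_minimum(torch.__version__, (2, 0))
    cuda_version = torch.version.cuda  # None for CPU-only builds
    report["cuda"] = cuda_version is not None and meets_minimum(cuda_version, (11, 8))
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

Since CUDA is listed as recommended rather than required, a failed `cuda` entry is a warning, not necessarily a blocker.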


🚀 Quick Start

# Generate a 3D textured mesh from a text prompt
python generate.py --prompt "A Harley Davidson motorcycle with a black leather seat and dual exhaust pipes"

The output mesh will be saved to output/ by default.
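
To generate meshes for several prompts, the command above can be wrapped in a small loop. This is a sketch under stated assumptions: it relies only on the `--prompt` flag shown in the Quick Start, assumes it is run from the repository root, and uses hypothetical helper names (`build_command`, `generate_all`). Whether successive runs overwrite files in `output/` is not documented here, so check the output directory after the first run.

```python
import shlex
import subprocess
import sys


def build_command(prompt: str) -> list:
    """Build the argument list for one generate.py call.
    Only the --prompt flag is documented in this README."""
    return [sys.executable, "generate.py", "--prompt", prompt]


def generate_all(prompts, dry_run=False):
    """Run generate.py once per prompt and return the commands used.
    With dry_run=True nothing is executed."""
    commands = [build_command(p) for p in prompts]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if a run fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    prompts = [
        "A Harley Davidson motorcycle with a black leather seat and dual exhaust pipes",
        "A wooden rocking chair with a woven seat",
    ]
    # Dry run: print the commands instead of executing them.
    for cmd in generate_all(prompts, dry_run=True):
        print(shlex.join(cmd))
```

Passing the prompt as a list element rather than through a shell string avoids quoting problems with prompts that contain apostrophes or quotes.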

📜 Citation

If you find MARVEL-FX3D or MARVEL-40M+ useful in your research, please cite:

@InProceedings{Sinha_2025_CVPR,
    author    = {Sinha, Sankalp and Khan, Mohammad Sadil and Usama, Muhammad and Sam, Shino and Stricker, Didier and Ali, Sk Aziz and Afzal, Muhammad Zeshan},
    title     = {MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025},
}

📄 License

This project is released under the MIT License.


If you have questions, feel free to open an issue or reach out via the project page.

About

[CVPR 2025] Official Implementation of Marvel-FX3D from MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation
