Official Repository: A Comprehensive Benchmark for Logical Reasoning in MLLMs
Updated Jun 17, 2025 - Python
Live Deep Research Bench. A challenging, objective benchmark for deep research tasks.
Public preview of Society of Thought, a Qwen adapter that reasons through visible multi-persona debate, with benchmark evidence, raw traces, and demo.