OpenMOSS presents a collection of our research on LLMs, supported by SII, Fudan and Mosi.
Release Date: June 2025
- 🤗 HuggingFace: MOSS-TTSD Models
- 💻 GitHub: Source Code & Implementation
Coming soon!
Language-Model-SAEs is a comprehensive, fully-distributed framework designed for training, analyzing and visualizing Sparse Autoencoders (SAEs), empowering scalable and systematic Mechanistic Interpretability research.
- 🤗 HuggingFace: Llama Scope
- 🔍 Neuronpedia: Llama Scope Visualization
- 💻 GitHub: Language-Model-SAEs
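The central object this framework trains is a sparse autoencoder over LLM activations. Below is a minimal PyTorch sketch of that idea, assuming a standard ReLU encoder/decoder with an L1 sparsity penalty; the class, function, and dimension names are illustrative and do not reflect the Language-Model-SAEs API.

```python
# Minimal sparse autoencoder (SAE) sketch; illustrative only, not the
# Language-Model-SAEs API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_sae: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_sae)  # activations -> sparse features
        self.decoder = nn.Linear(d_sae, d_model)  # sparse features -> reconstruction

    def forward(self, x: torch.Tensor):
        features = F.relu(self.encoder(x))        # non-negative feature activations
        recon = self.decoder(features)
        return recon, features

def sae_loss(x, recon, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 sparsity penalty on the feature activations.
    return F.mse_loss(recon, x) + l1_coeff * features.abs().sum(dim=-1).mean()

# Usage: train on residual-stream activations collected from an LLM
# (random tensors stand in for real activations here).
sae = SparseAutoencoder(d_model=4096, d_sae=32768)
acts = torch.randn(8, 4096)
recon, feats = sae(acts)
loss = sae_loss(acts, recon, feats)
loss.backward()
```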
The Embodied AI Team empowers large models to execute real-world tasks, aiming to automate tedious chores and unlock superhuman intelligence through environmental interaction. We believe true AI emerges from engaging with the physical world.
- VLABench arXiv GitHub ICCV 2025
- The first robot manipulation benchmark designed to evaluate the multi-dimensional abilities of general-purpose Vision-Language-Action models.
- Dual Preference Optimization for Embodied Task Planning arXiv GitHub ACL 2025
- A unified learning framework that equips embodied agents with stronger world modeling and embodied planning abilities via dual preference optimization (see the sketch after this list).
- World-Aware-Planning arXiv GitHub
- An innovative world-aware narrative enhancement approach that bridges the gap between high-level task instructions and the nuanced details of real-world environments.
- Embodied-Planner-R1 arXiv GitHub
- A reinforcement learning framework that enables LLMs to acquire embodied planning capabilities through autonomous exploration with sparse rewards, achieving breakthrough performance in planning tasks that require environmental interaction.
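For context on the preference-optimization entry above, the sketch below shows a standard DPO-style loss that such methods build on, assuming one scalar log-probability per trajectory. It is a simplified illustration, not the paper's exact dual objective, and every name in it is hypothetical.

```python
# DPO-style preference loss sketch; a generic illustration, not the
# paper's dual preference objective.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1):
    # Log-ratios of the policy vs. the frozen reference model for the
    # preferred (chosen) and dispreferred (rejected) plans.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # Push the preferred log-ratio above the dispreferred one.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Usage with dummy per-trajectory log-probabilities.
pc, pr = torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0])
rc, rr = torch.tensor([-13.0, -10.0]), torch.tensor([-13.5, -10.5])
print(dpo_loss(pc, pr, rc, rr))
```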
The SII-OpenMOSS New Architecture Team explores new architectures and paradigms for LLMs, with a focus on improving their long-context capability and efficiency.
- ReAttention arXiv GitHub ICLR 2025
- A training-free approach that enables LLMs to support an effectively infinite context through length extrapolation within a finite attention scope (see the sketch after this list).
- FourierAttention arXiv
- A training-free framework that exploits the heterogeneous roles of transformer head dimensions.
- LongLLaDA arXiv GitHub
- The first systematic investigation comparing the long-context performance of diffusion LLMs and traditional auto-regressive LLMs.
- Thus Spake Long-Context LLM arXiv GitHub
- A global picture of the lifecycle of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation.
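To make the finite-attention-scope idea behind ReAttention concrete, the sketch below scores every position in a long KV cache and runs standard attention over only a bounded top-k subset. This is a generic illustration under assumed shapes, not the paper's actual algorithm; the function and parameter names are hypothetical.

```python
# Finite-scope attention over a long KV cache; illustrative sketch only.
import torch
import torch.nn.functional as F

def finite_scope_attention(q, k_cache, v_cache, scope: int = 4096):
    # q: (d,); k_cache, v_cache: (T, d) where T may far exceed `scope`.
    scores = k_cache @ q                               # relevance of each cached position
    k_idx = scores.topk(min(scope, k_cache.size(0))).indices.sort().values
    k_sel, v_sel = k_cache[k_idx], v_cache[k_idx]      # bounded subset of the cache
    attn = F.softmax(k_sel @ q / q.size(-1) ** 0.5, dim=-1)
    return attn @ v_sel

# Usage with a toy 100k-token cache and a 4k attention scope.
d, T = 64, 100_000
q = torch.randn(d)
out = finite_scope_attention(q, torch.randn(T, d), torch.randn(T, d))
print(out.shape)  # torch.Size([64])
```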