Emergent Social Intelligence Risks in Generative Multi-Agent Systems Paper • 2603.27771 • Published 4 days ago • 46
Learn2Fold: Structured Origami Generation with World Model Planning Paper • 2603.29585 • Published Feb 2 • 10
DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing Paper • 2603.28713 • Published 3 days ago • 15
Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published 13 days ago • 30
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Paper • 2603.24472 • Published 8 days ago • 47
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 16 days ago • 134
Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration Paper • 2603.12226 • Published 21 days ago • 4
TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas Paper • 2603.16448 • Published 16 days ago • 58
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published 20 days ago • 145
T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search Paper • 2603.22341 • Published 12 days ago • 36
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG Paper • 2603.23497 • Published 9 days ago • 90
AnimalCLAP: Taxonomy-Aware Language-Audio Pretraining for Species Recognition and Trait Inference Paper • 2603.22053 • Published 10 days ago • 3
Group3D: MLLM-Driven Semantic Grouping for Open-Vocabulary 3D Object Detection Paper • 2603.21944 • Published 10 days ago • 26
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning Paper • 2603.21065 • Published 11 days ago • 77
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models Paper • 2603.22212 • Published 10 days ago • 124
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published 17 days ago • 184
ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models Paper • 2603.19466 • Published 14 days ago • 41