ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding Paper • 2512.13586 • Published 20 days ago • 87
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction Paper • 2512.04987 • Published about 1 month ago • 76
Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning Paper • 2510.24320 • Published Oct 28, 2025 • 19
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping Paper • 2510.18927 • Published Oct 21, 2025 • 83
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published Sep 10, 2025 • 56
BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset Paper • 2507.03483 • Published Jul 4, 2025 • 23
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published Jan 20, 2025 • 109
Distill Visual Chart Reasoning Ability from LLMs to MLLMs Paper • 2410.18798 • Published Oct 24, 2024 • 21
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments Paper • 2406.04151 • Published Jun 6, 2024 • 24