Softpick: No Attention Sink, No Massive Activations with Rectified Softmax Paper • 2504.20966 • Published Apr 29, 2025 • 31
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10, 2025 • 190
Towards a Unified View of Large Language Model Post-Training Paper • 2509.04419 • Published Sep 4, 2025 • 75
Set Block Decoding is a Language Model Inference Accelerator Paper • 2509.04185 • Published Sep 4, 2025 • 53
Symbolic Graphics Programming with Large Language Models Paper • 2509.05208 • Published Sep 5, 2025 • 46
LLM-based Optimization of Compound AI Systems: A Survey Paper • 2410.16392 • Published Oct 21, 2024 • 16
Retrieval-augmented reasoning with lean language models Paper • 2508.11386 • Published Aug 15, 2025 • 5
Advances in Speech Separation: Techniques, Challenges, and Future Trends Paper • 2508.10830 • Published Aug 14, 2025 • 15
Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation Paper • 2508.12040 • Published Aug 16, 2025 • 14
Evaluating Podcast Recommendations with Profile-Aware LLM-as-a-Judge Paper • 2508.08777 • Published Aug 12, 2025 • 15
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129
Refining Contrastive Learning and Homography Relations for Multi-Modal Recommendation Paper • 2508.13745 • Published Aug 19, 2025 • 1
mSCoRe: a Multilingual and Scalable Benchmark for Skill-based Commonsense Reasoning Paper • 2508.10137 • Published Aug 13, 2025 • 2
Leuvenshtein: Efficient FHE-based Edit Distance Computation with Single Bootstrap per Cell Paper • 2508.14568 • Published Aug 20, 2025 • 2
Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis Paper • 2508.15754 • Published Aug 21, 2025 • 4
On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting Paper • 2508.11408 • Published Aug 15, 2025 • 8
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs Paper • 2508.14896 • Published Aug 20, 2025 • 22