Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving Paper • 2407.00079 • Published Jun 24, 2024 • 5
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22, 2025 • 126
MoBA: Mixture of Block Attention for Long-Context LLMs Paper • 2502.13189 • Published Feb 18, 2025 • 17