Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published 20 days ago • 284
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 15 days ago • 320
Multimodal AI Models Collection Purpose: Models that understand text + image + audio together. • 5 items • Updated 28 days ago • 1
Audio & Speech Models Collection Purpose: Speech recognition, text-to-speech, music, audio analysis. • 5 items • Updated 28 days ago • 1
Vision Models (Image & Video) Collection Purpose: Text-to-image, image classification, detection, segmentation. • 5 items • Updated 28 days ago • 1
Text & Code Models (NLP) Collection Purpose: Text generation, summarization, translation, embeddings, coding. • 5 items • Updated 28 days ago • 1
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published Dec 31, 2025 • 151
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 226
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published Jan 8 • 166
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 212
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30, 2025 • 124
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 5 items • Updated 24 days ago • 172
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 239
Android Models Collection LiteRT models that can run on Android • 20 items • Updated Dec 11, 2025 • 192
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 508