Article Introducing OptiMind, a research model designed for optimization microsoft • Jan 15 • 35
Guardpoint Collection Guardpoint is a structured medical reasoning, diagnosis, management, and knowledge model built on Qwen and gpt-oss! • 5 items • Updated 19 days ago • 2
Experimental Reasoning Models Collection Experimental fine-tuned models that utilize reasoning to create specific structured forms of output. • 12 items • Updated 5 days ago • 4
Reasoning Datasets Collection Synthetic datasets generated using reasoning models, primarily the Deepseek-R1 and Deepseek-V3 series. • 14 items • Updated 5 days ago • 4
Shining Valiant 3 Collection Shining Valiant 3 is a science-reasoning, LLMOps, AI architecture, and general reasoning finetune for Qwen, gpt-oss, and Ministral! • 5 items • Updated 19 days ago • 3
Esper 3.1 Collection Esper 3.1 is a DevOps, architecture, code, and general reasoning finetune for Qwen, Ministral and gpt-oss! • 8 items • Updated 19 days ago • 3
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders thomwolf, matthieu-lapeyre • Jul 9, 2025 • 796
Article SmolLM3: smol, multilingual, long-context reasoner eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 773
CodeContests+: High-Quality Test Case Generation for Competitive Programming Paper • 2506.05817 • Published Jun 6, 2025 • 10
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Paper • 2506.20512 • Published Jun 25, 2025 • 48
🐙 OctoThinker Collection Mid-training Incentivizes Reinforcement Learning Scaling • 18 items • Updated Jun 26, 2025 • 2