iFairy: the First 2-bit Complex LLM with All Parameters in {±1, ±i} Paper • 2508.05571 • Published Aug 7, 2025 • 1
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 24 items • Updated less than a minute ago • 87
LLM For Smartphone Collection These are some of the best LLMs that can run on a smartphone. These models go toe-to-toe with much larger models, and are great for use on the go. • 4 items • Updated 22 days ago • 22
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models Paper • 2512.08829 • Published Dec 9, 2025 • 20
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published May 8, 2025 • 86
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10, 2025 • 153
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 54