Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders Paper • 2601.10332 • Published Jan 15 • 32
view article Article Mixture of Experts (MoEs) in Transformers +5 ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 164
view article Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 ariG23498, sergiopaniego, reach-vb, pcuenq, ArthurZ, SaylorTwift, cyrilvallez • Sep 11, 2025 • 188
Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling Paper • 2509.00605 • Published Aug 30, 2025 • 43
Beyond Transcription: Mechanistic Interpretability in ASR Paper • 2508.15882 • Published Aug 21, 2025 • 89
view article Article LoRA training scripts of the world, unite! linoyts, multimodalart • Jan 2, 2024 • 79
view article Article Advanced Flux Dreambooth LoRA Training with 🧨 diffusers linoyts • Oct 21, 2024 • 42
view article Article SmolLM3: smol, multilingual, long-context reasoner +21 eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 777
view article Article cocogold: training Marigold for text-grounded segmentation pcuenq • Jul 8, 2025 • 31
view article Article Train 400x faster Static Embedding Models with Sentence Transformers tomaarsen • Jan 15, 2025 • 230
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Dec 23, 2025 • 309
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 674