AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward Paper • 2605.12495 • Published 2 days ago • 28
jina-embeddings-v5-omni Collection Multimodal (text + image + video + audio) embedding models aligned with jina-embeddings-v5-text-*. Two sizes, four task variants each. • 27 items • Updated 1 day ago • 27
jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition Paper • 2605.08384 • Published 6 days ago • 6
Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models Paper • 2605.07721 • Published 6 days ago • 25
PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents Paper • 2605.10341 • Published 3 days ago • 30
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training Paper • 2605.09608 • Published 4 days ago • 44
TMAS: Scaling Test-Time Compute via Multi-Agent Synergy Paper • 2605.10344 • Published 3 days ago • 46
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published 2 days ago • 141
δ-mem: Efficient Online Memory for Large Language Models Paper • 2605.12357 • Published 2 days ago • 97
NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation Paper • 2605.10813 • Published 3 days ago • 9
Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum Paper • 2510.00526 • Published Oct 1, 2025 • 11
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards Paper • 2605.10899 • Published 3 days ago • 68
view article Article Building Blocks for Foundation Model Training and Inference on AWS amazon • 3 days ago • 15
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling Paper • 2605.08083 • Published 6 days ago • 63
Flow-OPD: On-Policy Distillation for Flow Matching Models Paper • 2605.08063 • Published 6 days ago • 92
view article Article CyberSecQwen-4B: Why Defensive Cyber Needs Small, Specialized, Locally-Runnable Models lablab-ai-amd-developer-hackathon • 6 days ago • 7
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex Paper • 2605.06139 • Published 7 days ago • 64