- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 153
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

Collections
Discover the best community collections!
Collections including paper arxiv:2603.05890

- SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise
  Paper • 2602.12783 • Published • 216
- MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios
  Paper • 2602.22638 • Published • 107
- CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty
  Paper • 2601.22027 • Published • 85
- ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development
  Paper • 2601.11077 • Published • 67

- Endless Terminals: Scaling RL Environments for Terminal Agents
  Paper • 2601.16443 • Published • 18
- Linear representations in language models can change dramatically over a conversation
  Paper • 2601.20834 • Published • 21
- Scaling Embeddings Outperforms Scaling Experts in Language Models
  Paper • 2601.21204 • Published • 102
- Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
  Paper • 2601.18778 • Published • 42

- LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification
  Paper • 2411.19638 • Published • 6
- Word Sense Linking: Disambiguating Outside the Sandbox
  Paper • 2412.09370 • Published • 10
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 163
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 377

- Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification
  Paper • 2509.15591 • Published • 45
- A Survey on Latent Reasoning
  Paper • 2507.06203 • Published • 94
- Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost
  Paper • 2602.03120 • Published • 1
- TADA! Tuning Audio Diffusion Models through Activation Steering
  Paper • 2602.11910 • Published • 2

- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 271 • 99
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 39
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 97
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 88