MentalBench: A Benchmark for Evaluating Psychiatric Diagnostic Capability of Large Language Models Paper • 2602.12871 • Published Feb 13 • 18
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 182
VLR-Bench: Multilingual Benchmark Dataset for Vision-Language Retrieval Augmented Generation Paper • 2412.10151 • Published Dec 13, 2024 • 7