Rhapsody

non-profit

AI & ML interests

ANTISCOOPING ANYTHING

Recent Activity

yiye2023 submitted a paper 23 days ago

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

yiye2023 authored a paper 24 days ago

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

Cuiunbo authored a paper 5 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

submitted a paper to Daily Papers 23 days ago

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

Paper • 2602.06540 • Published 27 days ago • 21

authored a paper 24 days ago

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

Paper • 2602.06540 • Published 27 days ago • 21

authored a paper 5 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16, 2025 • 55

authored 2 papers 11 months ago

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI

Paper • 2410.11623 • Published Oct 15, 2024 • 49

Video-R1: Reinforcing Video Reasoning in MLLMs

Paper • 2503.21776 • Published Mar 27, 2025 • 79

updated a dataset about 1 year ago

RhapsodyAI/Multiimage-eval

Preview • Updated Feb 21, 2025 • 9 • 1

published a dataset about 1 year ago

RhapsodyAI/Multiimage-eval

Preview • Updated Feb 21, 2025 • 9 • 1

authored a paper over 1 year ago

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Paper • 2412.02611 • Published Dec 3, 2024 • 25

authored a paper over 1 year ago

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Paper • 2410.10594 • Published Oct 14, 2024 • 29

tcy6 authored a paper over 1 year ago

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Paper • 2410.10594 • Published Oct 14, 2024 • 29

authored 2 papers over 1 year ago

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

Paper • 2305.14233 • Published May 23, 2023 • 7

Tool Learning with Foundation Models

Paper • 2304.08354 • Published Apr 17, 2023 • 3

updated a model over 1 year ago

RhapsodyAI/MiniCPM-V-Embedding-preview

Feature Extraction • Updated Aug 20, 2024 • 36 • 50

updated a model over 1 year ago

RhapsodyAI/qwen_vl_guidance

Visual Question Answering • Updated Aug 13, 2024 • 3 • 4

updated a dataset over 1 year ago

RhapsodyAI/UltraVL

Viewer • Updated Aug 9, 2024 • 215k • 21 • 3

authored a paper over 1 year ago

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3, 2024 • 92

authored a paper over 1 year ago

GSLB: The Graph Structure Learning Benchmark

Paper • 2310.05174 • Published Oct 8, 2023