Zuhao Yang

mwxely

https://mwxely.github.io/

AI & ML interests

Large Multimodal Models

Recent Activity

upvoted a paper 3 days ago

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

upvoted a paper 3 days ago

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

upvoted a paper 10 days ago

Agent Learning via Early Experience

Organizations

upvoted 2 papers 3 days ago

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Paper • 2512.17532 • Published 7 days ago • 62

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published 4 days ago • 60

upvoted 3 papers 10 days ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 269

Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

Paper • 2511.14993 • Published Nov 19 • 226

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6 • 210

New activity in mwxely/TransitBench 10 days ago

[bot] Conversion to Parquet

#1 opened 6 months ago by parquet-converter

When do you release the code?

#2 opened 4 months ago by zhangzb

authored a paper 10 days ago

A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models

Paper • 2511.15098 • Published Nov 19

updated 3 datasets 16 days ago

liked a dataset 16 days ago

longvideotool/VideoSIAH-Eval

Viewer • Updated 16 days ago • 1.28k • 119 • 2

published a dataset 16 days ago

longvideotool/VideoSIAH-Eval

Viewer • Updated 16 days ago • 1.28k • 119 • 2

commented a paper 18 days ago

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published about 1 month ago • 154

liked a dataset 20 days ago

longvideotool/LongVT-Demo

Viewer • Updated about 1 month ago • 5 • 401 • 1

New activity in longvideotool/LongVT-Source 20 days ago

[bot] Conversion to Parquet

#2 opened 20 days ago by parquet-converter

New activity in longvideotool/LongVT-Source 21 days ago

Missing “wemath (WeMath data)” zip file in LongVT-Source dataset training data

#1 opened 21 days ago by Seele77

upvoted 2 collections 21 days ago

Multimodal Agent

Collection

123 items • Updated 3 days ago • 1

AI Paper of the Day

Collection

A collection of papers that I think are interesting, one added each day • 550 items • Updated about 18 hours ago • 73

updated a model 22 days ago

longvideotool/LongVT-SFT

Video-Text-to-Text • Updated 22 days ago • 104 • 1
