Chengsong Huang

ChengsongHuang · https://chengsong-huang.github.io/ · hcscctv

AI & ML interests

None yet

Recent Activity

authored a paper 25 days ago

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

upvoted a paper 17 days ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published 18 days ago • 136

upvoted a paper 22 days ago

Video-Based Reward Modeling for Computer-Use Agents

Paper • 2603.10178 • Published 25 days ago • 42

upvoted a paper 25 days ago

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

Paper • 2603.09206 • Published 26 days ago • 52

upvoted 2 papers about 1 month ago

Surgical Post-Training: Cutting Errors, Keeping Knowledge

Paper • 2603.01683 • Published Mar 2 • 11

SkillOrchestra: Learning to Route Agents via Skill Transfer

Paper • 2602.19672 • Published Feb 23 • 57

upvoted 11 papers about 2 months ago

GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published Feb 17 • 120

Experiential Reinforcement Learning

Paper • 2602.13949 • Published Feb 15 • 71

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 72

EgoAVU: Egocentric Audio-Visual Understanding

Paper • 2602.06139 • Published Feb 5 • 12

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

Paper • 2602.05885 • Published Feb 5 • 28

Context Forcing: Consistent Autoregressive Video Generation with Long Context

Paper • 2602.06028 • Published Feb 5 • 36

Reinforced Attention Learning

Paper • 2602.04884 • Published Feb 4 • 29

Steering LLMs via Scalable Interactive Oversight

Paper • 2602.04210 • Published Feb 4 • 18

Training Data Efficiency in Multimodal Process Reward Models

Paper • 2602.04145 • Published Feb 4 • 79

Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

Paper • 2602.03845 • Published Feb 3 • 27

Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation

Paper • 2602.03619 • Published Feb 3 • 28

upvoted 4 papers 2 months ago

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 110

TTCS: Test-Time Curriculum Synthesis for Self-Evolving

Paper • 2601.22628 • Published Jan 30 • 35

HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences

Paper • 2601.18724 • Published Jan 26 • 7

A Pragmatic VLA Foundation Model

Paper • 2601.18692 • Published Jan 26 • 49