
Yan Ma

y22ma

YanMachX
y22ma

AI & ML interests

Robotics, Computer Vision, MultiModal LLM, LAM

Organizations

None yet

y22ma's collections (4)

Reasoning search

Improve Mathematical Reasoning in Language Models by Automated Process Supervision

Paper • 2406.06592 • Published Jun 5, 2024 • 29

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

Paper • 2401.11708 • Published Jan 22, 2024 • 30

Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos

Paper • 2403.13044 • Published Mar 19, 2024 • 15

GPT-4V(ision) is a Generalist Web Agent, if Grounded

Paper • 2401.01614 • Published Jan 3, 2024 • 22

BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models

Paper • 2402.13577 • Published Feb 21, 2024 • 9

Octopus v2: On-device language model for super agent

Paper • 2404.01744 • Published Apr 2, 2024 • 58

A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

Paper • 2307.12856 • Published Jul 24, 2023 • 36
