Anush Mohan
anushmohan
AI & ML interests
None yet
Organizations
None yet
video
Multimodal
- xGen-MM (BLIP-3): A Family of Open Large Multimodal Models
  Paper • 2408.08872 • Published • 101
- Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
  Paper • 2408.11039 • Published • 63
- Building and better understanding vision-language models: insights and future directions
  Paper • 2408.12637 • Published • 133
Alignment
video
ImageGen
Multimodal