Papers - Attention - Cross
• Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers (arXiv:2403.12943)
• Masked Audio Generation using a Single Non-Autoregressive Transformer (arXiv:2401.04577)
• Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models (arXiv:2404.02747)
• InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation (arXiv:2404.02733)
• Prompt-to-Prompt Image Editing with Cross Attention Control (arXiv:2208.01626)
• arXiv:2404.07821
• HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising (arXiv:2404.09697)
• TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models (arXiv:2404.09204)
• Long-form music generation with latent diffusion (arXiv:2404.10301)
• GLIGEN: Open-Set Grounded Text-to-Image Generation (arXiv:2301.07093)
• MultiBooth: Towards Generating All Your Concepts in an Image from Text (arXiv:2404.14239)
• XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference (arXiv:2404.15420)
• InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation (arXiv:2404.19427)
• Unveiling Encoder-Free Vision-Language Models (arXiv:2406.11832)
• TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (arXiv:2410.23168)
• HAT: Hybrid Attention Transformer for Image Restoration (arXiv:2309.05239)
• Byte Latent Transformer: Patches Scale Better Than Tokens (arXiv:2412.09871)