sdtana

roxani_17

AI & ML interests

None yet

upvoted 2 papers 3 months ago

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

Paper • 2511.19365 • Published Nov 24, 2025 • 64

Back to Basics: Let Denoising Generative Models Denoise

Paper • 2511.13720 • Published Nov 17, 2025 • 69

upvoted a paper 4 months ago

Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13, 2025 • 166

upvoted a paper 5 months ago

Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation

Paper • 2509.15185 • Published Sep 18, 2025 • 29

upvoted a paper 6 months ago

DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 297

upvoted 3 papers 7 months ago

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Paper • 2507.21033 • Published Jul 28, 2025 • 23

A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17, 2025 • 261

SingLoRA: Low Rank Adaptation Using a Single Matrix

Paper • 2507.05566 • Published Jul 8, 2025 • 115

upvoted 5 papers 8 months ago

Guidance in the Frequency Domain Enables High-Fidelity Sampling at Low CFG Scales

Paper • 2506.19713 • Published Jun 24, 2025 • 14

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published Jun 19, 2025 • 130

SeqPE: Transformer with Sequential Position Encoding

Paper • 2506.13277 • Published Jun 16, 2025 • 4

Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

Paper • 2506.08343 • Published Jun 10, 2025 • 54

Institutional Books 1.0: A 242B token dataset from Harvard Library's collections, refined for accuracy and usability

Paper • 2506.08300 • Published Jun 10, 2025 • 9

upvoted 7 papers 9 months ago

Time Blindness: Why Video-Language Models Can't See What Humans Can?

Paper • 2505.24867 • Published May 30, 2025 • 82

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 143

To Trust Or Not To Trust Your Vision-Language Model's Prediction

Paper • 2505.23745 • Published May 29, 2025 • 4

D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29, 2025 • 34

Diffusion Classifiers Understand Compositionality, but Conditions Apply

Paper • 2505.17955 • Published May 23, 2025 • 22

Alchemist: Turning Public Text-to-Image Data into Generative Gold

Paper • 2505.19297 • Published May 25, 2025 • 84

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 332