The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 4 days ago • 60
SpaceVista: All-Scale Visual Spatial Reasoning from mm to km Paper • 2510.09606 • Published Oct 10 • 17
UniSS: Unified Expressive Speech-to-Speech Translation with Your Voice Paper • 2509.21144 • Published Sep 25 • 1
The Ultra-Scale Playbook 🌌: The ultimate guide to training LLMs on large GPU clusters • Running • 3.6k
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers Paper • 2406.05370 • Published Jun 8, 2024 • 18
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound Paper • 2405.00233 • Published Apr 30, 2024 • 17