Qwen2.5 Omni 7B Demo
Generate text and speech responses from text, audio, images, or video input
A text-to-speech model powered by SparkAudio and Mobvoi.
Generate voice-cloned speech from text and audio