High-quality text-to-speech synthesis supporting Japanese and English.
This demo uses ONNX models for fast CPU inference.
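
As a rough illustration of what CPU-only ONNX inference can look like, here is a minimal sketch. It assumes ONNX Runtime (the `onnxruntime` Python package) as the inference engine, which this description does not confirm. The helper name `load_cpu_session` and the `model_path` argument are hypothetical and do not refer to files or functions in the demo.

```python
import os

# Assumption: ONNX Runtime is the inference engine; the demo does not say which one it uses.
CPU_PROVIDERS = ["CPUExecutionProvider"]


def load_cpu_session(model_path, num_threads=None):
    """Create a CPU-only ONNX Runtime session for the model at model_path.

    model_path is a hypothetical placeholder, not a file shipped with the demo.
    """
    # Imported lazily so this module can be loaded without onnxruntime installed.
    import onnxruntime as ort

    options = ort.SessionOptions()
    # Use one intra-op thread per logical CPU unless a thread count is given.
    options.intra_op_num_threads = num_threads or os.cpu_count() or 1
    return ort.InferenceSession(
        model_path, sess_options=options, providers=CPU_PROVIDERS
    )
```

With a real model file, `session = load_cpu_session("model.onnx")` would return a session whose expected inputs can be listed with `session.get_inputs()` and which can be run with `session.run(None, inputs)`. The input names and shapes depend entirely on the exported model, so they are left out here.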