Commit · 2b966fd
1 Parent(s): f417d27
Add HuggingFace model card with ONNX download links
- .gitattributes +5 -0
- .gitignore +18 -1
- README.md +62 -0
- models/bigvgan.onnx +3 -0
- models/speaker_encoder.onnx +3 -0
- models/speaker_encoder.onnx.data +3 -0
.gitattributes
CHANGED
|
@@ -51,3 +51,8 @@ indextts/utils/maskgct/models/codec/facodec/modules/JDC/bst.t7 filter=lfs diff=l
|
|
| 51 |
examples/* filter=lfs diff=lfs merge=lfs -text
|
| 52 |
*.wav filter=lfs diff=lfs merge=lfs -text
|
| 53 |
*. filter=lfs diff=lfs merge=lfs -text
|
| 51 |
examples/* filter=lfs diff=lfs merge=lfs -text
|
| 52 |
*.wav filter=lfs diff=lfs merge=lfs -text
|
| 53 |
*. filter=lfs diff=lfs merge=lfs -text
|
| 54 |
+
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 55 |
+
*.wav filter=lfs diff=lfs merge=lfs -text
|
| 56 |
+
*.mp3 filter=lfs diff=lfs merge=lfs -text
|
| 57 |
+
*.flac filter=lfs diff=lfs merge=lfs -text
|
| 58 |
+
*.onnx.data filter=lfs diff=lfs merge=lfs -text
|
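The hunk above adds a dedicated `*.onnx.data` entry alongside the `.onnx` one. A minimal sketch of why that is needed, assuming the added entries are ordinary globs of the form `*.onnx` and using Python's `fnmatch` as a rough stand-in for gitattributes matching (the two are similar for simple basename globs but not identical). The helper name `lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch

# Assumed glob forms of the patterns added in this hunk.
LFS_PATTERNS = ["*.onnx", "*.wav", "*.mp3", "*.flac", "*.onnx.data"]


def lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Return True if the file's basename matches any of the patterns."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for path in ("models/bigvgan.onnx",
                 "models/speaker_encoder.onnx.data",
                 "README.md"):
        print(path, lfs_tracked(path))
```

Under this approximation `*.onnx` alone does not cover `speaker_encoder.onnx.data`, because the name does not end in `.onnx`; the separate pattern is what routes the external weights file through LFS.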
.gitignore
CHANGED
|
@@ -15,7 +15,24 @@ build/
|
|
| 15 |
.venv
|
| 16 |
checkpoints/*
|
| 17 |
__MACOSX
|
| 18 |
-
|
| 19 |
# Rust build artifacts
|
| 20 |
/target/
|
| 21 |
**/*.rs.bk
|
| 15 |
.venv
|
| 16 |
checkpoints/*
|
| 17 |
__MACOSX
|
| 18 |
+
.lock
|
| 19 |
+
# Python build artifacts
|
| 20 |
+
*.py[cod]
|
| 21 |
+
*.egg-info/
|
| 22 |
+
.venv
|
| 23 |
+
build/
|
| 24 |
+
dist/
|
| 25 |
+
*.egg-info/
|
| 26 |
# Rust build artifacts
|
| 27 |
/target/
|
| 28 |
**/*.rs.bk
|
| 29 |
+
.venv/
|
| 30 |
+
.claude-flow/
|
| 31 |
+
**/target/
|
| 32 |
+
indexout/
|
| 33 |
+
output.wav
|
| 34 |
+
*.wav
|
| 35 |
+
*.flac
|
| 36 |
+
.swarm/
|
| 37 |
+
.claude/
|
| 38 |
+
clone_chris.py
|
README.md
CHANGED
|
@@ -1,7 +1,50 @@
|
|
| 1 |
# IndexTTS-Rust
|
| 2 |
|
| 3 |
High-performance Text-to-Speech Engine in Pure Rust
|
| 4 |
|
| 5 |
A complete Rust rewrite of the IndexTTS system, designed for maximum performance and efficiency.
|
| 6 |
|
| 7 |
## Features
|
|
@@ -210,10 +253,29 @@ Performance on AMD Ryzen 9 5950X (16 cores):
|
|
| 210 |
- [ ] Model quantization (INT8)
|
| 211 |
- [ ] WebAssembly support
|
| 212 |
|
| 213 |
## License
|
| 214 |
|
| 215 |
MIT License - See LICENSE file for details.
|
| 216 |
|
| 217 |
## Acknowledgments
|
| 218 |
|
| 219 |
- Original IndexTTS Python implementation
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
tags:
|
| 4 |
+
- text-to-speech
|
| 5 |
+
- tts
|
| 6 |
+
- voice-cloning
|
| 7 |
+
- zero-shot
|
| 8 |
+
- rust
|
| 9 |
+
- onnx
|
| 10 |
+
language:
|
| 11 |
+
- en
|
| 12 |
+
- zh
|
| 13 |
+
library_name: ort
|
| 14 |
+
pipeline_tag: text-to-speech
|
| 15 |
+
---
|
| 16 |
+
|
| 17 |
# IndexTTS-Rust
|
| 18 |
|
| 19 |
High-performance Text-to-Speech Engine in Pure Rust
|
| 20 |
|
| 21 |
+
## ONNX Models (Download)
|
| 22 |
+
|
| 23 |
+
Pre-converted models for inference - no Python required!
|
| 24 |
+
|
| 25 |
+
| Model | Size | Download |
|
| 26 |
+
|-------|------|----------|
|
| 27 |
+
| **BigVGAN** (vocoder) | 433 MB | [bigvgan.onnx](https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/bigvgan.onnx) |
|
| 28 |
+
| **Speaker Encoder** | 28 MB | [speaker_encoder.onnx](https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/speaker_encoder.onnx) |
|
| 29 |
+
|
| 30 |
+
### Quick Download
|
| 31 |
+
|
| 32 |
+
```python
|
| 33 |
+
# Python with huggingface_hub
|
| 34 |
+
from huggingface_hub import hf_hub_download
|
| 35 |
+
|
| 36 |
+
bigvgan = hf_hub_download("ThreadAbort/IndexTTS-Rust", "models/bigvgan.onnx", revision="models")
|
| 37 |
+
speaker = hf_hub_download("ThreadAbort/IndexTTS-Rust", "models/speaker_encoder.onnx", revision="models")
|
| 38 |
+
```
|
| 39 |
+
|
| 40 |
+
```bash
|
| 41 |
+
# Or with wget
|
| 42 |
+
wget https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/bigvgan.onnx
|
| 43 |
+
wget https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/speaker_encoder.onnx
|
| 44 |
+
```
|
| 45 |
+
|
| 46 |
+
---
|
| 47 |
+
|
| 48 |
A complete Rust rewrite of the IndexTTS system, designed for maximum performance and efficiency.
|
| 49 |
|
| 50 |
## Features
|
|
|
|
| 253 |
- [ ] Model quantization (INT8)
|
| 254 |
- [ ] WebAssembly support
|
| 255 |
|
| 256 |
+
## Marine Prosody Validation
|
| 257 |
+
|
| 258 |
+
This project includes **Marine salience detection** - an O(1) algorithm that validates speech authenticity:
|
| 259 |
+
|
| 260 |
+
```
|
| 261 |
+
Human speech has NATURAL jitter - that's what makes it authentic!
|
| 262 |
+
- Too perfect (jitter < 0.005) = robotic
|
| 263 |
+
- Too chaotic (jitter > 0.3) = artifacts/damage
|
| 264 |
+
- Sweet spot = real human voice
|
| 265 |
+
```
|
| 266 |
+
|
| 267 |
+
The Marines will KNOW if your TTS doesn't sound authentic!
|
| 268 |
+
|
| 269 |
## License
|
| 270 |
|
| 271 |
MIT License - See LICENSE file for details.
|
| 272 |
|
| 273 |
+
---
|
| 274 |
+
|
| 275 |
+
*From ashes to harmonics, from silence to song* 🔥🎵
|
| 276 |
+
|
| 277 |
+
Built with love by Hue & Aye @ [8b.is](https://8b.is)
|
| 278 |
+
|
| 279 |
## Acknowledgments
|
| 280 |
|
| 281 |
- Original IndexTTS Python implementation
|
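The Marine Prosody Validation section added to the README states two jitter thresholds (0.005 and 0.3) but not how jitter is measured or how boundary values are treated. A minimal sketch of just the threshold check, assuming jitter is already available as a single non-negative number and that values exactly on a threshold count as human. The function and constant names are hypothetical and are not taken from the project.

```python
# Thresholds quoted in the README diff above.
ROBOTIC_BELOW = 0.005   # jitter below this is "too perfect"
CHAOTIC_ABOVE = 0.3     # jitter above this is "too chaotic"


def classify_jitter(jitter: float) -> str:
    """Map a jitter value to one of the three README categories."""
    if jitter < 0:
        raise ValueError("jitter is assumed to be non-negative")
    if jitter < ROBOTIC_BELOW:
        return "robotic"
    if jitter > CHAOTIC_ABOVE:
        return "artifacts"
    return "human"


if __name__ == "__main__":
    for value in (0.001, 0.05, 0.5):
        print(value, classify_jitter(value))
```

The check is two comparisons on one number, so it runs in constant time once the jitter value exists. Whether that is what the README means by O(1) is not stated in the diff.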
models/bigvgan.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:31609a2a49ab4e00d14924eb036f2852c88198ad250de228ae972601e67e032f
|
| 3 |
+
size 2269152
|
models/speaker_encoder.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f8bc6e37803c99ebcf24cb5e1631bc1a1da00b4acc9ec6ec4c105a3e1f1f5388
|
| 3 |
+
size 2334876
|
models/speaker_encoder.onnx.data
ADDED
|
@@ -0,0 +1,3 @@
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:1d21f2c5de55f48af7319230818262da91442e7f3dcd29d828215e8ee9e1d7e3
|
| 3 |
+
size 27656192
|
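The three files added above are Git LFS pointer files: each holds a `version` line, an `oid` with a SHA-256 digest, and a `size` in bytes. A short sketch of reading such a pointer and checking downloaded bytes against it, assuming the three-line layout shown in the diff. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that data has the size and SHA-256 digest the pointer records."""
    if pointer["hash_algo"] != "sha256":
        raise ValueError("only sha256 oids are handled in this sketch")
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


# Pointer for models/speaker_encoder.onnx.data, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1d21f2c5de55f48af7319230818262da91442e7f3dcd29d828215e8ee9e1d7e3
size 27656192
"""

info = parse_lfs_pointer(POINTER)

if __name__ == "__main__":
    print(info["hash_algo"], info["size"])
```

For a multi-hundred-megabyte file it would be better to hash in chunks instead of loading everything into memory; the in-memory version here just keeps the sketch short.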