Add HuggingFace model card with ONNX download links

Files changed (6)

.gitattributes +5 -0
.gitignore +18 -1
README.md +62 -0
models/bigvgan.onnx +3 -0
models/speaker_encoder.onnx +3 -0
models/speaker_encoder.onnx.data +3 -0

.gitattributes CHANGED

@@ -51,3 +51,8 @@ indextts/utils/maskgct/models/codec/facodec/modules/JDC/bst.t7 filter=lfs diff=l
 examples/* filter=lfs diff=lfs merge=lfs -text
 *.wav filter=lfs diff=lfs merge=lfs -text
 *. filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.wav filter=lfs diff=lfs merge=lfs -text
+*.mp3 filter=lfs diff=lfs merge=lfs -text
+*.flac filter=lfs diff=lfs merge=lfs -text
+*.onnx.data filter=lfs diff=lfs merge=lfs -text
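
Review note: the separate `*.onnx.data` rule is needed because a slash-free `.gitattributes` pattern is matched against the file name, and a name ending in `.onnx.data` does not end in `.onnx`. For the same reason, a pattern written without the leading `*` would only match a file literally named `.onnx`. The sketch below illustrates this with Python's `fnmatchcase` as a stand-in for git's own wildcard matcher. That is an approximation which only holds for simple slash-free patterns, and the helper name `matches` is hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename


def matches(pattern: str, path: str) -> bool:
    """Approximate a slash-free .gitattributes pattern by testing the basename."""
    return fnmatchcase(basename(path), pattern)


# The three files added by this commit.
paths = [
    "models/bigvgan.onnx",
    "models/speaker_encoder.onnx",
    "models/speaker_encoder.onnx.data",
]

for pattern in ("*.onnx", ".onnx", "*.onnx.data"):
    hits = [p for p in paths if matches(pattern, p)]
    print(f"{pattern!r:15} -> {hits}")
```

To check the real behaviour in a clone, ask git itself with `git check-attr filter -- <path>`.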

.gitignore CHANGED

@@ -15,7 +15,24 @@ build/
 .venv
 checkpoints/*
 __MACOSX
+.lock
+# Python build artifacts
+*.py[cod]
+*.egg-info/
+.venv
+build/
+dist/
+*.egg-info/
 # Rust build artifacts
 /target/
 **/*.rs.bk
+.venv/
+.claude-flow/
+**/target/
+indexout/
+output.wav
+*.wav
+*.flac
+.swarm/
+.claude/
+clone_chris.py

README.md CHANGED

@@ -1,7 +1,50 @@
+---
+license: mit
+tags:
+  - text-to-speech
+  - tts
+  - voice-cloning
+  - zero-shot
+  - rust
+  - onnx
+language:
+  - en
+  - zh
+library_name: ort
+pipeline_tag: text-to-speech
+---
 # IndexTTS-Rust
 High-performance Text-to-Speech Engine in Pure Rust 🚀
+## ONNX Models (Download)
+Pre-converted models for inference - no Python required!
+| Model | Size | Download |
+|-------|------|----------|
+| **BigVGAN** (vocoder) | 433 MB | [bigvgan.onnx](https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/bigvgan.onnx) |
+| **Speaker Encoder** | 28 MB | [speaker_encoder.onnx](https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/speaker_encoder.onnx) |
+### Quick Download
+```python
+# Python with huggingface_hub
+from huggingface_hub import hf_hub_download
+bigvgan = hf_hub_download("ThreadAbort/IndexTTS-Rust", "models/bigvgan.onnx", revision="models")
+speaker = hf_hub_download("ThreadAbort/IndexTTS-Rust", "models/speaker_encoder.onnx", revision="models")
+```
+```bash
+# Or with wget
+wget https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/bigvgan.onnx
+wget https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/speaker_encoder.onnx
+```
+---
 A complete Rust rewrite of the IndexTTS system, designed for maximum performance and efficiency.
 ## Features
@@ -210,10 +253,29 @@ Performance on AMD Ryzen 9 5950X (16 cores):
 - [ ] Model quantization (INT8)
 - [ ] WebAssembly support
+## Marine Prosody Validation
+This project includes **Marine salience detection** - an O(1) algorithm that validates speech authenticity:
+```
+Human speech has NATURAL jitter - that's what makes it authentic!
+- Too perfect (jitter < 0.005) = robotic
+- Too chaotic (jitter > 0.3) = artifacts/damage
+- Sweet spot = real human voice
+```
+The Marines will KNOW if your TTS doesn't sound authentic! 🎖️
 ## License
 MIT License - See LICENSE file for details.
+---
+*From ashes to harmonics, from silence to song* 🔥🎵
+Built with love by Hue & Aye @ [8b.is](https://8b.is)
 ## Acknowledgments
 - Original IndexTTS Python implementation
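
Review note: the jitter thresholds quoted in the README hunk above amount to a three-way classification. A minimal sketch in Python follows, assuming jitter is already available as a single non-negative number. How the project measures jitter is not shown in this diff, and `classify_jitter` and its return labels are hypothetical names chosen for illustration.

```python
ROBOTIC_BELOW = 0.005  # README: "Too perfect (jitter < 0.005) = robotic"
CHAOTIC_ABOVE = 0.3    # README: "Too chaotic (jitter > 0.3) = artifacts/damage"


def classify_jitter(jitter: float) -> str:
    """Map a jitter value onto the three bands described in the README."""
    if jitter < 0:
        raise ValueError("jitter is assumed to be non-negative")
    if jitter < ROBOTIC_BELOW:
        return "robotic"
    if jitter > CHAOTIC_ABOVE:
        return "damaged"
    # Both comparisons in the README are strict, so the boundary
    # values 0.005 and 0.3 fall into the natural band.
    return "natural"


for value in (0.001, 0.05, 0.5):
    print(value, classify_jitter(value))
```

A check like this runs in constant time for a single value, which is consistent with the README's O(1) claim only if the jitter estimate itself is maintained incrementally. The diff does not say whether it is.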

models/bigvgan.onnx ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:31609a2a49ab4e00d14924eb036f2852c88198ad250de228ae972601e67e032f
+size 2269152

models/speaker_encoder.onnx ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f8bc6e37803c99ebcf24cb5e1631bc1a1da00b4acc9ec6ec4c105a3e1f1f5388
+size 2334876

models/speaker_encoder.onnx.data ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1d21f2c5de55f48af7319230818262da91442e7f3dcd29d828215e8ee9e1d7e3
+size 27656192
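
Review note: each of the three added files is a Git LFS pointer that records the SHA-256 digest and byte size of the real object, so a downloaded model can be checked against it. The pointer for `models/bigvgan.onnx` records 2,269,152 bytes, far below the 433 MB listed in the README table. Presumably most of the weights live in a file that is not part of this commit, but that is an assumption, since the diff does not say. The sketch below parses a pointer and verifies a local file against it. The function names are hypothetical, and the pointer text is copied from the `speaker_encoder.onnx.data` entry above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the "key value" lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file has the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256" or path.stat().st_size != pointer["size"]:
        return False
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large model files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:1d21f2c5de55f48af7319230818262da91442e7f3dcd29d828215e8ee9e1d7e3
size 27656192
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

After downloading, `matches_pointer(Path("models/speaker_encoder.onnx.data"), pointer)` should return `True` for an intact file. ONNX external-data files are resolved relative to the model file, so `speaker_encoder.onnx.data` needs to sit in the same directory as `speaker_encoder.onnx`. The README download snippets fetch only the `.onnx` files.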