ThreadAbort committed on
Commit
2b966fd
·
1 Parent(s): f417d27

Add HuggingFace model card with ONNX download links

.gitattributes CHANGED
@@ -51,3 +51,8 @@ indextts/utils/maskgct/models/codec/facodec/modules/JDC/bst.t7 filter=lfs diff=l
51
  examples/* filter=lfs diff=lfs merge=lfs -text
52
  *.wav filter=lfs diff=lfs merge=lfs -text
53
  *. filter=lfs diff=lfs merge=lfs -text
51
  examples/* filter=lfs diff=lfs merge=lfs -text
52
  *.wav filter=lfs diff=lfs merge=lfs -text
53
  *. filter=lfs diff=lfs merge=lfs -text
54
+ .onnx filter=lfs diff=lfs merge=lfs -text
55
+ .wav filter=lfs diff=lfs merge=lfs -text
56
+ .mp3 filter=lfs diff=lfs merge=lfs -text
57
+ .flac filter=lfs diff=lfs merge=lfs -text
58
+ *.onnx.data filter=lfs diff=lfs merge=lfs -text
.gitignore CHANGED
@@ -15,7 +15,24 @@ build/
15
  .venv
16
  checkpoints/*
17
  __MACOSX
18
-
19
  # Rust build artifacts
20
  /target/
21
  **/*.rs.bk
15
  .venv
16
  checkpoints/*
17
  __MACOSX
18
+ .lock
19
+ # Python build artifacts
20
+ *.py[cod]
21
+ *.egg-info/
22
+ .venv
23
+ build/
24
+ dist/
25
+ *.egg-info/
26
  # Rust build artifacts
27
  /target/
28
  **/*.rs.bk
29
+ .venv/
30
+ .claude-flow/
31
+ **/target/
32
+ indexout/
33
+ output.wav
34
+ *.wav
35
+ *.flac
36
+ .swarm/
37
+ .claude/
38
+ clone_chris.py
README.md CHANGED
@@ -1,7 +1,50 @@
1
  # IndexTTS-Rust
2
 
3
  High-performance Text-to-Speech Engine in Pure Rust 🚀
4
 
5
  A complete Rust rewrite of the IndexTTS system, designed for maximum performance and efficiency.
6
 
7
  ## Features
@@ -210,10 +253,29 @@ Performance on AMD Ryzen 9 5950X (16 cores):
210
  - [ ] Model quantization (INT8)
211
  - [ ] WebAssembly support
212
 
213
  ## License
214
 
215
  MIT License - See LICENSE file for details.
216
 
217
  ## Acknowledgments
218
 
219
  - Original IndexTTS Python implementation
 
1
+ ---
2
+ license: mit
3
+ tags:
4
+ - text-to-speech
5
+ - tts
6
+ - voice-cloning
7
+ - zero-shot
8
+ - rust
9
+ - onnx
10
+ language:
11
+ - en
12
+ - zh
13
+ library_name: ort
14
+ pipeline_tag: text-to-speech
15
+ ---
16
+
17
  # IndexTTS-Rust
18
 
19
  High-performance Text-to-Speech Engine in Pure Rust 🚀
20
 
21
+ ## ONNX Models (Download)
22
+
23
+ Pre-converted models for inference - no Python required!
24
+
25
+ | Model | Size | Download |
26
+ |-------|------|----------|
27
+ | **BigVGAN** (vocoder) | 433 MB | [bigvgan.onnx](https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/bigvgan.onnx) |
28
+ | **Speaker Encoder** | 28 MB | [speaker_encoder.onnx](https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/speaker_encoder.onnx) |
29
+
30
+ ### Quick Download
31
+
32
+ ```python
33
+ # Python with huggingface_hub
34
+ from huggingface_hub import hf_hub_download
35
+
36
+ bigvgan = hf_hub_download("ThreadAbort/IndexTTS-Rust", "models/bigvgan.onnx", revision="models")
37
+ speaker = hf_hub_download("ThreadAbort/IndexTTS-Rust", "models/speaker_encoder.onnx", revision="models")
38
+ ```
39
+
40
+ ```bash
41
+ # Or with wget
42
+ wget https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/bigvgan.onnx
43
+ wget https://huggingface.co/ThreadAbort/IndexTTS-Rust/resolve/models/models/speaker_encoder.onnx
44
+ ```
45
+
46
+ ---
47
+
48
  A complete Rust rewrite of the IndexTTS system, designed for maximum performance and efficiency.
49
 
50
  ## Features
 
253
  - [ ] Model quantization (INT8)
254
  - [ ] WebAssembly support
255
 
256
+ ## Marine Prosody Validation
257
+
258
+ This project includes **Marine salience detection** - an O(1) algorithm that validates speech authenticity:
259
+
260
+ ```
261
+ Human speech has NATURAL jitter - that's what makes it authentic!
262
+ - Too perfect (jitter < 0.005) = robotic
263
+ - Too chaotic (jitter > 0.3) = artifacts/damage
264
+ - Sweet spot = real human voice
265
+ ```
266
+
267
+ The Marines will KNOW if your TTS doesn't sound authentic! 🎖️
268
+
269
  ## License
270
 
271
  MIT License - See LICENSE file for details.
272
 
273
+ ---
274
+
275
+ *From ashes to harmonics, from silence to song* 🔥🎵
276
+
277
+ Built with love by Hue & Aye @ [8b.is](https://8b.is)
278
+
279
  ## Acknowledgments
280
 
281
  - Original IndexTTS Python implementation
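
The jitter thresholds quoted in the Marine Prosody Validation block of the README diff above can be read as a three-way classification. The following is a minimal Python sketch of that reading only, not the project's implementation. The function name `classify_jitter`, the label strings, and the treatment of the exact boundary values (0.005 and 0.3 counted as human) are assumptions made for illustration. How the jitter value itself is measured is not shown in this commit.

```python
# Hypothetical illustration of the thresholds quoted in the README;
# the names below are not taken from the IndexTTS-Rust code base.
ROBOTIC_MAX = 0.005  # README: jitter < 0.005 is "too perfect" (robotic)
CHAOTIC_MIN = 0.3    # README: jitter > 0.3 is "too chaotic" (artifacts/damage)


def classify_jitter(jitter: float) -> str:
    """Map a non-negative jitter value to one of three labels."""
    if jitter < 0:
        raise ValueError("jitter must be non-negative")
    if jitter < ROBOTIC_MAX:
        return "robotic"
    if jitter > CHAOTIC_MIN:
        return "artifacts"
    # Assumption: values exactly on a boundary fall in the "sweet spot".
    return "human"


for value in (0.001, 0.05, 0.5):
    print(value, classify_jitter(value))
```

A check like this costs a constant number of comparisons per value, which is consistent with the O(1) claim in the README, though the README does not say what the bound is measured against.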
models/bigvgan.onnx ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:31609a2a49ab4e00d14924eb036f2852c88198ad250de228ae972601e67e032f
3
+ size 2269152
models/speaker_encoder.onnx ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f8bc6e37803c99ebcf24cb5e1631bc1a1da00b4acc9ec6ec4c105a3e1f1f5388
3
+ size 2334876
models/speaker_encoder.onnx.data ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1d21f2c5de55f48af7319230818262da91442e7f3dcd29d828215e8ee9e1d7e3
3
+ size 27656192
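
The three files added under `models/` are Git LFS pointer files: each holds a `version` line, an `oid` line with a SHA-256 digest, and a `size` line in bytes. As a rough sketch, a pointer can be parsed and a downloaded file checked against it as shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the parser assumes the simple `key value` layout shown in this diff.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError("only sha256 pointers are handled in this sketch")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer text copied from models/speaker_encoder.onnx.data in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1d21f2c5de55f48af7319230818262da91442e7f3dcd29d828215e8ee9e1d7e3
size 27656192
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

To verify a real download, read the file's bytes and pass them to `matches_pointer`; for large files, hashing in chunks would be preferable to loading everything into memory. The `size` field counts only that one object, so a model that keeps its weights in a separate `.onnx.data` file has a small `.onnx` pointer.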