Update README.md
README.md
CHANGED
@@ -1,9 +1,111 @@
---
license: apache-2.0
language:
- en
tags:
- hearing loss
- challenge
- signal processing
- source separation
- lyrics intelligibility
- audio
- audio-to-audio
---

# Cadenza Challenge: CAD2-Task1

A non-causal lyrics/accompaniment separation model for the CAD2-Task1 baseline system.

* Architecture: ConvTasNet (Kaituo Xu) with multichannel support (Alexandre Défossez).
* Parameters:
  * B: 256
  * C: 2
  * H: 512
  * L: 20
  * N: 256
  * P: 3
  * R: 4
  * X: 10
  * audio_channels: 2
  * causal: false
  * mask_nonlinear: relu
  * norm_type: gLN
  * training:
    * sample_rate: 44100
    * samples_per_track: 64
    * segment: 5.0
    * aggregate: 2
    * batch_size: 4
    * early_stop: true
    * epochs: 200

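As a rough illustration of what these parameters imply, the sketch below works out the encoder filter length in milliseconds, the number of encoder frames in one training segment, and the receptive field of the separator. It assumes the usual Conv-TasNet layout, which this README does not state: an encoder stride of `L // 2` with no padding, and a dilation of `2**x` in block `x` of each repeat.

```python
# Values copied from the parameter list above.
SAMPLE_RATE = 44100    # Hz
SEGMENT_SECONDS = 5.0  # training segment length
L = 20                 # encoder filter length, in samples
P = 3                  # kernel size of the dilated convolutions
X = 10                 # convolutional blocks per repeat
R = 4                  # number of repeats

# Assumption: the encoder hops by L // 2 samples and applies no padding.
stride = L // 2
segment_samples = round(SEGMENT_SECONDS * SAMPLE_RATE)
frames_per_segment = (segment_samples - L) // stride + 1

# Assumption: block x in each repeat uses dilation 2**x.
receptive_field_frames = 1 + R * (P - 1) * sum(2**x for x in range(X))
receptive_field_samples = (receptive_field_frames - 1) * stride + L

print(f"filter length: {1000 * L / SAMPLE_RATE:.3f} ms")
print(f"frames per {SEGMENT_SECONDS:g} s segment: {frames_per_segment}")
print(f"receptive field: {receptive_field_frames} frames, "
      f"{receptive_field_samples / SAMPLE_RATE:.2f} s")
```

Under these assumptions the receptive field is shorter than the 5 s training segment. Because the model is non-causal, that context extends both before and after the current frame.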
## Dataset

The model was trained on the training split of the MUSDB18-HQ dataset.

## How to use

```
# ConvTasNetStereo is imported from the `tasnet` module, which must be on the Python path.
from tasnet import ConvTasNetStereo

# Load the pretrained weights published under this model ID and move the model to the CPU.
model = ConvTasNetStereo.from_pretrained(
    "cadenzachallenge/ConvTasNet_LyricsSeparation_NonCausal"
).cpu()
```

## Results

| Track | Vocals SDR (dB) | Accompaniment SDR (dB) |
|:------|:------------:|:---------:|
| Al James - Schoolboy Facination | 6.841 | 9.074 |
| AM Contra - Heart Peripheral | 6.948 | 14.105 |
| Angels In Amplifiers - I'm Alright | 7.358 | 10.859 |
| Arise - Run Run Run | 6.105 | 16.806 |
| Ben Carrigan - We'll Talk About It All Tonight | 2.853 | 10.181 |
| BKS - Bulldozer | 1.909 | 13.944 |
| BKS - Too Much | 8.615 | 13.212 |
| Bobby Nobody - Stitch Up | 7.948 | 12.685 |
| Buitraker - Revo X | 4.609 | 14.61 |
| Carlos Gonzalez - A Place For Us | 4.235 | 8.888 |
| Cristina Vane - So Easy | 8.759 | 13.639 |
| Detsky Sad - Walkie Talkie | 7.732 | 10.844 |
| Enda Reilly - Cur An Long Ag Seol | 9.603 | 13.723 |
| Forkupines - Semantics | 4.955 | 11.561 |
| Georgia Wonder - Siren | 4.124 | 8.578 |
| Girls Under Glass - We Feel Alright | 4.38 | 12.272 |
| Hollow Ground - Ill Fate | 7.046 | 16.299 |
| James Elder & Mark M Thompson - The English Actor | 4.694 | 9.638 |
| Juliet's Rescue - Heartbeats | 6.281 | 14.409 |
| Little Chicago's Finest - My Own | 6.313 | 6.603 |
| Louis Cressy Band - Good Time | 6.501 | 12.016 |
| Lyndsey Ollard - Catching Up | 9.18 | 12.116 |
| M.E.R.C. Music - Knockout | 6.619 | 8.507 |
| Moosmusic - Big Dummy Shake | 8.097 | 14.578 |
| Motor Tapes - Shore | 0.769 | 10.137 |
| Mu - Too Bright | 5.853 | 13.135 |
| Nerve 9 - Pray For The Rain | 6.425 | 14.427 |
| PR - Happy Daze | 0 | 51.092 |
| PR - Oh No | 0 | 9.021 |
| Punkdisco - Oral Hygiene | 5.725 | 17.681 |
| Raft Monk - Tiring | 2.378 | 9.244 |
| Sambasevam Shanmugam - Kaathaadi | 8.164 | 10.588 |
| Secretariat - Borderline | 5.522 | 10.817 |
| Secretariat - Over The Top | 7.859 | 14.996 |
| Side Effects Project - Sing With Me | 11.197 | 12.63 |
| Signe Jakobsen - What Have You Done To Me | 7.685 | 11.013 |
| Skelpolu - Resurrection | 0 | 7.603 |
| Speak Softly - Broken Man | 3.997 | 14.516 |
| Speak Softly - Like Horses | 6.462 | 9.426 |
| The Doppler Shift - Atrophy | 0.711 | 14.241 |
| The Easton Ellises - Falcon 69 | 2.401 | 7.889 |
| The Easton Ellises (Baumi) - SDRNR | 1.479 | 7.948 |
| The Long Wait - Dark Horses | 6.53 | 12.661 |
| The Mountaineering Club - Mallory | 10.665 | 15.311 |
| The Sunshine Garcia Band - For I Am The Moon | 9.591 | 13.297 |
| Timboz - Pony | 4.025 | 14.271 |
| Tom McKenzie - Directions | 8.031 | 16.129 |
| Triviul feat. The Fiend - Widow | 7.061 | 8.168 |
| We Fell From The Sky - Not You | 3.862 | 11.685 |
| Zeno - Signs | 6.364 | 11.552 |
| **Total (median over frames, median over tracks)** | **6.338** | **12.194** |
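The "median over frames, median over tracks" total can be read as a two-stage aggregation: each track's score is the median of its per-frame SDR values, and the total is the median of those per-track scores. The sketch below illustrates that reading on made-up numbers. The helper names, the toy data and the choice to skip NaN frames are all assumptions for illustration, not the evaluation code behind the table.

```python
import math
from statistics import median


def nanmedian(values):
    """Median of the non-NaN entries, or NaN if none remain."""
    finite = [v for v in values if not math.isnan(v)]
    return median(finite) if finite else math.nan


def aggregate_sdr(frame_sdr_per_track):
    """Return (per-track medians, median of those medians)."""
    track_scores = {
        track: nanmedian(frames) for track, frames in frame_sdr_per_track.items()
    }
    overall = nanmedian(list(track_scores.values()))
    return track_scores, overall


# Hypothetical per-frame SDR values in dB (not taken from the table).
toy_frames = {
    "track_a": [5.0, 7.0, 6.0],
    "track_b": [1.0, float("nan"), 3.0],  # NaN marks an undefined frame
    "track_c": [10.0, 12.0, 11.0, 13.0],
}

track_scores, overall = aggregate_sdr(toy_frames)
print(track_scores)
print(overall)
```

Using medians at both stages keeps a single extreme track from dominating the total, which matters for a table like this one that contains both very high and zero values.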