How to use WasuratS/distilhubert-finetuned-gtzan with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("audio-classification", model="WasuratS/distilhubert-finetuned-gtzan")

# Load model directly
from transformers import AutoProcessor, AutoModelForAudioClassification

processor = AutoProcessor.from_pretrained("WasuratS/distilhubert-finetuned-gtzan")
model = AutoModelForAudioClassification.from_pretrained("WasuratS/distilhubert-finetuned-gtzan")
```

distilhubert-finetuned-gtzan
This model is a fine-tuned version of ntu-spml/distilhubert on the GTZAN dataset. At its best epoch (epoch 18 in the training results table), it achieves the following results on the evaluation set:
- Loss: 0.7305
- Accuracy: 0.9
Model description
DistilHuBERT is a distilled version of HuBERT, pretrained on audio sampled at 16 kHz.
The model is an encoder-only transformer. Encoder-only audio models are often trained with Connectionist Temporal Classification (CTC) for speech recognition; for genre classification, the encoder output feeds a classification head instead.
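An encoder-only audio model produces one hidden vector per audio frame, so a clip-level prediction needs those frames reduced to a single vector. The sketch below illustrates one common way to do that, assuming mean pooling over frames followed by a single linear layer and a softmax. The toy dimensions, weights, and function names are hypothetical and are not taken from this checkpoint.

```python
import math

def mean_pool(frames):
    """Average per-frame hidden vectors into one clip-level vector."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]

def linear(vec, weights, bias):
    """Apply one linear layer: one output logit per row of weights."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Turn logits into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy values (hypothetical): 3 frames, hidden size 2, 2 classes.
frames = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.5]

pooled = mean_pool(frames)
logits = linear(pooled, weights, bias)
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

In the real model the hidden size and the number of classes (10 genres) are much larger, but the reduction from many frames to one label follows the same shape.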
Training and evaluation data
The training and evaluation data come from GTZAN, a popular dataset of 999 songs for music genre classification.
Each song is a 30-second clip from one of 10 genres of music, spanning disco to metal.
The training set contains 899 songs and the evaluation set contains the remaining 100.
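The card does not say how the split was produced. As a rough illustration only, the sketch below shows that a shuffled 10% hold-out of 999 clips yields the 899/100 sizes reported above, and how many samples a 30-second clip holds at 16 kHz. `TEST_FRACTION`, `SEED`, and `split_indices` are assumptions made for this example, and the resulting indices will not match the actual split.

```python
import math
import random

NUM_CLIPS = 999        # clips in the dataset, per this card
TEST_FRACTION = 0.1    # assumption: a 10% hold-out
SEED = 42              # assumption: same value as the training seed
SAMPLE_RATE = 16_000   # Hz, the rate DistilHuBERT was pretrained on
CLIP_SECONDS = 30      # clip length, per this card

def split_indices(n, test_fraction, seed):
    """Shuffle indices 0..n-1 and hold out ceil(n * test_fraction) of them."""
    n_test = math.ceil(n * test_fraction)
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]

train_idx, eval_idx = split_indices(NUM_CLIPS, TEST_FRACTION, SEED)
samples_per_clip = SAMPLE_RATE * CLIP_SECONDS

print(len(train_idx), len(eval_idx), samples_per_clip)
```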
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 35
- mixed_precision_training: Native AMP
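To make the scheduler settings concrete, here is a minimal sketch of a linear schedule with warmup, assuming the learning rate rises linearly from 0 to the base rate over the first 10% of steps and then decays linearly to 0. The total of 7875 steps is taken from the training results table (35 epochs of 225 steps); the function name and the rounding of the warmup length are assumptions for this example.

```python
import math

BASE_LR = 5e-5
WARMUP_RATIO = 0.1
TOTAL_STEPS = 7875  # 35 epochs x 225 steps, from the training results table

def linear_schedule_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
                       warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step (hypothetical helper)."""
    # Assumption: the warmup length is rounded up to a whole step.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)

for step in (0, 394, 788, 4000, 7875):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the peak rate of 5e-05 is reached at step 788, partway through epoch 4.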
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 2.1728 | 1.0 | 225 | 2.0896 | 0.42 |
| 1.4211 | 2.0 | 450 | 1.4951 | 0.55 |
| 1.2155 | 3.0 | 675 | 1.0669 | 0.72 |
| 1.0175 | 4.0 | 900 | 0.8862 | 0.69 |
| 0.3516 | 5.0 | 1125 | 0.6265 | 0.83 |
| 0.6135 | 6.0 | 1350 | 0.6485 | 0.78 |
| 0.0807 | 7.0 | 1575 | 0.6567 | 0.78 |
| 0.0303 | 8.0 | 1800 | 0.7615 | 0.83 |
| 0.2663 | 9.0 | 2025 | 0.6612 | 0.86 |
| 0.0026 | 10.0 | 2250 | 0.8354 | 0.85 |
| 0.0337 | 11.0 | 2475 | 0.6768 | 0.87 |
| 0.0013 | 12.0 | 2700 | 0.7718 | 0.87 |
| 0.001 | 13.0 | 2925 | 0.7570 | 0.88 |
| 0.0008 | 14.0 | 3150 | 0.8170 | 0.89 |
| 0.0006 | 15.0 | 3375 | 0.7920 | 0.89 |
| 0.0005 | 16.0 | 3600 | 0.9859 | 0.83 |
| 0.0004 | 17.0 | 3825 | 0.8190 | 0.9 |
| 0.0003 | 18.0 | 4050 | 0.7305 | 0.9 |
| 0.0003 | 19.0 | 4275 | 0.8025 | 0.88 |
| 0.0002 | 20.0 | 4500 | 0.8208 | 0.87 |
| 0.0003 | 21.0 | 4725 | 0.7358 | 0.88 |
| 0.0002 | 22.0 | 4950 | 0.8681 | 0.87 |
| 0.0002 | 23.0 | 5175 | 0.7831 | 0.9 |
| 0.0003 | 24.0 | 5400 | 0.8583 | 0.88 |
| 0.0002 | 25.0 | 5625 | 0.8138 | 0.88 |
| 0.0002 | 26.0 | 5850 | 0.7871 | 0.89 |
| 0.0002 | 27.0 | 6075 | 0.8893 | 0.88 |
| 0.0002 | 28.0 | 6300 | 0.8284 | 0.89 |
| 0.0001 | 29.0 | 6525 | 0.8388 | 0.89 |
| 0.0001 | 30.0 | 6750 | 0.8305 | 0.9 |
| 0.0001 | 31.0 | 6975 | 0.8377 | 0.88 |
| 0.0153 | 32.0 | 7200 | 0.8496 | 0.88 |
| 0.0001 | 33.0 | 7425 | 0.8381 | 0.88 |
| 0.0001 | 34.0 | 7650 | 0.8440 | 0.88 |
| 0.0001 | 35.0 | 7875 | 0.8458 | 0.88 |
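The card does not state how the best epoch was chosen. The sketch below shows one selection rule that is consistent with the table: take the highest accuracy and break ties by the lowest validation loss. The rule itself is an assumption; the rows are copied from the table above as (epoch, validation loss, accuracy).

```python
# (epoch, validation_loss, accuracy), copied from the training results table.
rows = [
    (1, 2.0896, 0.42), (2, 1.4951, 0.55), (3, 1.0669, 0.72),
    (4, 0.8862, 0.69), (5, 0.6265, 0.83), (6, 0.6485, 0.78),
    (7, 0.6567, 0.78), (8, 0.7615, 0.83), (9, 0.6612, 0.86),
    (10, 0.8354, 0.85), (11, 0.6768, 0.87), (12, 0.7718, 0.87),
    (13, 0.7570, 0.88), (14, 0.8170, 0.89), (15, 0.7920, 0.89),
    (16, 0.9859, 0.83), (17, 0.8190, 0.9), (18, 0.7305, 0.9),
    (19, 0.8025, 0.88), (20, 0.8208, 0.87), (21, 0.7358, 0.88),
    (22, 0.8681, 0.87), (23, 0.7831, 0.9), (24, 0.8583, 0.88),
    (25, 0.8138, 0.88), (26, 0.7871, 0.89), (27, 0.8893, 0.88),
    (28, 0.8284, 0.89), (29, 0.8388, 0.89), (30, 0.8305, 0.9),
    (31, 0.8377, 0.88), (32, 0.8496, 0.88), (33, 0.8381, 0.88),
    (34, 0.8440, 0.88), (35, 0.8458, 0.88),
]

# Assumed rule: highest accuracy first, then lowest validation loss.
best = max(rows, key=lambda r: (r[2], -r[1]))

# For comparison: the epoch with the lowest validation loss overall.
lowest_loss = min(rows, key=lambda r: r[1])

print("best by accuracy, then loss:", best)
print("lowest validation loss:", lowest_loss)
```

Note that the lowest validation loss occurs much earlier than the best accuracy, while training loss keeps falling toward zero, so the choice of selection metric changes which checkpoint counts as best.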
Framework versions
- Transformers 4.29.2
- PyTorch 1.13.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3