Automatic Speech Recognition
Transformers
Safetensors
English
whisper
w4a16
int4
vllm
audio
compressed-tensors
How to use RedHatAI/whisper-medium-quantized.w4a16 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="RedHatAI/whisper-medium-quantized.w4a16")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("RedHatAI/whisper-medium-quantized.w4a16")
model = AutoModelForSpeechSeq2Seq.from_pretrained("RedHatAI/whisper-medium-quantized.w4a16")
```
```yaml
DEFAULT_stage:
  DEFAULT_modifiers:
    GPTQModifier:
      sequential_targets: [WhisperEncoderLayer, WhisperDecoderLayer]
      dampening_frac: 0.01
      config_groups:
        config_group:
          targets: [Linear]
          weights: {num_bits: 4, type: int, symmetric: true, group_size: 64, strategy: group,
            dynamic: false, actorder: weight, observer: minmax}
      targets: Linear
      ignore: ['re:.*lm_head']
```
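To make the `weights` settings in the recipe more concrete, here is a minimal sketch of symmetric 4-bit group quantization with `group_size: 64` and a min-max style scale. It is plain round-to-nearest, not GPTQ: the error compensation that `GPTQModifier` performs (and the `actorder` and `dampening_frac` settings) is left out. The helper names, the `scale = max|w| / 7` rule, and the 16-bit scale storage are assumptions made for illustration; the library's exact formulas may differ.

```python
import random

GROUP_SIZE = 64      # group_size from the recipe
QMIN, QMAX = -8, 7   # signed 4-bit integer range (num_bits: 4, type: int)


def quantize_group(weights):
    """Round-to-nearest symmetric quantization of one group (hypothetical helper)."""
    absmax = max(abs(w) for w in weights)
    # Assumption: the scale maps the largest magnitude in the group onto QMAX.
    # Symmetric quantization needs no zero point.
    scale = absmax / QMAX if absmax > 0 else 1.0
    codes = [min(QMAX, max(QMIN, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def quantize_row(row, group_size=GROUP_SIZE):
    """Split one weight row into groups, each quantized with its own scale."""
    groups = [row[i:i + group_size] for i in range(0, len(row), group_size)]
    return [quantize_group(g) for g in groups]


def bits_per_weight(num_bits=4, group_size=GROUP_SIZE, scale_bits=16):
    """Average storage per weight, assuming one 16-bit scale per group."""
    return num_bits + scale_bits / group_size


# Toy weight row standing in for one row of a Linear layer.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(256)]

quantized = quantize_row(row)
restored = [v for codes, scale in quantized for v in dequantize_group(codes, scale)]
max_error = max(abs(a - b) for a, b in zip(row, restored))

print(f"groups: {len(quantized)}, max abs error: {max_error:.6f}")
print(f"approx. bits per weight: {bits_per_weight()}")
```

Because each group of 64 weights has its own scale, a single outlier only degrades the resolution of its own group rather than the whole row. Under the storage assumption above, this costs about 0.25 extra bits per weight on top of the 4-bit codes.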