# wmt22-cometkiwi-da-int8
A compressed version of Unbabel/wmt22-cometkiwi-da — a reference-free machine-translation quality estimation model (source + MT only, no human reference required).
Effectively lossless compression: no loss in Pearson correlation with human DA, and roughly 40% smaller on disk from quantization and fp16 storage alone, with no layer pruning.
## What's different from the base model
- No layer pruning — all 24 XLM-R encoder layers are retained.
- Compression comes entirely from dynamic int8 quantization of the XLM-R encoder plus fp16 on-disk storage (weights are cast back to fp32 at load, before quantization); see the sketch after this list.
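A minimal sketch of those two steps, assuming PyTorch's `torch.ao.quantization.quantize_dynamic` and that the COMET model exposes its Hugging Face encoder at `model.encoder.model` (the helper names below are illustrative, not this repo's export code):

```python
import torch

def save_fp16_state_dict(model, path="state_dict.pt"):
    # fp16 storage: roughly halves the checkpoint size on disk.
    # Non-floating-point buffers (e.g. position ids) are kept as-is.
    state = {k: (v.half() if v.is_floating_point() else v)
             for k, v in model.state_dict().items()}
    torch.save(state, path)

def quantize_encoder_int8(model):
    # Dynamic int8 quantization of the encoder's Linear layers, applied at
    # load time after the fp16 weights have been cast back to fp32.
    model.encoder.model = torch.ao.quantization.quantize_dynamic(
        model.encoder.model, {torch.nn.Linear}, dtype=torch.qint8
    )
    return model
```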
## Accuracy
Benchmarked on 1200 stratified segments from RicardoRei/wmt-da-human-evaluation (reference-free, src+mt only):
| Metric | This variant | Full cometkiwi |
|---|---|---|
| Pearson r vs human DA | 0.6404 | 0.6402 |
| Spearman vs human DA | 0.6703 | 0.6698 |
| Pearson r vs full | 0.9919 | 1.0000 |
| MAE vs full | 0.0138 | 0.0000 |
| Params | 565.1M | 565.1M |
| On-disk size | ~1130 MB | ~2200 MB |
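The correlation and error figures above can be reproduced along the following lines, assuming parallel lists of segment-level scores from this variant, the full model, and the human DA annotations (variable names are placeholders):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def qe_metrics(variant_scores, full_scores, human_da):
    # Segment-level correlations against human DA and against the full model,
    # plus mean absolute error of this variant's scores vs. the full model.
    variant = np.asarray(variant_scores, dtype=float)
    full = np.asarray(full_scores, dtype=float)
    human = np.asarray(human_da, dtype=float)
    return {
        "pearson_vs_human": pearsonr(variant, human)[0],
        "spearman_vs_human": spearmanr(variant, human)[0],
        "pearson_vs_full": pearsonr(variant, full)[0],
        "mae_vs_full": float(np.mean(np.abs(variant - full))),
    }
```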
## All variants at a glance
| Variant | Pearson(human) | Pearson(full) | Size | When to use |
|---|---|---|---|---|
| full base | 0.6402 | 1.0000 | ~2200 MB | reference quality |
| -int8 | 0.6404 | 0.9919 | ~1300 MB | lossless compression |
| -pruned-k2 | 0.6300 | 0.9784 | ~2100 MB | best-quality pruned |
| -pruned-k4 | 0.5642 | 0.8316 | ~2060 MB | aggressive prune |
| -pruned-k4-xs | 0.5544 | 0.8113 | ~1030 MB | smallest footprint |
## Usage
Standalone — no gated base-model download. The repo ships everything the loader needs (`hparams.yaml` + `state_dict.pt`); the loader instantiates an empty COMET architecture via `load_pretrained_weights=False` and overlays the fine-tuned weights. Only the ungated `microsoft/infoxlm-large` tokenizer/config (~5 MB) is fetched on first load and cached.
```python
# pip install "unbabel-comet" "setuptools<81" huggingface_hub pyyaml
import sys

from huggingface_hub import snapshot_download

# Download this repo and put the bundled standalone loader on the path.
folder = snapshot_download(repo_id="solailabs/wmt22-cometkiwi-da-int8")
sys.path.insert(0, folder)

from load import load_model

model = load_model(folder)
out = model.predict(
    [{"src": "The meeting has been postponed until next week.",
      "mt": "La réunion a été reportée à la semaine prochaine."}],
    batch_size=8, gpus=0, progress_bar=False, num_workers=2,
)
print(out["scores"])
```
No `HF_TOKEN` is required, and no license acceptance on Unbabel/wmt22-cometkiwi-da is needed.
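For the curious, `load.py` amounts to roughly the following. This is a hedged approximation, not the shipped implementation; it assumes the checkpoint uses COMET's `UnifiedMetric` class and that the hyper-parameters in `hparams.yaml` map onto its constructor arguments:

```python
import torch
import yaml
from comet.models import UnifiedMetric

def load_model_sketch(folder: str):
    with open(f"{folder}/hparams.yaml") as f:
        hparams = yaml.safe_load(f)

    # Build the architecture only; load_pretrained_weights=False avoids
    # downloading the gated base checkpoint (only the ungated
    # microsoft/infoxlm-large tokenizer/config is fetched and cached).
    hparams["load_pretrained_weights"] = False
    model = UnifiedMetric(**hparams)

    # Overlay the fine-tuned weights shipped in this repo (stored in fp16,
    # cast back to fp32 before quantization).
    state = torch.load(f"{folder}/state_dict.pt", map_location="cpu")
    model.load_state_dict(
        {k: (v.float() if v.is_floating_point() else v) for k, v in state.items()},
        strict=False,
    )

    # Re-apply dynamic int8 quantization to the encoder (see the sketch in
    # "What's different from the base model").
    model.encoder.model = torch.ao.quantization.quantize_dynamic(
        model.encoder.model, {torch.nn.Linear}, dtype=torch.qint8
    )
    return model.eval()
```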
## Files
- `state_dict.pt` — model weights (fp32 for `-pruned-k2`/`-pruned-k4`, fp16 for `-int8`/`-pruned-k4-xs`)
- `hparams.yaml` — COMET hyper-parameters (encoder model, regressor shape, loss config)
- `config.json` — kept/dropped layer indices, quant flag, benchmarked accuracy
- `load.py` — drop-in standalone loader
- `README.md` — this file
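To inspect the shipped metadata without loading the full model, both small files can be read directly (`folder` is the snapshot path from the Usage section):

```python
import json
import yaml

with open(f"{folder}/config.json") as f:
    meta = json.load(f)          # kept/dropped layer indices, quant flag, benchmarked accuracy
with open(f"{folder}/hparams.yaml") as f:
    hparams = yaml.safe_load(f)  # COMET hyper-parameters (encoder model, regressor shape, ...)

print(sorted(meta))
print(sorted(hparams))
```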
## Citation
Base model: Unbabel/wmt22-cometkiwi-da by Unbabel.
```bibtex
@inproceedings{rei-etal-2022-cometkiwi,
  title = "{C}omet{K}iwi: {IST}-{U}nbabel 2022 Submission for the Quality Estimation Shared Task",
  author = "Rei, Ricardo and others",
  booktitle = "WMT 2022",
}
```
Released under the same license as the base model (Apache 2.0).