Integrate with Sentence Transformers v5.4

#1
by tomaarsen HF Staff - opened

Hello!

Pull Request overview

  • Integrate this model with Sentence Transformers v5.4+ so it can be loaded via CrossEncoder("ContextualAI/ctxl-rerank-v2-instruct-multilingual-2b").

Details

This is the 2B sibling PR to https://huggingface.co/ContextualAI/ctxl-rerank-v2-instruct-multilingual-1b/discussions/2, with the same changes in place.

import torch
from sentence_transformers import CrossEncoder

model = CrossEncoder("ContextualAI/ctxl-rerank-v2-instruct-multilingual-2b", model_kwargs={"dtype": torch.bfloat16}, revision="refs/pr/1")

query = "What are the health benefits of exercise?"
instruction = "Prioritize recent medical research"
documents = [
    "Regular exercise reduces risk of heart disease and improves mental health.",
    "A 2024 study shows exercise enhances cognitive function in older adults.",
    "Ancient Greeks valued physical fitness for military training.",
]

pairs = [(query, doc) for doc in documents]
scores = model.predict(pairs, prompt=instruction)
print(scores)
# [-2.484375  0.828125 -9.3125  ]

rankings = model.rank(query, documents, prompt=instruction)
print(rankings)
# [{'corpus_id': 1, 'score': np.float32(0.828125)}, {'corpus_id': 0, 'score': np.float32(-2.484375)}, {'corpus_id': 2, 'score': np.float32(-9.3125)}]

You can run this right away thanks to the revision argument. After merging, the revision argument is no longer needed.
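To make the link between the two printed outputs explicit: the rank output above is the predict scores sorted from highest to lowest, each tagged with the index of its document. Here is a minimal sketch of that ordering, using only NumPy and the scores printed above. The helper name rank_from_scores is hypothetical, and this is an illustration of the ordering rather than the library's actual implementation.

```python
import numpy as np

# Scores copied from the predict() output printed in this PR.
scores = np.array([-2.484375, 0.828125, -9.3125], dtype=np.float32)


def rank_from_scores(scores):
    """Hypothetical helper: order document indices by descending score."""
    # Negate so that argsort (ascending) yields the highest score first.
    order = np.argsort(-scores, kind="stable")
    return [{"corpus_id": int(i), "score": float(scores[i])} for i in order]


rankings = rank_from_scores(scores)
print([r["corpus_id"] for r in rankings])
# [1, 0, 2]
```

The 2024 study (index 1) comes first, which matches the instruction to prioritize recent medical research.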

Note that none of the old behaviour is affected or changed: this only adds an additional way to run the model in a familiar and common format. The raw AutoModelForCausalLM and vLLM paths already documented in the README continue to work unchanged, and the Sentence Transformers path produces identical bfloat16 scores to those paths for every sample tested (0.0 diff vs. the README's Transformers baseline on 3/3 examples, with and without an instruction).
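A "0.0 diff" check like the one described can be sketched as a maximum absolute difference between two score arrays. The helper max_abs_diff is hypothetical, and the baseline list below is only a stand-in that reuses the scores printed in this PR; in a real check it would hold the scores from the README's Transformers path.

```python
import numpy as np


def max_abs_diff(a, b):
    """Hypothetical helper: largest element-wise absolute difference."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(np.max(np.abs(a - b)))


# Scores printed by the Sentence Transformers path in this PR.
st_scores = [-2.484375, 0.828125, -9.3125]
# Stand-in only: replace with scores from the README's Transformers baseline.
baseline_scores = list(st_scores)

diff = max_abs_diff(st_scores, baseline_scores)
print(diff)
```

An exact 0.0 is a stricter bar than a tolerance-based comparison such as np.allclose, and suits a claim that the scores are identical.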

Happy to tweak anything you'd like changed. Please let me know if you have any questions or feedback!

  • Tom Aarsen
tomaarsen changed pull request status to open
sheshansh-ctx changed pull request status to merged
