Sentence Similarity
sentence-transformers
Safetensors
Hungarian
modernbert
feature-extraction
Generated from Trainer
dataset_size:1207229
loss:MultipleNegativesRankingLoss
Eval Results (legacy)
text-embeddings-inference
Instructions to use karsar/ModernBERT-base-hu_v3 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use karsar/ModernBERT-base-hu_v3 with sentence-transformers:
from sentence_transformers import SentenceTransformer model = SentenceTransformer("karsar/ModernBERT-base-hu_v3") sentences = [ "a fara név azt jelenti", "Utcafesztiváltól, egyhetes fesztiválon át havas fesztiválig! Ezek a fesztiválok lehetőséget kínálnak arra, hogy egy helyszínen megkóstolják a Bend sörfőzdék kínálatát – valójában több helyszínen egész évben! Kóstoljon meg egy jó főzetet, hallgasson néhány jó dallamot, és találkozzon a sörfőzőkkel! Itt a sör ideje!", "Fara /fara/ [2 szótag.] mint lánynév középangol és arab eredetű, a Fara név jelentése kedves, kellemes. A Fara a Farrah (közép angol, arab) változata: az angol fair szóból származik. Hasonlítsa össze a Fura vezetéknevet.", "A Fara név angolul azt jelenti: utazó. A Fara név angol névből származik. A Fara nevet leggyakrabban lánynévként vagy női névként használják." ] embeddings = model.encode(sentences) similarities = model.similarity(embeddings, embeddings) print(similarities.shape) # [4, 4] - Inference
- Notebooks
- Google Colab
- Kaggle
| { | |
| "added_tokens_decoder": { | |
| "0": { | |
| "content": "|||IP_ADDRESS|||", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "1": { | |
| "content": "<|padding|>", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "50254": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50255": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50256": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50257": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50258": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50259": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50260": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50261": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50262": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50263": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50264": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50265": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50266": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50267": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50268": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50269": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50270": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50271": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50272": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50273": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50274": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50275": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50276": { | |
| "content": " ", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50277": { | |
| "content": "|||EMAIL_ADDRESS|||", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50278": { | |
| "content": "|||PHONE_NUMBER|||", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50279": { | |
| "content": "<|endoftext|>", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "50280": { | |
| "content": "[UNK]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "50281": { | |
| "content": "[CLS]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "50282": { | |
| "content": "[SEP]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "50283": { | |
| "content": "[PAD]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "50284": { | |
| "content": "[MASK]", | |
| "lstrip": true, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "50285": { | |
| "content": "[unused0]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50286": { | |
| "content": "[unused1]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50287": { | |
| "content": "[unused2]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50288": { | |
| "content": "[unused3]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50289": { | |
| "content": "[unused4]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50290": { | |
| "content": "[unused5]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50291": { | |
| "content": "[unused6]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50292": { | |
| "content": "[unused7]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50293": { | |
| "content": "[unused8]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50294": { | |
| "content": "[unused9]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50295": { | |
| "content": "[unused10]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50296": { | |
| "content": "[unused11]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50297": { | |
| "content": "[unused12]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50298": { | |
| "content": "[unused13]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50299": { | |
| "content": "[unused14]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50300": { | |
| "content": "[unused15]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50301": { | |
| "content": "[unused16]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50302": { | |
| "content": "[unused17]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50303": { | |
| "content": "[unused18]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50304": { | |
| "content": "[unused19]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50305": { | |
| "content": "[unused20]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50306": { | |
| "content": "[unused21]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50307": { | |
| "content": "[unused22]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50308": { | |
| "content": "[unused23]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50309": { | |
| "content": "[unused24]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50310": { | |
| "content": "[unused25]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50311": { | |
| "content": "[unused26]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50312": { | |
| "content": "[unused27]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50313": { | |
| "content": "[unused28]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50314": { | |
| "content": "[unused29]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50315": { | |
| "content": "[unused30]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50316": { | |
| "content": "[unused31]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50317": { | |
| "content": "[unused32]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50318": { | |
| "content": "[unused33]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50319": { | |
| "content": "[unused34]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50320": { | |
| "content": "[unused35]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50321": { | |
| "content": "[unused36]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50322": { | |
| "content": "[unused37]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50323": { | |
| "content": "[unused38]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50324": { | |
| "content": "[unused39]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50325": { | |
| "content": "[unused40]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50326": { | |
| "content": "[unused41]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50327": { | |
| "content": "[unused42]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50328": { | |
| "content": "[unused43]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50329": { | |
| "content": "[unused44]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50330": { | |
| "content": "[unused45]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50331": { | |
| "content": "[unused46]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50332": { | |
| "content": "[unused47]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50333": { | |
| "content": "[unused48]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50334": { | |
| "content": "[unused49]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50335": { | |
| "content": "[unused50]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50336": { | |
| "content": "[unused51]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50337": { | |
| "content": "[unused52]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50338": { | |
| "content": "[unused53]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50339": { | |
| "content": "[unused54]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50340": { | |
| "content": "[unused55]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50341": { | |
| "content": "[unused56]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50342": { | |
| "content": "[unused57]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50343": { | |
| "content": "[unused58]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50344": { | |
| "content": "[unused59]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50345": { | |
| "content": "[unused60]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50346": { | |
| "content": "[unused61]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50347": { | |
| "content": "[unused62]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50348": { | |
| "content": "[unused63]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50349": { | |
| "content": "[unused64]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50350": { | |
| "content": "[unused65]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50351": { | |
| "content": "[unused66]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50352": { | |
| "content": "[unused67]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50353": { | |
| "content": "[unused68]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50354": { | |
| "content": "[unused69]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50355": { | |
| "content": "[unused70]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50356": { | |
| "content": "[unused71]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50357": { | |
| "content": "[unused72]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50358": { | |
| "content": "[unused73]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50359": { | |
| "content": "[unused74]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50360": { | |
| "content": "[unused75]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50361": { | |
| "content": "[unused76]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50362": { | |
| "content": "[unused77]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50363": { | |
| "content": "[unused78]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50364": { | |
| "content": "[unused79]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50365": { | |
| "content": "[unused80]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50366": { | |
| "content": "[unused81]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| }, | |
| "50367": { | |
| "content": "[unused82]", | |
| "lstrip": false, | |
| "normalized": true, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": false | |
| } | |
| }, | |
| "clean_up_tokenization_spaces": true, | |
| "cls_token": "[CLS]", | |
| "extra_special_tokens": {}, | |
| "mask_token": "[MASK]", | |
| "model_input_names": [ | |
| "input_ids", | |
| "attention_mask" | |
| ], | |
| "model_max_length": 8192, | |
| "pad_token": "[PAD]", | |
| "sep_token": "[SEP]", | |
| "tokenizer_class": "PreTrainedTokenizerFast", | |
| "unk_token": "[UNK]" | |
| } | |