Spanish BERT (BETO) + Syntax POS tagging
This model is a fine-tuned version of the Spanish BERT (BETO) on the Spanish syntax annotations in the CONLL Corpora dataset, for the syntax POS (part-of-speech) tagging downstream task.
Details of the downstream task (Syntax POS) - Dataset
The model was fine-tuned with the NER fine-tuning script provided by Hugging Face.
21 Syntax annotations (Labels) covered:
- _
- ATR
- ATR.d
- CAG
- CC
- CD
- CD.Q
- CI
- CPRED
- CPRED.CD
- CPRED.SUJ
- CREG
- ET
- IMPERS
- MOD
- NEG
- PASS
- PUNC
- ROOT
- SUJ
- VOC
Metrics on test set
| Metric | Score |
|---|---|
| F1 | 89.27 |
| Precision | 89.44 |
| Recall | 89.11 |
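As a quick consistency check, the reported F1 can be compared with the harmonic mean of the reported precision and recall. This is a minimal sketch that assumes the card's F1 follows that standard definition; the helper name `f1_from_precision_recall` is illustrative and not part of any library.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the metrics table above.
reported = {"F1": 89.27, "Precision": 89.44, "Recall": 89.11}

recomputed_f1 = f1_from_precision_recall(reported["Precision"], reported["Recall"])

# Compare the recomputed value with the reported one at two decimals.
print(f"reported F1:   {reported['F1']:.2f}")
print(f"recomputed F1: {recomputed_f1:.2f}")
```

Under that assumption the two values agree to two decimal places, so the three figures in the table are mutually consistent.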
Model in action
Fast usage with pipelines 🧪
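from transformers import pipeline

# "ner" is the pipeline alias for token classification.
nlp_pos_syntax = pipeline(
    "ner",
    model="mrm8488/bert-spanish-cased-finetuned-pos-syntax",
    tokenizer="mrm8488/bert-spanish-cased-finetuned-pos-syntax",
)
text = "Mis amigos están pensando viajar a Londres este verano."
# Run the model once and reuse the result instead of calling the pipeline twice.
predictions = nlp_pos_syntax(text)
# Drop the first and last items, as in the original call (presumably the [CLS]
# and [SEP] special tokens); skip the slice if your transformers version
# already omits special tokens from the output.
predictions[1:-1]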
[
{ "entity": "_", "score": 0.9999216794967651, "word": "Mis" },
{ "entity": "SUJ", "score": 0.999882698059082, "word": "amigos" },
{ "entity": "_", "score": 0.9998869299888611, "word": "están" },
{ "entity": "ROOT", "score": 0.9980518221855164, "word": "pensando" },
{ "entity": "_", "score": 0.9998420476913452, "word": "viajar" },
{ "entity": "CD", "score": 0.999351978302002, "word": "a" },
{ "entity": "_", "score": 0.999959409236908, "word": "Londres" },
{ "entity": "_", "score": 0.9998968839645386, "word": "este" },
{ "entity": "CC", "score": 0.99931401014328, "word": "verano" },
{ "entity": "PUNC", "score": 0.9998534917831421, "word": "." }
]
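The pipeline returns one dictionary per token, and many tokens carry the `_` label, which appears to act as a placeholder for "no syntactic function". The following is a minimal post-processing sketch that keeps only the tokens with a real label. It uses the sample output above as hard-coded input (scores rounded) so that it runs without downloading the model; the helper name `labelled_tokens` is hypothetical.

```python
from typing import Dict, List, Tuple

# Sample predictions copied from the output above (scores rounded for brevity).
predictions: List[Dict[str, object]] = [
    {"entity": "_", "score": 0.9999, "word": "Mis"},
    {"entity": "SUJ", "score": 0.9999, "word": "amigos"},
    {"entity": "_", "score": 0.9999, "word": "están"},
    {"entity": "ROOT", "score": 0.9981, "word": "pensando"},
    {"entity": "_", "score": 0.9998, "word": "viajar"},
    {"entity": "CD", "score": 0.9994, "word": "a"},
    {"entity": "_", "score": 0.9999, "word": "Londres"},
    {"entity": "_", "score": 0.9999, "word": "este"},
    {"entity": "CC", "score": 0.9993, "word": "verano"},
    {"entity": "PUNC", "score": 0.9999, "word": "."},
]


def labelled_tokens(
    preds: List[Dict[str, object]], skip_label: str = "_"
) -> List[Tuple[str, str]]:
    """Return (word, label) pairs, skipping tokens tagged with `skip_label`."""
    return [
        (str(p["word"]), str(p["entity"]))
        for p in preds
        if p["entity"] != skip_label
    ]


# Print one "word<TAB>label" line per labelled token.
for word, label in labelled_tokens(predictions):
    print(f"{word}\t{label}")
```

For longer or rarer words the tokenizer may split a word into several sub-word pieces, so real output can need an extra merging step that this sketch does not cover.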
Created by Manuel Romero/@mrm8488
Made with ♥ in Spain