train_qnli_42_1779286680

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0523
  • Num Input Tokens Seen: 11312256

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.0929 0.0501 590 0.0807 571072
0.1054 0.1001 1180 0.0708 1136384
0.1201 0.1502 1770 0.0836 1703808
0.1436 0.2003 2360 0.0888 2266496
0.0749 0.2503 2950 0.0761 2827328
0.0141 0.3004 3540 0.0862 3399808
0.0051 0.3505 4130 0.0710 3963584
0.0782 0.4005 4720 0.0551 4530304
0.05 0.4506 5310 0.0634 5095424
0.0293 0.5007 5900 0.0550 5660352
0.0534 0.5507 6490 0.0558 6232896
0.0467 0.6008 7080 0.0598 6801984
0.0404 0.6509 7670 0.0556 7363968
0.0633 0.7010 8260 0.0546 7924800
0.0632 0.7510 8850 0.0540 8494720
0.1023 0.8011 9440 0.0547 9066048
0.0665 0.8512 10030 0.0526 9634624
0.0855 0.9012 10620 0.0523 10199424
0.004 0.9513 11210 0.0523 10764096

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Downloads last month
234
Safetensors
Model size
1B params
Tensor type
F32
·
BF16
·
Inference Providers NEW
Input a message to start chatting with rbelanec/train_qnli_42_1779286680.

Model tree for rbelanec/train_qnli_42_1779286680

Finetuned
(1748)
this model