authorship_model

This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct on the authorship_train dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5754
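
For reference, if this is a mean token-level cross-entropy, the loss corresponds to a perplexity of exp(0.5754) ≈ 1.78.

Since the framework versions below list PEFT, this repository presumably hosts an adapter rather than full model weights. A minimal loading sketch, assuming a standard PEFT adapter on top of the base model (the model ids come from this card; everything else is illustrative):

```python
# Minimal sketch: attach this repo's PEFT adapter to the Qwen2.5 base model.
# Assumes a standard PEFT (e.g. LoRA) adapter; not confirmed by the card.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

# Load the fine-tuned adapter from this repository.
model = PeftModel.from_pretrained(base, "AhmedZaky1/authorship_model")
model.eval()
```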

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 20
  • num_epochs: 1.0
  • mixed_precision_training: Native AMP
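
As a hedged sketch, these settings map onto transformers TrainingArguments roughly as follows. The output path is illustrative, and "Native AMP" is rendered here as fp16, though bf16 is equally possible:

```python
# Sketch of TrainingArguments mirroring the hyperparameters above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="authorship_model",   # illustrative path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # 1 per device x 16 steps = total batch size 16
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=20,
    num_train_epochs=1.0,
    fp16=True,                       # Native AMP; bf16 is an equally plausible choice
)
```

Note that the total train batch size of 16 is the product of the per-device batch size (1) and the gradient accumulation steps (16).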

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.3515        | 0.0486 | 200  | 0.7383          |
| 1.2685        | 0.0972 | 400  | 0.7080          |
| 1.2484        | 0.1458 | 600  | 0.6671          |
| 1.2335        | 0.1945 | 800  | 0.6473          |
| 1.1783        | 0.2431 | 1000 | 0.6368          |
| 1.1715        | 0.2917 | 1200 | 0.6287          |
| 1.2187        | 0.3403 | 1400 | 0.6172          |
| 1.1708        | 0.3889 | 1600 | 0.6101          |
| 1.1623        | 0.4375 | 1800 | 0.6024          |
| 1.1197        | 0.4861 | 2000 | 0.6001          |
| 1.1420        | 0.5348 | 2200 | 0.5934          |
| 1.1861        | 0.5834 | 2400 | 0.5901          |
| 1.1843        | 0.6320 | 2600 | 0.5874          |
| 1.1776        | 0.6806 | 2800 | 0.5833          |
| 1.1156        | 0.7292 | 3000 | 0.5814          |
| 1.1301        | 0.7778 | 3200 | 0.5785          |
| 1.1356        | 0.8264 | 3400 | 0.5773          |
| 1.1167        | 0.8751 | 3600 | 0.5761          |
| 1.1620        | 0.9237 | 3800 | 0.5755          |
| 1.1024        | 0.9723 | 4000 | 0.5754          |
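
Validation loss decreased monotonically from 0.7383 at step 200 to 0.5754 at step 4000; the final table entry matches the evaluation loss reported at the top of this card.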

Framework versions

  • PEFT 0.18.1
  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2