6ded5be0b5bd3b983c2c5cb6e6ee578e
This model is a fine-tuned version of distilbert/distilbert-base-german-cased on the contemmcm/cls_20newsgroups dataset. It achieves the following results on the evaluation set:
- Loss: 0.6007
- Data Size: 1.0
- Epoch Runtime: 16.3879
- Accuracy: 0.8692
- F1 Macro: 0.8660
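Accuracy is the fraction of correct predictions, while macro F1 averages per-class F1 scores without weighting by class frequency, so rare newsgroups count as much as common ones. A minimal sketch of both metrics on toy labels (the helper name `f1_macro` is illustrative, not taken from the training code):

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On these toy labels, accuracy is 0.8 while macro F1 is pulled down by the imperfect classes 0 and 1.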
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
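The per-device and total batch sizes above are tied together by the device count; a quick sanity check of that arithmetic, with the values copied from the list above:

```python
# Hyperparameters from the list above
train_batch_size = 8          # per device
num_devices = 4               # multi-GPU
total_train_batch_size = train_batch_size * num_devices
print(total_train_batch_size)  # 32, matching total_train_batch_size above
```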
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro |
|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 3.0066 | 0 | 1.6857 | 0.0507 | 0.0052 |
| No log | 1 | 499 | 3.0064 | 0.0078 | 2.1331 | 0.0527 | 0.0091 |
| 0.0301 | 2 | 998 | 3.0007 | 0.0156 | 1.9626 | 0.0718 | 0.0260 |
| 0.0542 | 3 | 1497 | 2.9875 | 0.0312 | 2.2697 | 0.0680 | 0.0251 |
| 0.1019 | 4 | 1996 | 2.6253 | 0.0625 | 2.7208 | 0.1895 | 0.1237 |
| 2.2206 | 5 | 2495 | 1.7800 | 0.125 | 3.6829 | 0.4390 | 0.3664 |
| 1.363 | 6 | 2994 | 1.1989 | 0.25 | 5.5516 | 0.6182 | 0.5902 |
| 0.9345 | 7 | 3493 | 0.7742 | 0.5 | 9.1053 | 0.7510 | 0.7496 |
| 0.5742 | 8 | 3992 | 0.5342 | 1.0 | 16.1643 | 0.8311 | 0.8299 |
| 0.3926 | 9 | 4491 | 0.5097 | 1.0 | 16.2041 | 0.8427 | 0.8412 |
| 0.2894 | 10 | 4990 | 0.5172 | 1.0 | 16.5160 | 0.8523 | 0.8521 |
| 0.1945 | 11 | 5489 | 0.4764 | 1.0 | 16.2864 | 0.8614 | 0.8595 |
| 0.1595 | 12 | 5988 | 0.5472 | 1.0 | 16.2194 | 0.8634 | 0.8633 |
| 0.1502 | 13 | 6487 | 0.5436 | 1.0 | 15.9684 | 0.8679 | 0.8683 |
| 0.1328 | 14 | 6986 | 0.5681 | 1.0 | 16.3111 | 0.8662 | 0.8647 |
| 0.133 | 15 | 7485 | 0.6007 | 1.0 | 16.3879 | 0.8692 | 0.8660 |
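Validation loss bottoms out at epoch 11 (0.4764) and drifts upward afterward even as accuracy creeps up, so selecting the checkpoint with the lowest validation loss would stop well before the final epoch. A sketch of that selection over the full-data-size rows of the table above:

```python
# Validation loss per epoch once the full dataset is in use (Data Size 1.0),
# copied from the table above
val_loss = {
    8: 0.5342, 9: 0.5097, 10: 0.5172, 11: 0.4764,
    12: 0.5472, 13: 0.5436, 14: 0.5681, 15: 0.6007,
}
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # 11 0.4764
```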
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.3.0
- Tokenizers 0.22.1