Whisper child-adult training data ratios for child ASR
Models that have each been trained on 30 hours of speech, using different ratios of child to adult speech.
This model is a fine-tuned version of openai/whisper-large-v2 on the JASMIN-CGN dataset. Its validation loss and word error rate on the evaluation set are reported per evaluation step in the table below.
The hyperparameters used during training are not listed. Training results, evaluated every 25 steps:
| Training Loss | Epoch | Step | Validation Loss | WER (%) |
|---|---|---|---|---|
| 1.0406 | 0.1078 | 25 | 1.2208 | 38.0682 |
| 1.0219 | 0.2155 | 50 | 1.1918 | 37.6556 |
| 0.9715 | 0.3233 | 75 | 1.1313 | 36.7397 |
| 0.9105 | 0.4310 | 100 | 1.0425 | 35.4715 |
| 0.8408 | 0.5388 | 125 | 0.9366 | 34.8576 |
| 0.7305 | 0.6466 | 150 | 0.8256 | 32.5058 |
| 0.6843 | 0.7543 | 175 | 0.7052 | 31.1974 |
| 0.5954 | 0.8621 | 200 | 0.6055 | 28.8187 |
| 0.5398 | 0.9698 | 225 | 0.5391 | 25.5074 |
| 0.5099 | 1.0776 | 250 | 0.4902 | 23.0952 |
| 0.4845 | 1.1853 | 275 | 0.4555 | 22.3236 |
| 0.4858 | 1.2931 | 300 | 0.4344 | 21.1897 |
| 0.4741 | 1.4009 | 325 | 0.4224 | 20.9213 |
| 0.4589 | 1.5086 | 350 | 0.4143 | 20.1161 |
| 0.4294 | 1.6164 | 375 | 0.4081 | 20.6428 |
| 0.426 | 1.7241 | 400 | 0.4027 | 21.2467 |
| 0.406 | 1.8319 | 425 | 0.3984 | 20.2469 |
| 0.4443 | 1.9397 | 450 | 0.3948 | 19.9416 |
| 0.4351 | 2.0474 | 475 | 0.3920 | 20.7804 |
| 0.4394 | 2.1552 | 500 | 0.3897 | 21.1393 |
| 0.4167 | 2.2629 | 525 | 0.3874 | 19.6196 |
| 0.3827 | 2.3707 | 550 | 0.3855 | 19.3981 |
| 0.4164 | 2.4784 | 575 | 0.3842 | 19.1767 |
| 0.4046 | 2.5862 | 600 | 0.3830 | 18.9855 |
| 0.4196 | 2.6940 | 625 | 0.3821 | 18.9184 |
| 0.4008 | 2.8017 | 650 | 0.3814 | 19.0861 |
| 0.3902 | 2.9095 | 675 | 0.3811 | 19.0794 |
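The table can be read programmatically to pick a checkpoint. The following is a minimal sketch, not part of the original training code: it copies a handful of rows from the table by hand and reports which step has the lowest WER, which has the lowest validation loss, the relative WER reduction between the first and last evaluation, and a rough estimate of optimizer steps per epoch. The names `rows`, `best_by`, and the column indices are illustrative assumptions.

```python
# Selected rows copied by hand from the training results table:
# (step, epoch, validation_loss, wer)
rows = [
    (25, 0.1078, 1.2208, 38.0682),
    (225, 0.9698, 0.5391, 25.5074),
    (450, 1.9397, 0.3948, 19.9416),
    (600, 2.5862, 0.3830, 18.9855),
    (625, 2.6940, 0.3821, 18.9184),
    (650, 2.8017, 0.3814, 19.0861),
    (675, 2.9095, 0.3811, 19.0794),
]

STEP, EPOCH, VAL_LOSS, WER = range(4)  # hypothetical column indices


def best_by(table, column):
    """Return the row with the smallest value in the given column."""
    return min(table, key=lambda row: row[column])


best_wer = best_by(rows, WER)
best_loss = best_by(rows, VAL_LOSS)

first, last = rows[0], rows[-1]
# Relative WER reduction from the first to the last evaluation, in percent.
rel_reduction = (first[WER] - last[WER]) / first[WER] * 100

# Rough number of optimizer steps per epoch, from the first logged row.
steps_per_epoch = first[STEP] / first[EPOCH]

print(f"lowest WER: {best_wer[WER]} at step {best_wer[STEP]}")
print(f"lowest validation loss: {best_loss[VAL_LOSS]} at step {best_loss[STEP]}")
print(f"relative WER reduction: {rel_reduction:.1f}%")
print(f"approx. steps per epoch: {steps_per_epoch:.0f}")
```

Note that the lowest WER (step 625) and the lowest validation loss (step 675) fall on different checkpoints, so the choice of selection metric matters when deciding which checkpoint to keep.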