# Hierarchical BERT

A collection of BERT models with hierarchical attention, pre-trained on conversational data to process multiple utterances at once.
How to use igorktech/hier-bert-i3-mlm with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="igorktech/hier-bert-i3-mlm", trust_remote_code=True)
```
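As a quick sanity check, the pipeline can be called on a masked utterance. Note that the input format below is an assumption: the standard BERT `[MASK]` token and `[SEP]`-separated utterances may or may not match what this checkpoint's custom tokenizer expects.

```python
# Hypothetical input: two utterances separated by [SEP], one masked token.
# Both the [MASK] and [SEP] conventions are assumptions about this checkpoint.
for prediction in pipe("Hello, how are you? [SEP] I am [MASK], thanks."):
    print(prediction["token_str"], round(prediction["score"], 3))
```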
```python
# Load the model directly
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("igorktech/hier-bert-i3-mlm", trust_remote_code=True, dtype="auto")
```
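For lower-level control, the model can be queried manually. A minimal sketch, assuming the repository ships a tokenizer loadable via `AutoTokenizer` and follows the usual `[MASK]` convention:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumption: the repo provides a tokenizer compatible with AutoTokenizer.
tokenizer = AutoTokenizer.from_pretrained("igorktech/hier-bert-i3-mlm", trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained("igorktech/hier-bert-i3-mlm", trust_remote_code=True)
model.eval()

inputs = tokenizer("The weather is [MASK] today.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and print the top-5 candidate tokens.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top5 = logits[0, mask_positions[0]].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top5.tolist()))
```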
This model is a fine-tuned version of HierBert on the English portion of the OpenSubtitles dataset. It achieves the evaluation results reported in the training results table below.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 2.9488 | 1.55 | 25000 | 2.7667 | 0.4935 |
| 2.4233 | 3.1 | 50000 | 2.2922 | 0.5612 |
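The Accuracy column is presumably top-1 accuracy over masked positions, the usual metric for masked-LM evaluation runs; a minimal sketch of how such a figure can be computed from model logits (all names here are illustrative, not the training script's actual code):

```python
import torch

def masked_lm_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Top-1 accuracy over masked positions.

    `labels` follows the Transformers convention: -100 marks positions
    that were not masked and should be ignored.
    """
    predictions = logits.argmax(dim=-1)
    masked = labels != -100
    correct = (predictions[masked] == labels[masked]).sum().item()
    return correct / max(masked.sum().item(), 1)
```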