Finetuning
#11
by whoami02 - opened
How can I fine-tune this model to read special symbols such as Ø, °, and ± alongside numbers?
I followed the provided notebooks and also added these characters to the tokenizer vocabulary, yet my fine-tuned model always returns an empty string. The baseline model could read the numbers but got those symbols wrong; now it does not even read the numbers. I suspect it's a training issue, but I don't know how to fix it.
Can someone help?
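For reference, the setup described above (extra symbols added to the tokenizer before fine-tuning) can be sketched roughly as follows. This is a minimal sketch, not a confirmed fix: it assumes a transformers 4.x-style API with `TrOCRProcessor` and `VisionEncoderDecoderModel`, and the helper names `mask_padding` and `prepare_trocr` are hypothetical. The decoder embedding resize and the decoder token settings are things worth checking when output comes back empty, but they may not be the cause here.

```python
def mask_padding(label_ids, pad_token_id, ignore_index=-100):
    """Replace padding ids with ignore_index so the loss skips padded positions."""
    return [ignore_index if t == pad_token_id else t for t in label_ids]


def prepare_trocr(symbols=("Ø", "°", "±"), name="microsoft/trocr-base-printed"):
    """Hypothetical helper: load TrOCR, add symbols, align decoder settings."""
    # Imported lazily so mask_padding can be used without transformers installed.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(name)
    model = VisionEncoderDecoderModel.from_pretrained(name)
    tokenizer = processor.tokenizer

    # add_tokens returns the number of tokens that were actually new.
    num_added = tokenizer.add_tokens(list(symbols))
    if num_added > 0:
        # Resize on the wrapped decoder so the new token ids have embeddings.
        model.decoder.resize_token_embeddings(len(tokenizer))

    # Assumption: same special-token mapping as in the usual fine-tuning notebooks.
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id
    model.config.eos_token_id = tokenizer.sep_token_id
    model.config.vocab_size = model.config.decoder.vocab_size
    return processor, model
```

When building training labels, the token ids produced by `processor.tokenizer(text, padding="max_length", max_length=...)` would then go through `mask_padding(ids, processor.tokenizer.pad_token_id)` before being passed to the model as `labels`.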