Convert to HF format
This PR converts the model to HF format while keeping full compatibility with trust_remote_code=True (the uploaded .py files are copies of those in Transformers, with some additions to ensure backward compatibility down to at least v4.49.0, which was the state of the library when this model was released).
It is self-contained and adds everything that is needed in one place, including the model, config and processor, thus superseding all previous PRs from us!
Hi @cyrilvallez , thank you for your work! May I know if you still plan to merge this PR? Has this branch been verified as working?
Hey! Merging is not up to me, but to the repo owners 🤗 You can still use this branch by passing revision="refs/pr/70" to every from_pretrained call in Transformers, which works without needing trust_remote_code=True.
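For anyone who wants to try this, here is a minimal sketch of loading from the PR branch. The helper names pr_revision and load_from_pr are made up for illustration; only the revision="refs/pr/70" argument comes from the comment above. It assumes a recent Transformers release that ships native support for this model, and it downloads the full checkpoint when called.

```python
MODEL_ID = "microsoft/Phi-4-multimodal-instruct"


def pr_revision(pr_number: int) -> str:
    """Build the Hub git ref for a pull request (hypothetical helper)."""
    return f"refs/pr/{pr_number}"


def load_from_pr(pr_number: int = 70):
    """Load the processor and model from a Hub PR branch (hypothetical helper)."""
    # Imported lazily so pr_revision() can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoProcessor

    revision = pr_revision(pr_number)
    # Assumption: no trust_remote_code=True is needed on this branch, as stated
    # in the comment above. A later reply reports still having to pass it.
    processor = AutoProcessor.from_pretrained(MODEL_ID, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, revision=revision)
    return model, processor
```

Calling model, processor = load_from_pr() should then fetch everything from refs/pr/70 rather than from main.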
Hi @nguyenbh , could you help merge this PR? @cyrilvallez and I are from the Hugging Face Transformers team.
Hi @cyrilvallez , I used this branch by passing revision="refs/pr/70" to all from_pretrained calls in Transformers (I still needed to add trust_remote_code=True for some reason), and it fails with: ImportError: cannot import name 'CommonKwargs' from 'transformers.processing_utils' (/root/upstream/transformers/src/transformers/processing_utils.py). CommonKwargs no longer exists in the latest Transformers, and removing it from the code in this PR fixes the error. Could you update the code here?
I checked the full refs/pr/70 remote processor code locally. CommonKwargs appears safe to remove because it only contributes return_tensors, and return_tensors is already handled through the normal forwarded processor kwargs path in apply_chat_template() / __call__(). I don’t see a runtime path here that depends specifically on CommonKwargs.
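If the remote code needs to keep importing on both old and new Transformers versions, one alternative to deleting the name outright is a guarded import. This is only a sketch, not what the PR does: the fallback class is an assumption based on the observation above that CommonKwargs only contributes return_tensors.

```python
from typing import Optional, TypedDict

try:
    # Present in older Transformers releases (e.g. around v4.49.0).
    from transformers.processing_utils import CommonKwargs
except ImportError:
    # Assumed stand-in for newer releases where the name is gone
    # (also taken if transformers is not installed at all).
    class CommonKwargs(TypedDict, total=False):
        return_tensors: Optional[str]
```

Simply removing CommonKwargs, as suggested above, is the smaller change if nothing else in the processor code refers to it.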