Model Card for Richard1999/Virgo-7B
Quick Start
from vllm import LLM, SamplingParams
from PIL import Image

model_name = "Richard1999/Virgo-7B"
placeholder = "<|image_pad|>"  # image placeholder token inserted into the prompt

# Load the model. tensor_parallel_size is the number of GPUs to shard across;
# set it to match your hardware (vLLM requires it to evenly divide the
# model's attention-head count).
llm = LLM(
    model=model_name,
    trust_remote_code=True,
    tensor_parallel_size=8,
)

question = "Please first think deeply about the question, and then put the final answer in \\boxed{}.\nIn the diagram, $\\angle E A D=90^{\\circ}, \\angle A C D=90^{\\circ}$, and $\\angle A B C=90^{\\circ}$. Also, $E D=13, E A=12$, $D C=4$, and $C B=2$. Determine the length of $A B$."

# Build the chat-format prompt by hand, placing the image placeholder
# between the vision start/end markers ahead of the question text.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    f"<|im_start|>user\n<|vision_start|>{placeholder}<|vision_end|>"
    f"{question}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# Greedy decoding with a large token budget for long reasoning traces.
stop_token_ids = None
sampling_params = SamplingParams(
    temperature=0.0,
    top_k=1,
    top_p=1.0,
    stop_token_ids=stop_token_ids,
    repetition_penalty=1.05,
    max_tokens=8192,
)

# Attach the image that accompanies the question.
image = Image.open("case/2246_image_1.jpg")
inputs = {
    "prompt": prompt,
    "multi_modal_data": {"image": image},
}

outputs = llm.generate(inputs, sampling_params)
print(outputs[0].outputs[0].text)
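
The Quick Start prompt asks the model to put its final answer in \boxed{}, so the printed text usually needs one more step before it can be compared or scored. Below is a minimal sketch of that step. `extract_boxed` is a hypothetical helper, not part of Virgo or vLLM, and the sample string is made up for illustration rather than real model output.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)  # use the last occurrence as the final answer
    if start == -1:
        return None

    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: everything collected is the answer.
                return "".join(chars)
        chars.append(ch)

    return None  # braces never balanced, e.g. generation was cut off


# Hypothetical generated text, standing in for outputs[0].outputs[0].text.
sample = "Combining the two results gives \\boxed{\\frac{1}{2}}."
print(extract_boxed(sample))
```

Counting brace depth instead of using a simple regular expression keeps nested LaTeX such as \frac{..}{..} intact, and returning None for unbalanced braces makes truncated generations easy to detect.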