How to use beomi/Mistral-Ko-Inst-dev with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="beomi/Mistral-Ko-Inst-dev")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("beomi/Mistral-Ko-Inst-dev")
model = AutoModelForCausalLM.from_pretrained("beomi/Mistral-Ko-Inst-dev")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use beomi/Mistral-Ko-Inst-dev with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "beomi/Mistral-Ko-Inst-dev"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "beomi/Mistral-Ko-Inst-dev",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Use Docker:
docker model run hf.co/beomi/Mistral-Ko-Inst-dev
How to use beomi/Mistral-Ko-Inst-dev with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "beomi/Mistral-Ko-Inst-dev" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "beomi/Mistral-Ko-Inst-dev",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "beomi/Mistral-Ko-Inst-dev" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "beomi/Mistral-Ko-Inst-dev",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use beomi/Mistral-Ko-Inst-dev with Docker Model Runner:
docker model run hf.co/beomi/Mistral-Ko-Inst-dev
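The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes a vLLM or SGLang server is already running locally, and the helper names `build_chat_request` and `ask` are illustrative, not part of any library.

```python
import json
import urllib.request

MODEL_ID = "beomi/Mistral-Ko-Inst-dev"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the vLLM server from above, `ask("What is the capital of France?")` would send the same request as the curl example; for SGLang, pass `base_url="http://localhost:30000"`.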
Experimental Repository :)
Contents will be updated without notice. If you plan to use this repository, pin a specific revision by its git commit hash.
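One way to pin a revision is the `revision` argument of `from_pretrained`. The sketch below is an illustration, not the author's own setup: `pinned_kwargs` and `load_pinned` are hypothetical helpers, and you must supply a real commit hash from this repository's history. It insists on a full 40-character hash so that a branch name such as `main` is never passed by accident.

```python
import re

MODEL_ID = "beomi/Mistral-Ko-Inst-dev"


def pinned_kwargs(commit_hash):
    """Return from_pretrained keyword arguments that pin one exact commit."""
    if not re.fullmatch(r"[0-9a-f]{40}", commit_hash):
        raise ValueError("expected a full 40-character git commit hash")
    return {"revision": commit_hash}


def load_pinned(commit_hash):
    """Load tokenizer and model at the given commit (downloads the weights)."""
    # Imported lazily so pinned_kwargs can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    kwargs = pinned_kwargs(commit_hash)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, **kwargs)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto", **kwargs
    )
    return tokenizer, model
```

Calling `load_pinned("<commit hash>")` with a hash copied from the repository's commit list keeps your results stable even if the repository contents change later.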
This experiment aims to:
Here is a quick test:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained(
    'beomi/Mistral-Ko-Inst-dev',
    torch_dtype='auto',
    device_map='auto',
)
tokenizer = AutoTokenizer.from_pretrained('beomi/Mistral-Ko-Inst-dev')
pipe = pipeline(
    'text-generation',
    model=model,
    tokenizer=tokenizer,
    do_sample=True,
    max_new_tokens=350,
    return_full_text=False,
    no_repeat_ngram_size=6,
    eos_token_id=1,  # the model is not yet tuned to generate </s>, so stop on <s> (id 1) instead
)

def gen(x):
    # Render the conversation as a prompt string using the model's chat template.
    chat = tokenizer.apply_chat_template([
        {"role": "user", "content": x},
        # {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
        # {"role": "user", "content": "Do you have mayonnaise recipes? please say in Korean."}
    ], tokenize=False)
    print(pipe(chat)[0]['generated_text'].strip())
# Prompt: "What is the difference between Starbucks and Starbucks Korea?"
gen("스타벅스와 스타벅스 코리아의 차이는?")
# (Sample generation, translated from Korean)
# Starbucks is a coffee company that operates worldwide. In Korea, it operates under the name Starbucks Korea.
# Since entering the South Korean market, Starbucks Korea has become a new brand through two rounds of brand review and new designs in 2009 and 2010. It maintains a premium coffee-specialist image, and Starbucks Korea is building a premium specialty coffee brand that represents Korea.
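The pipeline above sets `no_repeat_ngram_size=6`, which keeps any 6-token sequence from appearing twice in the generated text. As a rough illustration of the idea (a simplified sketch on plain token-id lists, not the Transformers implementation; `banned_next_tokens` is a hypothetical helper), the set of tokens that may not come next can be computed like this:

```python
def banned_next_tokens(tokens, n):
    """Return the token ids that would complete an n-gram already present in tokens."""
    if n <= 0 or len(tokens) + 1 < n:
        return set()
    # The last n-1 tokens form the prefix of the n-gram being completed.
    prefix = tuple(tokens[len(tokens) - (n - 1):])
    banned = set()
    for i in range(len(tokens) - n + 1):
        # If an earlier n-gram starts with the same prefix, its last token is banned.
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


history = [1, 2, 3, 1, 2]
print(banned_next_tokens(history, 3))  # {3}: emitting 3 would repeat the trigram (1, 2, 3)
```

During generation, a constraint of this kind is applied at every step by assigning the banned tokens zero probability before sampling.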