Instructions to use Deci/DeciLM-6b-instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Deci/DeciLM-6b-instruct with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Deci/DeciLM-6b-instruct", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Deci/DeciLM-6b-instruct", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Deci/DeciLM-6b-instruct with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Deci/DeciLM-6b-instruct"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Deci/DeciLM-6b-instruct",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/Deci/DeciLM-6b-instruct
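For callers who prefer Python over curl, the request above can be sketched with the standard library alone. This is a minimal sketch, not an official client: the helper name build_completion_request is hypothetical, and the base URL assumes the server is listening on localhost:8000 as in the curl call.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint.

    Hypothetical helper for illustration; it mirrors the curl call above.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_completion_request(
    "http://localhost:8000",  # assumed address, taken from the curl example
    "Deci/DeciLM-6b-instruct",
    "Once upon a time,",
)

# Sending needs a running server, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read())["choices"][0]["text"])
```

Building the request separately from sending it keeps the payload easy to inspect before a server is available.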
- SGLang
How to use Deci/DeciLM-6b-instruct with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Deci/DeciLM-6b-instruct" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Deci/DeciLM-6b-instruct",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Deci/DeciLM-6b-instruct" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Deci/DeciLM-6b-instruct",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use Deci/DeciLM-6b-instruct with Docker Model Runner:
docker model run hf.co/Deci/DeciLM-6b-instruct
TypeError: DeciLMAttention.forward() got an unexpected keyword argument 'padding_mask'
python server.py --api --listen --trust-remote-code --disk-cache-dir /data/tmp --use_double_quant --quant_type nf4 --load-in-4bit --settings settings-template.yaml --model models/DeciLM-6b-instruct/
To create a public link, set share=True in launch().
Traceback (most recent call last):
File "/home/user/text-generation-webui/modules/callbacks.py", line 56, in gentask
ret = self.mfunc(callback=_callback, *args, **self.kwargs)
File "/home/user/text-generation-webui/modules/text_generation.py", line 347, in generate_with_callback
shared.model.generate(**kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/generation/utils.py", line 1652, in generate
return self.sample(
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/generation/utils.py", line 2734, in sample
outputs = self(
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1034, in forward
outputs = self.model(
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 921, in forward
layer_outputs = decoder_layer(
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 631, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
TypeError: DeciLMAttention.forward() got an unexpected keyword argument 'padding_mask'
Output generated in 0.28 seconds (0.00 tokens/s, 0 tokens, context 207, seed 880665434)
This is due to a change in the latest version of transformers; we will push a fix shortly. In the meantime, you can run 'pip install transformers==4.31.0' if you want, which does not have this issue.
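The failure mode in the traceback can be illustrated with a small, self-contained sketch: a caller starts passing a new keyword (padding_mask) to a forward method whose signature was written before that keyword existed. The classes below are toy stand-ins, not the actual DeciLM or transformers code, and accepting **kwargs is only one possible way to tolerate the extra argument; the real fix may differ.

```python
# Toy stand-ins only: these are NOT the real DeciLM or transformers classes.

class StrictAttention:
    """Mimics a custom attention module written against an older call signature."""

    def forward(self, hidden_states, attention_mask=None):
        return hidden_states


class TolerantAttention:
    """Same module, but extra keyword arguments from newer callers are ignored."""

    def forward(self, hidden_states, attention_mask=None, **kwargs):
        return hidden_states


def decoder_layer_call(attn, hidden_states):
    """Mimics a newer caller that forwards an extra `padding_mask` keyword."""
    return attn.forward(hidden_states, attention_mask=None, padding_mask=None)


def try_call(attn):
    """Return 'ok', or the TypeError message if the call is rejected."""
    try:
        decoder_layer_call(attn, [0.0, 1.0])
        return "ok"
    except TypeError as exc:
        return str(exc)


strict_result = try_call(StrictAttention())      # rejected: unknown keyword
tolerant_result = try_call(TolerantAttention())  # accepted: keyword swallowed
print(strict_result)
print(tolerant_result)
```

Pinning transformers to a release that does not pass the new keyword avoids the mismatch from the caller's side, which is what the suggested workaround does.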
2023-09-30 14:59:26 WARNING:trust_remote_code is enabled. This is dangerous.
2023-09-30 14:59:26 WARNING:
You are potentially exposing the web UI to the entire internet without any access password.
You can create one with the "--gradio-auth" flag like this:
--gradio-auth username:password
Make sure to replace username:password with your own.
Traceback (most recent call last):
File "/home/user/text-generation-webui/server.py", line 30, in <module>
from modules import (
File "/home/user/text-generation-webui/modules/chat.py", line 18, in <module>
from modules.text_generation import (
File "/home/user/text-generation-webui/modules/text_generation.py", line 23, in <module>
from modules.models import clear_torch_cache, local_rank
File "/home/user/text-generation-webui/modules/models.py", line 11, in <module>
from transformers import (
ImportError: cannot import name 'GPTQConfig' from 'transformers' (/home/user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/__init__.py)
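The ImportError above suggests that the pinned transformers release does not export GPTQConfig, which this checkout of text-generation-webui imports at startup; as far as I know, that class only arrived in a later transformers release. A quick way to check what an installed package exposes, before launching the server, could look like the sketch below. The helper name exports_name is hypothetical.

```python
import importlib


def exports_name(module_name, attr):
    """Return True if `module_name` can be imported and exposes `attr`.

    Hypothetical helper for illustration. RuntimeError is caught as well
    because lazily loaded packages may raise it when a submodule fails to import.
    """
    try:
        module = importlib.import_module(module_name)
        getattr(module, attr)
    except (ImportError, AttributeError, RuntimeError):
        return False
    return True


# 'transformers' and 'GPTQConfig' are the names from the ImportError above.
if exports_name("transformers", "GPTQConfig"):
    print("GPTQConfig is available in the installed transformers")
else:
    print("GPTQConfig is missing; the installed transformers is too old or absent")
```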
Hey Joshua,
We've pushed a fix to support the latest transformers dev version.
My bad, I thought I had commented. After updating to the latest transformers, the issue went away.