How to use m-a-p/neo_7b with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="m-a-p/neo_7b")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("m-a-p/neo_7b")
model = AutoModelForCausalLM.from_pretrained("m-a-p/neo_7b")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

How to use m-a-p/neo_7b with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "m-a-p/neo_7b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "m-a-p/neo_7b",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, run the model with Docker Model Runner:
docker model run hf.co/m-a-p/neo_7b
How to use m-a-p/neo_7b with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "m-a-p/neo_7b" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "m-a-p/neo_7b",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "m-a-p/neo_7b" \
--host 0.0.0.0 \
--port 30000

How to use m-a-p/neo_7b with Docker Model Runner:
docker model run hf.co/m-a-p/neo_7b
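
The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names `build_chat_payload` and `chat` are hypothetical, the default base URL assumes the vLLM server from above is running locally on port 8000, and the response is assumed to follow the OpenAI chat-completions layout (`choices[0].message.content`).

```python
import json
import urllib.request


def build_chat_payload(prompt: str, model: str = "m-a-p/neo_7b") -> dict:
    """Build the same JSON body that the curl examples send."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST a single-turn chat request and return the reply text.

    Assumes a vLLM or SGLang server is already running at base_url.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumed OpenAI-compatible layout: the reply sits under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("What is the capital of France?")` returns the reply text; pass `base_url="http://localhost:30000"` to target the SGLang server instead.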
🤗 Neo-Models | 🤗 Neo-Datasets | GitHub
Neo is a fully open-source large language model: the code, all model weights, the training datasets, and the training details are all released.
| Model | Description | Download |
|---|---|---|
| neo_7b | Base model of neo_7b. | 🤗 Hugging Face |
| neo_7b_sft_v0.1 | Supervised fine-tuned (SFT) version of neo_7b. | 🤗 Hugging Face |
| neo_7b_instruct_v0.1 | Instruction-tuned version of neo_7b. | 🤗 Hugging Face |
| neo_7b_intermediate | Intermediate checkpoints from the regular pre-training phase, which covered 3.7T tokens in total. | 🤗 Hugging Face |
| neo_7b_decay | Intermediate checkpoints from the decay phase, which covered 720B tokens in total. | 🤗 Hugging Face |
| neo_scalinglaw_980M | Checkpoints from the scaling-law experiments. | 🤗 Hugging Face |
| neo_scalinglaw_460M | Checkpoints from the scaling-law experiments. | 🤗 Hugging Face |
| neo_scalinglaw_250M | Checkpoints from the scaling-law experiments. | 🤗 Hugging Face |
| neo_2b_general | Checkpoints of the 2B model trained on general-domain data. | 🤗 Hugging Face |
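
The token counts in the table give a rough picture of the pre-training budget. As a small worked example, the snippet below adds the 3.7T tokens reported for `neo_7b_intermediate` to the 720B tokens reported for `neo_7b_decay`. It assumes the two phases are sequential and do not overlap, which the table does not state explicitly; the variable names are illustrative.

```python
# Token counts taken from the table above.
PRETRAIN_PHASE_TOKENS = 3_700_000_000_000  # neo_7b_intermediate: 3.7T tokens
DECAY_PHASE_TOKENS = 720_000_000_000       # neo_7b_decay: 720B tokens

# Assumption: the two phases are sequential and non-overlapping.
total_tokens = PRETRAIN_PHASE_TOKENS + DECAY_PHASE_TOKENS
decay_share = DECAY_PHASE_TOKENS / total_tokens

print(f"total: {total_tokens / 1e12:.2f}T tokens")  # total: 4.42T tokens
print(f"decay share: {decay_share:.1%}")            # decay share: 16.3%
```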
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: a local directory or Hub repo id that contains both model and tokenizer.
model_path = '<your-hf-model-path-with-tokenizer>'

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype='auto'
).eval()

# Tokenize the raw prompt; no chat template is applied here.
input_text = "A long, long time ago,"
inputs = tokenizer(input_text, return_tensors='pt').to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=20)
# The decoded text contains the prompt followed by the continuation.
response = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(response)
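
Because `generate` returns the prompt tokens followed by the new tokens for a causal language model, the decoded `response` above starts with the input text. The sketch below keeps only the continuation. It works on plain Python lists so that it runs without downloading the model; `continuation_ids` is a hypothetical helper, and the token ids are toy values rather than real Neo token ids. With real tensors the equivalent slice is `output_ids[0][prompt_len:]`.

```python
from typing import List, Sequence


def continuation_ids(output_ids: Sequence[int], prompt_len: int) -> List[int]:
    """Return only the newly generated token ids, dropping the prompt prefix."""
    if not 0 <= prompt_len <= len(output_ids):
        raise ValueError("prompt_len must be between 0 and len(output_ids)")
    return list(output_ids[prompt_len:])


# Toy ids standing in for real tokenizer output (not actual Neo token ids).
prompt = [11, 22, 33]
generated = prompt + [44, 55]
print(continuation_ids(generated, len(prompt)))  # [44, 55]
```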
@article{zhang2024mapneo,
title = {MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series},
author = {Ge Zhang and Scott Qu and Jiaheng Liu and Chenchen Zhang and Chenghua Lin and Chou Leuang Yu and Danny Pan and Esther Cheng and Jie Liu and Qunshu Lin and Raven Yuan and Tuney Zheng and Wei Pang and Xinrun Du and Yiming Liang and Yinghao Ma and Yizhi Li and Ziyang Ma and Bill Lin and Emmanouil Benetos and Huan Yang and Junting Zhou and Kaijing Ma and Minghao Liu and Morry Niu and Noah Wang and Quehry Que and Ruibo Liu and Sine Liu and Shawn Guo and Soren Gao and Wangchunshu Zhou and Xinyue Zhang and Yizhi Zhou and Yubo Wang and Yuelin Bai and Yuhan Zhang and Yuxiang Zhang and Zenith Wang and Zhenzhu Yang and Zijian Zhao and Jiajun Zhang and Wanli Ouyang and Wenhao Huang and Wenhu Chen},
year = {2024},
journal = {arXiv preprint arXiv:2405.19327}
}