HuggingFaceTB/smollm-corpus
Blueberry-Nano is a GPT-1-level, 151M-parameter language model trained from scratch as part of the 5-Dollar-LLM project. It is meant for learning how the LLM training process works. It can't compete with today's LLMs (yet).
Final metrics after 1B tokens:
This model is a base model trained on a mix of educational data. It demonstrates reasonable storytelling and factual knowledge for its size, but may hallucinate and is not yet fine-tuned for instruction following.
This model (151M parameters) reached similar complexity to OpenAI's original GPT-1 (117M) in under 3 hours on a single consumer GPU, showcasing the massive improvement in training efficiency in recent years.
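As a rough sanity check on those figures, the sketch below works out the implied training throughput and data budget. It assumes the 1B-token run and the "under 3 hours" run are the same run, which the card suggests but does not state outright, and it treats 3 hours as an upper bound, so the throughput is a lower bound. The variable names are illustrative only.

```python
# Figures quoted in this model card.
tokens_trained = 1_000_000_000   # "1B tokens"
n_params = 151_000_000           # "151M parameters"
max_hours = 3                    # "under 3 hours", used as an upper bound

# Assumption: the 1B tokens were all processed within that 3-hour window.
min_tokens_per_second = tokens_trained / (max_hours * 3600)

# Training tokens seen per model parameter.
tokens_per_param = tokens_trained / n_params

print(f"throughput: at least {min_tokens_per_second:,.0f} tokens/s")
print(f"data budget: {tokens_per_param:.1f} tokens per parameter")
```

Under these assumptions the run sustains roughly 90k tokens per second or more and sees about 6.6 tokens per parameter.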
Created by the Open Superintelligence Lab.