Instructions to use Fu01978/gpt2-mega-wiki-logic with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Fu01978/gpt2-mega-wiki-logic with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Fu01978/gpt2-mega-wiki-logic")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Fu01978/gpt2-mega-wiki-logic")
model = AutoModelForCausalLM.from_pretrained("Fu01978/gpt2-mega-wiki-logic")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Fu01978/gpt2-mega-wiki-logic with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Fu01978/gpt2-mega-wiki-logic"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Fu01978/gpt2-mega-wiki-logic",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/Fu01978/gpt2-mega-wiki-logic
- SGLang
How to use Fu01978/gpt2-mega-wiki-logic with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Fu01978/gpt2-mega-wiki-logic" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Fu01978/gpt2-mega-wiki-logic",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Fu01978/gpt2-mega-wiki-logic" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Fu01978/gpt2-mega-wiki-logic",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use Fu01978/gpt2-mega-wiki-logic with Docker Model Runner:
docker model run hf.co/Fu01978/gpt2-mega-wiki-logic
Model Card for Fu01978/gpt2-mega-wiki-logic
This model is a fine-tuned version of GPT-2 (Base) trained on a diverse "mini-pile" of 14 datasets ranging from French history and Peruvian law to programming concepts and esoteric texts. It is designed to be a versatile text-completer that can shift styles based on the input prompt.
Model Details
Model Description
This model was created to explore the limits of "knowledge density" in small language models. The training mix combines fact-dense data (Wikipedia, news) with specialized technical data (programming, law) and historical texts, so the model acts as a "jack-of-all-trades" completion engine.
- Developed by: Fu01978
- Model type: Causal Language Model (Transformer Decoder)
- Language(s) (NLP): English (Primary), Spanish (Programming/Law)
- License: MIT
- Finetuned from model: openai-community/gpt2
Uses
Direct Use
The model is best used for constrained text completion. It excels when given a clear context or "trigger phrase" to help it navigate its diverse training data.
Out-of-Scope Use
- Fact-Checking: This model should not be used as a primary source of historical or legal facts.
- High-Stakes Advice: Do not use for legal or medical decision-making.
Bias, Risks, and Limitations
- Temporal Hallucinations: Because the model was trained on 14 datasets that span very different time periods, it frequently mixes historical facts (e.g., placing the 1944 Battle of the Bulge in the 1898 Spanish-American War).
- Small Context Window: As a GPT-2 base model, it has a context window of 1,024 tokens (prompt plus generated text) and may lose coherence in long-form generation.
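The context limit can be handled before generation by trimming the prompt. The sketch below assumes GPT-2's standard 1,024-token window and works on a plain list of token ids; `truncate_prompt` is a hypothetical helper written for illustration, not a transformers function.

```python
# GPT-2's position embeddings cover 1,024 tokens (n_positions in its config).
GPT2_CONTEXT_WINDOW = 1024


def truncate_prompt(token_ids, max_new_tokens, context_window=GPT2_CONTEXT_WINDOW):
    """Keep only the most recent tokens so prompt + generation fits the window.

    Hypothetical helper for illustration; not part of transformers.
    """
    budget = context_window - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context window")
    # Drop the oldest tokens; the tail of the prompt is what gets completed.
    return token_ids[-budget:]


prompt_ids = list(range(1500))  # stand-in for real token ids
kept = truncate_prompt(prompt_ids, max_new_tokens=50)
print(len(kept))  # 974
```

Truncating from the left is one reasonable choice for a completion model; a summary of the dropped text is another option if the early context matters.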
Recommendations
Users should use beam search (num_beams $\ge 3$) and a high repetition penalty (repetition_penalty $\ge 1.5$) to reduce the chance that the model falls into repetition loops or blends content from unrelated datasets.
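To make the repetition penalty less of a black box, here is a minimal pure-Python sketch of the rule as I understand it from the CTRL-style penalty that transformers applies: scores of tokens that were already generated are divided by the penalty when positive and multiplied by it when negative. The function name and the toy logits are assumptions for illustration only.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.5):
    """Return a copy of `logits` with already-generated tokens made less likely."""
    penalized = list(logits)
    for token_id in set(generated_ids):
        score = penalized[token_id]
        # Positive scores shrink toward zero; negative scores move further down.
        penalized[token_id] = score / penalty if score > 0 else score * penalty
    return penalized


logits = [3.0, -1.0, 0.5, 2.0]  # toy scores over a 4-token vocabulary
print(apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=1.5))
# [2.0, -1.5, 0.5, 2.0]
```

With a penalty of 1.5, token 0 drops from the top score to a tie with token 3, which is why higher values push the model away from loops.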
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import pipeline
generator = pipeline("text-generation", model="Fu01978/gpt2-mega-wiki-logic")
prompt = "In the Python programming language, a decorator is"
print(generator(prompt, max_new_tokens=50, repetition_penalty=1.5, num_beams=5)[0]['generated_text'])
Training Details
Training Data
The model was trained on a combined pool of 123,233 rows.
Training Metrics
| Step | Training Loss |
|---|---|
| 100 | 3.0273 |
| 200 | 2.8873 |
| 300 | 2.7780 |
| 400 | 2.8189 |
| 500 | 2.8553 |
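One way to read these numbers is as perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language modeling but is not stated in the card.

```python
import math

# Training loss per step, copied from the table above.
training_loss = {100: 3.0273, 200: 2.8873, 300: 2.7780, 400: 2.8189, 500: 2.8553}


def perplexity(loss):
    """Perplexity is exp(loss) when the loss is cross-entropy in nats."""
    return math.exp(loss)


for step, loss in training_loss.items():
    print(f"step {step}: loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")

best_step = min(training_loss, key=training_loss.get)
print(best_step)  # 300
```

The lowest logged loss is at step 300. Training loss on individual logging steps is noisy, so the rise at steps 400 and 500 does not by itself show that the model got worse.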
Training Procedure
Training Hyperparameters
- Steps: 500
- Batch Size: 4 (with 4 Gradient Accumulation steps)
- Learning Rate: 4e-5
- Precision: fp16 (Mixed Precision)
- Optimizer: AdamW with Weight Decay (0.01)
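These settings imply how much of the data the model actually saw. The arithmetic below is a sketch under three assumptions not stated in the card: training ran on a single device, "Steps" counts optimizer updates, and each dataset row is one training sequence.

```python
per_device_batch_size = 4
gradient_accumulation_steps = 4
optimizer_steps = 500
dataset_rows = 123_233
num_devices = 1  # assumption: the card does not state the device count

# Gradients from 4 micro-batches of 4 are accumulated before each update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
sequences_seen = effective_batch_size * optimizer_steps
fraction_of_epoch = sequences_seen / dataset_rows

print(effective_batch_size)        # 16
print(sequences_seen)              # 8000
print(f"{fraction_of_epoch:.1%}")  # 6.5%
```

Under those assumptions the run covers well under one pass over the combined pool.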
Technical Specifications
Model Architecture and Objective
- Architecture: Standard GPT-2 Transformer Decoder.
- Objective: Causal Language Modeling (Next-token prediction).
- Parameters: 124 Million.
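The 124 million figure can be reproduced from the standard GPT-2 base dimensions (vocabulary 50,257, 1,024 positions, hidden size 768, 12 layers). The function below is an illustrative count, assuming this fine-tune keeps the base configuration unchanged and that the output layer shares weights with the token embedding.

```python
def gpt2_parameter_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12):
    """Count weights and biases in a GPT-2 style decoder with tied output embeddings."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each LayerNorm has a scale and a bias vector.
    layer_norm = 2 * n_embd
    # Fused QKV projection, then the attention output projection.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Feed-forward: expand to 4x the hidden size, then project back.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    block = 2 * layer_norm + attention + mlp
    # Final LayerNorm after the last block; no separate output matrix (tied).
    return embeddings + n_layer * block + layer_norm


print(gpt2_parameter_count())  # 124439808
```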