
Libraries

How to use Fu01978/gpt2-mega-wiki-logic with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Fu01978/gpt2-mega-wiki-logic")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Fu01978/gpt2-mega-wiki-logic")
model = AutoModelForCausalLM.from_pretrained("Fu01978/gpt2-mega-wiki-logic")

Local Apps

vLLM

How to use Fu01978/gpt2-mega-wiki-logic with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Fu01978/gpt2-mega-wiki-logic"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Fu01978/gpt2-mega-wiki-logic",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
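
The same request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style completions response. The helper names `build_completion_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "Fu01978/gpt2-mega-wiki-logic"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build the POST request that mirrors the curl call (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text.

    Assumes an OpenAI-style response body with a `choices` list.
    Only works while the server is running.
    """
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example (requires the running server):
# print(complete("Once upon a time,", max_tokens=64))
```

Passing `base_url="http://localhost:30000"` should target an SGLang server launched on port 30000 in the same way, since both expose an OpenAI-compatible endpoint.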

Use Docker

docker model run hf.co/Fu01978/gpt2-mega-wiki-logic

SGLang

How to use Fu01978/gpt2-mega-wiki-logic with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Fu01978/gpt2-mega-wiki-logic" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Fu01978/gpt2-mega-wiki-logic",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Fu01978/gpt2-mega-wiki-logic" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Fu01978/gpt2-mega-wiki-logic",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Fu01978/gpt2-mega-wiki-logic with Docker Model Runner:
```
docker model run hf.co/Fu01978/gpt2-mega-wiki-logic
```

Model Card for Fu01978/gpt2-mega-wiki-logic

This model is a fine-tuned version of GPT-2 (Base) trained on a diverse "mini-pile" of 14 datasets ranging from French history and Peruvian law to programming concepts and esoteric texts. It is designed to be a versatile text-completer that can shift styles based on the input prompt.

Model Details

Model Description

This model was created to explore the limits of "Knowledge Density" in small language models. By mixing fact-dense data (Wikipedia, News) with specialized technical data (Programming, Law) and historical texts, the model acts as a "Jack-of-all-trades" completion engine.

Developed by: Fu01978
Model type: Causal Language Model (Transformer Decoder)
Language(s) (NLP): English (Primary), Spanish (Programming/Law)
License: MIT
Finetuned from model: openai-community/gpt2

Uses

Direct Use

The model is best used for constrained text completion. It excels when given a clear context or "trigger phrase" to help it navigate its diverse training data.

Out-of-Scope Use

Fact-Checking: This model should not be used as a primary source of historical or legal facts.
High-Stakes Advice: Do not use for legal or medical decision-making.

Bias, Risks, and Limitations

Temporal Hallucinations: Because the model was trained on 14 datasets covering very different time periods, it frequently conflates events from different eras (e.g., placing the 1944 Battle of the Bulge in the 1898 Spanish-American War).
Small Context Window: As a GPT-2 base model, it has a context window of 1,024 tokens and may lose coherence in long-form generation.
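
One way to work within the small context window is to trim the prompt so that the prompt plus the requested new tokens fit in GPT-2's 1,024 positions. The sketch below operates on a plain list of token ids (for example, the output of a tokenizer); the helper name `fit_prompt` and the choice to keep the most recent tokens are assumptions for illustration, not something the model card prescribes.

```python
GPT2_CONTEXT_WINDOW = 1024  # number of positions in the GPT-2 base model


def fit_prompt(token_ids, max_new_tokens, context_window=GPT2_CONTEXT_WINDOW):
    """Keep only the most recent tokens so generation stays inside the window.

    Hypothetical helper: drops the oldest tokens first, on the assumption
    that the end of the prompt matters most for a completion model.
    """
    if not 0 < max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be between 1 and context_window - 1")
    budget = context_window - max_new_tokens  # tokens left for the prompt
    return list(token_ids[-budget:])


# A 2,000-token prompt with 50 new tokens requested keeps the last 974 tokens.
trimmed = fit_prompt(list(range(2000)), max_new_tokens=50)
print(len(trimmed))
```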

Recommendations

Users should decode with beam search (num_beams ≥ 3) and a high repetition penalty (repetition_penalty ≥ 1.5) to keep the model from falling into repetitive loops or blending content from unrelated datasets.
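
To make the effect of the repetition penalty concrete, here is a small pure-Python sketch of the rule commonly used for this setting: for tokens that have already appeared, positive logits are divided by the penalty and negative logits are multiplied by it, so both become less likely. This is a simplified illustration on plain lists, not the Transformers implementation itself; the function name is hypothetical.

```python
def apply_repetition_penalty(logits, seen_token_ids, penalty=1.5):
    """Return a copy of `logits` with already-seen tokens made less likely.

    Illustrative sketch: `logits` is a list of floats indexed by token id,
    `seen_token_ids` is any iterable of ids that were already generated.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    penalized = list(logits)
    for token_id in set(seen_token_ids):
        score = penalized[token_id]
        # Dividing a positive score or multiplying a negative one both lower it.
        penalized[token_id] = score * penalty if score < 0 else score / penalty
    return penalized


# Tokens 0 and 1 were already generated; token 2 is untouched.
print(apply_repetition_penalty([2.0, -1.0, 0.5], seen_token_ids=[0, 1], penalty=1.5))
```

A penalty of 1.0 leaves the scores unchanged, which is why values well above 1.0 are suggested for this model.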

How to Get Started with the Model

Use the code below to get started with the model.

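from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
generator = pipeline("text-generation", model="Fu01978/gpt2-mega-wiki-logic")
prompt = "In the Python programming language, a decorator is"

# Beam search with a repetition penalty, per the recommendations above.
outputs = generator(prompt, max_new_tokens=50, repetition_penalty=1.5, num_beams=5)
print(outputs[0]["generated_text"])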

Training Details

Training Data

The model was trained on a combined pool of 123,233 rows.

Training Metrics

Step	Training Loss
100	3.0273
200	2.8873
300	2.7780
400	2.8189
500	2.8553
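
For a rough sense of scale, the logged losses can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the card.

```python
import math

# Training loss at each logged step, copied from the table above.
training_loss = {100: 3.0273, 200: 2.8873, 300: 2.7780, 400: 2.8189, 500: 2.8553}

# Perplexity is exp(loss) when the loss is mean cross-entropy in nats (assumption).
perplexity = {step: math.exp(loss) for step, loss in training_loss.items()}

best_step = min(training_loss, key=training_loss.get)
for step in sorted(perplexity):
    marker = "  <- lowest logged loss" if step == best_step else ""
    print(f"step {step}: loss {training_loss[step]:.4f}, perplexity {perplexity[step]:.2f}{marker}")
```

These are training-set figures logged at single steps, so they are noisy and say nothing about held-out performance.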

Training Procedure

Training Hyperparameters

Steps: 500
Batch Size: 4 (with 4 Gradient Accumulation steps)
Learning Rate: 4e-5
Precision: fp16 (Mixed Precision)
Optimizer: AdamW with Weight Decay (0.01)
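
The hyperparameters above imply how much of the data the run actually covered. The arithmetic below is a sketch under three assumptions that the card does not confirm: training ran on a single device, "Steps" counts optimizer updates, and each row of the 123,233-row pool corresponds to one training sequence.

```python
steps = 500
per_step_batch_size = 4
gradient_accumulation_steps = 4
training_rows = 123_233
num_devices = 1  # assumption: single-GPU training

# Each optimizer update sees batch_size * accumulation_steps * devices sequences.
effective_batch_size = per_step_batch_size * gradient_accumulation_steps * num_devices
sequences_seen = steps * effective_batch_size
epoch_fraction = sequences_seen / training_rows

print(f"effective batch size: {effective_batch_size}")
print(f"sequences seen: {sequences_seen}")
print(f"fraction of one epoch: {epoch_fraction:.1%}")
```

Under these assumptions the run covers only a small fraction of a single epoch.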

Technical Specifications

Model Architecture and Objective

Architecture: Standard GPT-2 Transformer Decoder.
Objective: Causal Language Modeling (Next-token prediction).
Parameters: 124 Million.
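
The 124 million figure can be reproduced from the architecture. The sketch below counts weights and biases using the standard GPT-2 base hyperparameters (vocabulary 50,257, 1,024 positions, hidden size 768, 12 layers); it assumes the fine-tune did not change the architecture and that the output head shares weights with the token embedding, as in the original GPT-2.

```python
VOCAB_SIZE = 50257
N_POSITIONS = 1024
N_EMBD = 768
N_LAYER = 12


def linear(n_in, n_out):
    """Parameters in a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


def layer_norm(n):
    """Parameters in a layer norm: scale plus shift."""
    return 2 * n


def gpt2_parameter_count(vocab=VOCAB_SIZE, positions=N_POSITIONS,
                         n_embd=N_EMBD, n_layer=N_LAYER):
    embeddings = vocab * n_embd + positions * n_embd
    block = (
        layer_norm(n_embd)
        + linear(n_embd, 3 * n_embd)   # fused query/key/value projection
        + linear(n_embd, n_embd)       # attention output projection
        + layer_norm(n_embd)
        + linear(n_embd, 4 * n_embd)   # MLP up-projection
        + linear(4 * n_embd, n_embd)   # MLP down-projection
    )
    # Final layer norm; the tied output head adds no extra parameters.
    return embeddings + n_layer * block + layer_norm(n_embd)


total = gpt2_parameter_count()
print(f"parameters: {total:,}")
print(f"approximate fp32 size: {total * 4 / 1e6:.0f} MB")
```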


Safetensors

Model size: 0.1B params
Tensor type: F32

