Chess GPT-4.5M
Overview
Chess GPT-4.5M is a generative language model trained specifically to generate chess moves and analyze chess games. The model is based on the GPT architecture and was trained with a custom 32-token vocabulary reflecting key chess symbols and notations.
Model Details
- Architecture: GPT-based language model (GPT2LMHeadModel)
- Parameters: Approximately 4.5M parameters
- Layers: 8 transformer layers
- Heads: 4 attention heads per layer
- Embedding Dimension: 256
- Training Sequence Length: 1024 tokens per chess game
- Vocabulary: 32 tokens (custom vocabulary)
Training Data
The model was trained on tokenized chess game data prepared from the Lichess dataset. The preparation process involved:
- Tokenizing chess games using a custom 32-token vocabulary.
- Creating binary training files (train.bin and val.bin).
- Saving vocabulary information to meta.pkl.
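The preparation steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual preparation script. The sample games, the way the vocabulary is derived, the one-byte-per-token encoding, and the keys stored in meta.pkl are all assumptions; only the file names train.bin, val.bin, and meta.pkl come from the description above.

```python
import pickle
import tempfile
from array import array
from pathlib import Path

# Hypothetical sample games; the real data comes from the Lichess dataset.
games = [";1.e4 e5 2.Nf3 Nc6 3.Bb5 a6", ";1.d4 d5 2.c4 e6 3.Nc3 Nf6"]

# Character-level vocabulary built from the sample (assumption: the real
# 32-token vocabulary is fixed in advance rather than derived like this).
chars = sorted(set("".join(games)))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}


def encode(text):
    """Map each character to its integer token id."""
    return [stoi[ch] for ch in text]


def decode(ids):
    """Map integer token ids back to a string."""
    return "".join(itos[i] for i in ids)


# Write each split as raw bytes, one unsigned byte per token.
out_dir = Path(tempfile.mkdtemp())
splits = {"train.bin": games[:1], "val.bin": games[1:]}
for name, split in splits.items():
    ids = array("B", encode("".join(split)))
    with open(out_dir / name, "wb") as f:
        ids.tofile(f)

# Save the vocabulary so that sampling code can decode model output later.
meta = {"vocab_size": len(chars), "stoi": stoi, "itos": itos}
with open(out_dir / "meta.pkl", "wb") as f:
    pickle.dump(meta, f)
```

A vocabulary of 32 tokens fits comfortably in a single byte per token, which is why the sketch uses an unsigned-byte array; the real files may use a wider integer type.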
Training Configuration
The training configuration, found in config/mac_chess_gpt.py, includes:
- Dataset: lichess_hf_dataset
- Batch Size: 2 (optimized for Mac's memory constraints)
- Block Size: 1023 (1024 including the positional embedding)
- Learning Rate: 3e-4
- Max Iterations: 140,000
- Device: 'mps' (Mac-specific settings)
- Other Settings: No dropout and compile set to False for Mac compatibility
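The settings listed above could be written as a Python config file along these lines. The variable names follow a nanoGPT-style convention and are assumptions, as is the out_dir value (taken from the sampling commands); the actual contents of config/mac_chess_gpt.py may differ. The last lines estimate how many tokens the run processes, assuming one batch per iteration with no gradient accumulation.

```python
# Sketch of a training config; names are assumed, values come from the list above.
out_dir = "out-chess-mac"
dataset = "lichess_hf_dataset"
batch_size = 2          # small batch to fit in Mac memory
block_size = 1023
learning_rate = 3e-4
max_iters = 140_000
device = "mps"          # Apple Silicon GPU backend in PyTorch
dropout = 0.0           # no dropout
compile = False         # torch.compile disabled for Mac compatibility

# Rough token budget, assuming one batch per iteration (no gradient accumulation).
tokens_per_iter = batch_size * block_size
total_tokens = tokens_per_iter * max_iters
print(f"{tokens_per_iter:,} tokens/iter, {total_tokens:,} tokens total")
# prints: 2,046 tokens/iter, 286,440,000 tokens total
```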
How to Use
Generating Chess Moves
After training, use the generation script to sample chess moves. Example commands:
```bash
# Sample from the model without a provided prompt:
python sample.py --out_dir=out-chess-mac

# Generate a chess game sequence starting with a custom prompt:
python sample.py --out_dir=out-chess-mac --start=";1.e4"
```
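Sampled output usually needs light post-processing before it can be passed to a chess library or engine. The helper below is a hypothetical sketch that assumes generated games use the same format as the prompt above: a leading ";" followed by numbered moves such as "1.e4 e5 2.Nf3". The function name split_moves is illustrative and not part of the project.

```python
import re


def split_moves(transcript):
    """Return the individual moves from a transcript such as ';1.e4 e5 2.Nf3'."""
    # Remove the leading game delimiter and surrounding whitespace.
    body = transcript.lstrip(";").strip()
    # Replace move numbers like '1.' or '23.' with spaces, keeping the moves.
    body = re.sub(r"\d+\.", " ", body)
    return body.split()


moves = split_moves(";1.e4 e5 2.Nf3 Nc6")
print(moves)  # ['e4', 'e5', 'Nf3', 'Nc6']
```

The resulting list makes no claim about legality; a small model can emit illegal or truncated moves, so validating each move against a board state is advisable.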
Loading the Model in Transformers
Once the model card and converted model files are pushed to the Hugging Face Hub, you can load the model using:
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("your-hf-username/chess-gpt-4.5M")
tokenizer = GPT2Tokenizer.from_pretrained("your-hf-username/chess-gpt-4.5M")
```
Note: The tokenizer uses a custom vocabulary provided in vocab.json.
Intended Use
The model is intended for:
- Generating chess move sequences.
- Assisting in automated chess analysis.
- Educational purposes in understanding language model training on specialized domains.
Limitations
- The model uses a relatively small architecture (about 4.5M parameters) and may not capture complex chess strategies.
- It is specialized for chess move generation and is not expected to generalize to standard language tasks.
Training Process Summary
- Data Preparation: Tokenized the Lichess chess game dataset using a 32-token vocabulary.
- Model Training: Used custom training configurations specified in config/mac_chess_gpt.py.
- Model Conversion: Converted the trained checkpoint from out-chess-mac/ckpt.pt into a Hugging Face compatible format with convert_to_hf.py.
- Repository Setup: Pushed the converted model files (including custom tokenizer vocab) to the Hugging Face Hub with Git LFS handling large files.
Acknowledgements
This model was inspired by GPT-2 and adapted for the chess domain.
Model tree for derickio/chess-gpt-4.5M
Base model
openai-community/gpt2