Job Posting Extractor (Qwen2.5-3B)
A fine-tuned version of Qwen2.5-3B-Instruct specialized in extracting structured JSON data from job postings. Built to replace expensive hosted LLM API calls in web scraping pipelines with local inference.
What This Model Does
Given a job posting in markdown format, this model extracts structured JSON containing:
job_title
company
location
description
salary (when available)
requirements (when available)
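As a rough illustration of the output shape, the sketch below checks a JSON string against the field list above. The split into required and optional fields is an assumption inferred from the "(when available)" notes, and `validate_job_record` is a hypothetical helper, not part of this repository. The example string is illustrative and not an actual model response.

```python
import json

# Field names come from the list above; the required/optional split is an
# assumption based on the "(when available)" notes.
REQUIRED_FIELDS = ("job_title", "company", "location", "description")
OPTIONAL_FIELDS = ("salary", "requirements")


def validate_job_record(raw: str) -> dict:
    """Parse a JSON string and check that it has the documented job fields."""
    record = json.loads(raw)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Fill absent optional fields with None so downstream code sees a fixed shape.
    for name in OPTIONAL_FIELDS:
        record.setdefault(name, None)
    return record


# Illustrative output only; not an actual model response.
example = (
    '{"job_title": "Senior Python Developer", "company": "TechCorp", '
    '"location": "San Francisco, CA", '
    '"description": "We are looking for an experienced Python developer..."}'
)
print(validate_job_record(example))
```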
Quick Start
from unsloth import FastLanguageModel
import json
# Load model from HuggingFace
model, tokenizer = FastLanguageModel.from_pretrained(
model_name="HelixCipher/job-posting-extractor-qwen",
max_seq_length=2048,
load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
# Example input
job_markdown = """# Job Position
**Position:** Senior Python Developer
**Company:** TechCorp
**Location:** San Francisco, CA
## Job Description
We are looking for an experienced Python developer...
"""
# Extract JSON
messages = [
{"role": "system", "content": "You are a JSON extraction assistant. Always output ONLY valid JSON."},
{"role": "user", "content": f"Extract job fields as JSON.\n\n{job_markdown}"}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=500, temperature=0.1)
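# Decode only the newly generated tokens; outputs[0] also contains the prompt
generated = outputs[0][inputs["input_ids"].shape[1]:]
result = tokenizer.decode(generated, skip_special_tokens=True)
print(json.dumps(json.loads(result), indent=2))  # json.loads raises if the output is not valid JSON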
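Generated text can carry extra material around the JSON, for instance echoed prompt text when the whole sequence is decoded, or a stray code fence. The sketch below is one tolerant way to recover the answer: it scans for JSON objects and keeps the last one that parses. `last_json_object` is a hypothetical helper and assumes the model's answer is the final JSON object in the text.

```python
import json


def last_json_object(text: str) -> dict:
    """Return the last top-level JSON object found in text."""
    decoder = json.JSONDecoder()
    found = None
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode returns the parsed value and the index where it ended.
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not valid JSON starting here; try the next opening brace.
            pos = text.find("{", pos + 1)
            continue
        if isinstance(obj, dict):
            found = obj
        # Skip past the decoded object so nested objects are not rescanned.
        pos = text.find("{", end)
    if found is None:
        raise ValueError("no JSON object found in model output")
    return found
```

With the Quick Start variables, this would be called as `last_json_object(result)`.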
Training Details
Base Model: unsloth/qwen2.5-3b-instruct-bnb-4bit
Training Data: 12,000 job posting examples.
Training Approach: LoRA with Unsloth.
Fine-tuning Library: TRL (Transformer Reinforcement Learning).
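The card does not describe the training data format. As a sketch only, one plausible layout is a conversational `messages` record of the kind TRL's supervised fine-tuning tooling accepts, reusing the prompts from the Quick Start. The record structure and the `to_chat_example` helper are assumptions, not the author's confirmed pipeline.

```python
import json

# System and user prompts are copied from the Quick Start; using them for
# training records is an assumption.
SYSTEM_PROMPT = "You are a JSON extraction assistant. Always output ONLY valid JSON."


def to_chat_example(job_markdown: str, fields: dict) -> dict:
    """Build one chat-style training record from a posting and its labels."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Extract job fields as JSON.\n\n{job_markdown}"},
            # The target is the labelled fields serialized as a JSON string.
            {"role": "assistant", "content": json.dumps(fields, ensure_ascii=False)},
        ]
    }


record = to_chat_example(
    "# Job Position\n**Position:** Senior Python Developer",
    {"job_title": "Senior Python Developer"},
)
print(json.dumps(record, indent=2))
```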
Framework Versions
PEFT: 0.18.1
TRL: 0.24.0
Transformers: 4.57.6
PyTorch: 2.10.0+cu126
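To compare a local environment against the versions listed above, a small standard-library check along these lines may help. The PyPI distribution names (`peft`, `trl`, `transformers`, `torch`) are assumed, and the prefix comparison is deliberately loose so that local build tags such as `+cu126` still match.

```python
from importlib import metadata

# Versions from the list above; distribution names are assumed.
EXPECTED = {
    "peft": "0.18.1",
    "trl": "0.24.0",
    "transformers": "4.57.6",
    "torch": "2.10.0",
}


def compare_versions(expected: dict) -> dict:
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed and installed.startswith(wanted) else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```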
Use Cases
Extract job postings from scraped websites.
Convert unstructured job listings to structured JSON.
Automate data collection for job aggregators.
Replace expensive LLM API calls with local inference.
Limitations
Trained specifically on job postings; it may not work well on other kinds of documents.
Works best with markdown-formatted input (similar to html2text output).
Maximum context: 2048 tokens
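Long postings may need trimming to fit the context limit. The sketch below truncates text to a token budget, assuming the 2048-token limit covers both the prompt and the generated tokens. `PROMPT_OVERHEAD` is a guessed allowance for the system prompt and chat template, and `truncate_to_budget` is a hypothetical helper that takes the tokenizer's encode and decode functions as arguments.

```python
from typing import Callable, List

MAX_CONTEXT = 2048      # from the limitation above
MAX_NEW_TOKENS = 500    # from the Quick Start generate() call
PROMPT_OVERHEAD = 64    # assumed allowance for system prompt and template tokens


def truncate_to_budget(
    text: str,
    encode: Callable[[str], List],
    decode: Callable[[List], str],
    budget: int = MAX_CONTEXT - MAX_NEW_TOKENS - PROMPT_OVERHEAD,
) -> str:
    """Return text unchanged if it fits the budget, else its first `budget` tokens."""
    ids = encode(text)
    if len(ids) <= budget:
        return text
    return decode(ids[:budget])


# Stand-in whitespace tokenizer, used only to demonstrate the behaviour.
print(truncate_to_budget("a b c d e", str.split, " ".join, budget=3))
```

With the Quick Start tokenizer, the call would look like `truncate_to_budget(job_markdown, lambda t: tokenizer.encode(t, add_special_tokens=False), tokenizer.decode)`.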
License & Attribution
This project is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
You are free to use, share, copy, modify, and redistribute this material for any purpose (including commercial use), provided that proper attribution is given.
Attribution requirements
Any reuse, redistribution, or derivative work must include:
The creator's name: HelixCipher
A link to the original repository: https://github.com/HelixCipher/fine-tuning-an-local-llm-for-web-scraping
An indication of whether changes were made
A reference to the license (CC BY 4.0)
Example Attribution
This work is based on Fine-Tuning An Local LLM for Web Scraping by HelixCipher.
Original source: https://github.com/HelixCipher/fine-tuning-an-local-llm-for-web-scraping
Licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0).
You may place this attribution in a README, documentation, credits section, or other visible location appropriate to the medium.
Full license text: https://creativecommons.org/licenses/by/4.0/
Citation
@software{job_posting_extractor,
author = {HelixCipher},
title = {Job Posting Extractor - Qwen2.5-3B Fine-tuned Model},
year = {2026},
url = {https://huggingface.co/HelixCipher/job-posting-extractor-qwen}
}