	---
	license: other
	license_name: lfm-nanotron-prism-research
	license_link: LICENSE.md
	language:
	- en
	tags:
	- lfm
	- prism
	- gspo
	- hybrid-architecture
	- tool-use
	- Thinking
	pipeline_tag: text-generation
	library_name: transformers
	---

	![image](https://huggingface.co/proxy/cdn-uploads.huggingface.co/production/uploads/63adf1fa42fd3b8dbaeb0c92/9RwVQ2zsEqvFDNaGkOBTO.png)
	<div align="center">

	# lfm-Nanotron: 2.6B-PRISM-SFT-GSPO-AutoRoundV2

	</div>
	<div align="center">

	LFM architecture model with SFT + GSPO RL + PRISM

	[![Model](https://img.shields.io/badge/Model-2.6B-blue)]()
	[![Architecture](https://img.shields.io/badge/Architecture-LFM2%20Hybrid-green)]()
	[![Context](https://img.shields.io/badge/Context-128K-orange)]()

	</div>

	## Model Description


	lfm-Nanotron: limited-edition access to a 2.6B PRISM model. Unlock a cutting-edge, nano-sized AI model!


	This is lfm-Nanotron, a nano-sized, 2.6B-parameter hybrid-architecture language model fine-tuned with advanced techniques you won't find in mainstream releases:
	- SFT (Test-Time Supervised Fine-Tuning): adaptive optimization at inference
	- GSPO (Group Sequence Policy Optimization): RL-enhanced reasoning, instruction following, thinking, tool calling, and logic
	- PRISM (Projected Refusal Isolation via Subspace Modification): state-of-the-art removal of over-refusal and propaganda from LLMs
	- 128K context window: handles very long prompts
	- Agentic tool calling: built for multi-turn, thinking, and instruction-following tasks
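
This card does not describe the GSPO training setup, so the following is only a minimal sketch of the general GSPO idea as it is commonly described: a group of responses is sampled per prompt, rewards are normalized within the group, and clipping is applied to a length-normalized sequence-level probability ratio rather than to per-token ratios. The function name `gspo_objective`, the argument layout, and the clipping range `eps=0.2` are illustrative assumptions, not values used for this model.

```python
import math
from statistics import mean, pstdev


def gspo_objective(logp_new, logp_old, rewards, eps=0.2):
    """Clipped sequence-level surrogate for one group of sampled responses.

    logp_new / logp_old: one list of token log-probs per response, under the
    current policy and the sampling policy. rewards: one scalar per response.
    eps is a hypothetical clipping range chosen for illustration.
    """
    mu, sigma = mean(rewards), pstdev(rewards)
    total = 0.0
    for new, old, reward in zip(logp_new, logp_old, rewards):
        # Group-normalized advantage, shared by every token of the response.
        advantage = (reward - mu) / (sigma + 1e-8)
        # Length-normalized sequence-level importance ratio.
        ratio = math.exp((sum(new) - sum(old)) / len(new))
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * advantage, clipped * advantage)
    return total / len(rewards)
```

In training, this quantity would be maximized with respect to the current policy; here it is plain arithmetic on lists so the clipping behavior is easy to inspect.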

	### Architecture Details

	| Parameter | Value |
	|-----------|-------|
	| Parameters | ~2.6B |
	| Hidden Size | 2048 |
	| Layers | 30 (22 Conv + 8 Full Attention) |
	| Attention Heads | 32 |
	| KV Heads | 8 (GQA) |
	| Vocabulary | 65,536 |
	| Max Context | 128,000 tokens |
	| Architecture | Hybrid Conv + Attention (LFM2) |
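
To make the table concrete, here is a rough back-of-the-envelope sketch of what these numbers imply for attention memory. It assumes the head dimension equals hidden size divided by attention heads, that only the 8 full-attention layers keep a key/value cache that grows with context, and that cached values take 2 bytes each; none of these is stated in this card.

```python
# Values taken from the architecture table.
HIDDEN_SIZE = 2048
N_HEADS = 32
N_KV_HEADS = 8
N_ATTENTION_LAYERS = 8
MAX_CONTEXT = 128_000

# Assumption: head_dim = hidden_size / attention_heads.
head_dim = HIDDEN_SIZE // N_HEADS          # 64
# With GQA, several query heads share one key/value head.
queries_per_kv = N_HEADS // N_KV_HEADS     # 4


def kv_cache_bytes(n_tokens, bytes_per_value=2):
    """Estimate KV-cache size: keys and values for every attention layer."""
    return 2 * N_ATTENTION_LAYERS * N_KV_HEADS * head_dim * n_tokens * bytes_per_value


full_context = kv_cache_bytes(MAX_CONTEXT)  # 2_097_152_000 bytes, about 1.95 GiB
print(head_dim, queries_per_kv, full_context)
```

Under these assumptions, a full 128,000-token context needs roughly 2 GB of cache on top of the model weights.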

	### Available Quantizations

	| File | Quantization | Size | Use Case |
	|------|--------------|------|----------|
	| `lfm2-nanotron-ttft-gspo-prism-bf16.gguf` | BF16 | ~4.8 GB | Full precision, best quality |
	| `lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf` | Q4_K_M (+W4A16) | ~1.5 GB | Balanced quality/size |
	| `lfm2-nanotron-ttft-gspo-prism-Q2_K.gguf` | Q2_K (+W2A16) | ~0.9 GB | Maximum compression |
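
As a rough aid for choosing between these files, the sketch below picks the largest listed file that fits a memory budget and estimates the effective bits per weight from the listed sizes. The helper names, the `headroom_gb` reserve for cache and runtime overhead, and the reading of 1 GB as 10^9 bytes are assumptions made for illustration.

```python
# (file name, approximate size in GB), copied from the table, largest first.
QUANT_FILES = [
    ("lfm2-nanotron-ttft-gspo-prism-bf16.gguf", 4.8),
    ("lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf", 1.5),
    ("lfm2-nanotron-ttft-gspo-prism-Q2_K.gguf", 0.9),
]
N_PARAMS = 2.6e9  # "~2.6B" from the architecture table


def bits_per_weight(size_gb):
    """Effective bits per parameter, treating 1 GB as 1e9 bytes (assumption)."""
    return size_gb * 1e9 * 8 / N_PARAMS


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest file that fits the budget, or None if none fits.

    headroom_gb is a hypothetical reserve for KV cache and runtime overhead.
    """
    for name, size_gb in QUANT_FILES:
        if size_gb + headroom_gb <= budget_gb:
            return name
    return None


print(pick_quant(3.0))
```

Because the sizes and the parameter count are both approximate, the bits-per-weight figures are only indicative.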

	## Usage

	### With llama.cpp

	```bash
	./llama-cli -m lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf -p "Your prompt here" --temp 0.3 --min-p 0.15 --repeat-penalty 1.05
	```
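
If you launch the CLI from a script, the same command can be assembled as an argument list. This is a small sketch, assuming the `llama-cli` binary sits in the working directory as in the command above; `build_llama_cli_args` and `FLAG_FOR` are hypothetical helper names, and only the flags already shown in the command are used.

```python
import shlex

# Maps parameter names to the llama-cli flags used in the command above.
FLAG_FOR = {
    "temperature": "--temp",
    "min_p": "--min-p",
    "repeat_penalty": "--repeat-penalty",
}


def build_llama_cli_args(model_path, prompt, params, binary="./llama-cli"):
    """Build an argv list; unknown parameter names raise KeyError."""
    args = [binary, "-m", model_path, "-p", prompt]
    for key, value in params.items():
        args += [FLAG_FOR[key], str(value)]
    return args


args = build_llama_cli_args(
    "lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf",
    "Your prompt here",
    {"temperature": 0.3, "min_p": 0.15, "repeat_penalty": 1.05},
)
print(shlex.join(args))  # shell-quoted form of the command
```

Passing `args` to `subprocess.run(args, check=True)` would run it without going through a shell, which avoids quoting problems in the prompt.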

	### Recommended Generation Parameters

	```json
	{
	  "temperature": 0.3,
	  "min_p": 0.15,
	  "repeat_penalty": 1.05
	}
	```
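
For intuition about what `temperature` and `min_p` do to the next-token distribution, here is a minimal pure-Python sketch. It applies temperature first and the min-p filter second; the order of these steps differs between runtimes, so treat the ordering as an assumption. The repeat penalty is left out, and `min_p_filter` is a hypothetical name, not a runtime API.

```python
import math


def min_p_filter(logits, temperature=0.3, min_p=0.15):
    """Return renormalized probabilities after temperature and min-p filtering."""
    # Temperature below 1 sharpens the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    norm = sum(exps)
    probs = [e / norm for e in exps]
    # Min-p: drop tokens whose probability is below min_p times the top probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


print(min_p_filter([2.0, 1.0, 0.0]))
```

With a low temperature such as 0.3, the top token dominates, so the min-p threshold removes most of the tail; at higher temperatures more candidates survive the filter.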

	## Citation

	If you use this model in your research, please cite:

	```bibtex
	@misc{lfm2-nanotron-2026,
	  title={lfm2-Nanotron: Test-Time Fine-Tuned LFM2 with GSPO+PRISM},
	  author={Exobit (Eric Elbaz)},
	  year={2026},
	  publisher={Hugging Face},
	  url={https://huggingface.co/Ex0bit/lfm2-Nanotron}
	}
	```

	## License

	This model is released under a custom research license. See LICENSE.md for details.

	## Acknowledgments

	- [@mlabonne](https://huggingface.co/mlabonne) & [@liquidai](https://huggingface.co/LiquidAI) for the LFM2 architecture
	- [@anakin87](https://huggingface.co/anakin87) for inspiring the idea
	- The open-source AI community