Building on HF

AxionLab

AxionLab-official

AI & ML interests

My goal is to make the best NRM (Nano Reasoning Model), and on this profile I'm working toward it! *NOTE: the RX 5500 XT GPU with 8GB VRAM listed in my profile is actually a Radeon Vega 7 with 8GB VRAM.*

Recent Activity

liked a model about 2 hours ago
Crowfeather/Crowfeather-50m
reacted to Crownelius's post with πŸ”₯ about 2 hours ago
[DAY ONE] PROJECT CROWFEATHER 4/30/2026 ...The day I forgot to attach wandb.ai

Just dropped Crowfeather-50m, the first checkpoint in a series, and yeah, no graphs. https://huggingface.co/Crowfeather/Crowfeather-50m

54.5M params. Pretrain only. 17,500 steps banked on FineWeb-edu before Thunder credits ran dry. About 2.3B tokens, no SFT yet.

Architecture: Gemma-4 alternating sliding/global attention (1024 window, last layer always global), plus DeepSeek-V4 Muon optimizer, WSD scheduler, Gemma-2 logit soft-cap, and PaLM z-loss. Recipe in the model card.

What it can do: writes grammatical English. Knows that France has Rhine-adjacent monasteries (it picked Rouen instead of Paris, but the vocabulary is in there). Tells stories about Mr. Fabien.

What it can't do yet: facts, code, math. Base LM, no SFT, no instruction tuning.

The series:
- Every additional training run becomes another model card here
- Every model card gets a matching post on this profile
- Continuation goes to Colab next, picking up from step 17,500 out of 100k
- Limited to one post a day on Hugging Face, so updates will trickle out at that pace

Follow [@Crownelius](https://huggingface.co/Crownelius) and [@Crowfeather](https://huggingface.co/Crowfeather) if you want to watch this thing learn in public. Next drop will either come with the finished pre-train or whatever step I land on before the bank takes my credit card away. Graphs will be available on my NEXT model lol

-Shane
reacted to Crownelius's post with πŸ‘ about 2 hours ago
(same post as above)

Organizations

AxionLab Co. · ML intern explorers · CompactAI · NanoAxion Models Family