IBM Granite 4.1 series
New models just came out; here is how they compare to models of the same size:
Brainwaves
Columns: arc = ARC-Challenge, arc/e = ARC-Easy, boolq = BoolQ, hswag = HellaSwag, obkqa = OpenBookQA, piqa = PIQA, wino = Winogrande.

quant    arc    arc/e  boolq  hswag  obkqa  piqa   wino
granite-4.1-30b
mxfp8    0.456  0.572  0.897  0.621  0.444  0.757  0.616
mxfp4    0.453  0.565  0.892  0.624  0.442  0.759  0.585
qx86-hi  0.451  0.568  0.897  0.636  0.440  0.763  0.598
granite-4.1-8b
mxfp8    0.486  0.666  0.875  0.636  0.450  0.766  0.631
granite-4.1-3b
mxfp8    0.406  0.581  0.821  0.484  0.434  0.712  0.559

Gemma-4
quant    arc    arc/e  boolq  hswag  obkqa  piqa   wino
gemma-4-E4B-it
mxfp8    0.480  0.656  0.797  0.608  0.400  0.755  0.665
mxfp4    0.455  0.607  0.851  0.585  0.402  0.744  0.651
gemma-4-E2B-it
mxfp8    0.376  0.464  0.743  0.490  0.378  0.709  0.622
mxfp4    0.380  0.451  0.762  0.494  0.374  0.699  0.594

Qwen3.5
quant    arc    arc/e  boolq  hswag  obkqa  piqa   wino
Qwen3.5-9B
mxfp8    0.417  0.458  0.623  0.634  0.338  0.737  0.639
mxfp4    0.419  0.472  0.622  0.634  0.352  0.739  0.644
Qwen3.5-4B
mxfp8    0.392  0.441  0.627  0.601  0.360  0.739  0.590
mxfp4    0.371  0.444  0.632  0.585  0.356  0.732  0.548

Right out of the gate, IBM delivered models with better starting metrics than both Gemma and Qwen. Training these should be fun :)
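If you want to sanity-check numbers like these on your own hardware, here is a minimal sketch using EleutherAI's lm-evaluation-harness. The repo id is a placeholder and this loads a full-precision Hugging Face checkpoint rather than the quants in the tables, so treat it as a starting point, not the exact harness setup behind these scores:

```python
# Minimal sketch: score one model on the seven benchmarks shown above
# with EleutherAI's lm-evaluation-harness (pip install lm-eval).
# The repo id is a placeholder, not a confirmed model name.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=ibm-granite/granite-4.1-8b,dtype=bfloat16",  # placeholder repo id
    tasks=[
        "arc_challenge",  # arc
        "arc_easy",       # arc/e
        "boolq",          # boolq
        "hellaswag",      # hswag
        "openbookqa",     # obkqa
        "piqa",           # piqa
        "winogrande",     # wino
    ],
    batch_size=8,
)

# Print the headline accuracy per task, roughly matching the table columns.
for task, metrics in results["results"].items():
    acc = metrics.get("acc_norm,none", metrics.get("acc,none"))
    print(f"{task}: {acc:.3f}")
```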
Here is the Nightmedia collection of Granite models:
https://huggingface.co/collections/nightmedia/ibm-granite-41
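The quant labels (mxfp8, mxfp4, qx86-hi) suggest these are MLX quants, so a minimal sketch for trying one locally with mlx-lm would look like the following; the repo id is a placeholder, substitute any model from the collection:

```python
# Minimal sketch: run one of the MLX quants locally with mlx-lm
# (pip install mlx-lm; Apple Silicon). The repo id is a placeholder,
# not a confirmed model name; pick an actual repo from the collection.
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/granite-4.1-8b-mxfp8-mlx")  # placeholder repo id

prompt = "Summarize the strengths of the Granite 4.1 models in three bullet points."
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(generate(model, tokenizer, prompt=prompt, max_tokens=256))
```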
-G