MICROD v1.0 (micro-distill-grpo-vae)
This model was made with the Micro Distillery app, available at: webxos.netlify.app/MICROD
- Model Distillation Training: Simulate GRPO optimization with VAE filtering for small LLMs (42M-345M parameters).
- Policy Experimentation: Test group sizes, KL penalties, and cache reuse for RLHF-like training (a GRPO-style objective is sketched after this list).
- VAE Filtering: Apply latent-space compression to improve distillation quality (a minimal filter is sketched below).
- Sandbox Testing: Execute Python code safely with feedback masking (see the sandbox sketch below).
- Export & Deployment: Generate deployable models for inference in various frameworks (a usage example follows below).
- Offline Usage: The PWA supports offline training simulation and exports.
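The app's actual training loop isn't published with this card, but as a rough sketch of how a GRPO-style objective ties the group size and KL penalty together (the function `grpo_step`, its argument shapes, and the default `kl_coef` are illustrative assumptions, not the app's API; full GRPO also adds PPO-style ratio clipping, omitted here for brevity):

```python
import torch

def grpo_step(logprobs, ref_logprobs, rewards, kl_coef=0.1):
    """One GRPO-style loss term for a group of G sampled responses.

    logprobs:     (G,) summed token log-probs per response (current policy)
    ref_logprobs: (G,) the same quantity under the frozen reference policy
    rewards:      (G,) scalar reward per response
    """
    # Group-relative advantage: normalize rewards within the group,
    # so no learned value function is needed.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # REINFORCE-style policy term weighted by the advantage.
    pg_loss = -(adv.detach() * logprobs).mean()
    # Crude sequence-level KL penalty keeping the student near the reference.
    kl = (logprobs - ref_logprobs.detach()).mean()
    return pg_loss + kl_coef * kl
```

Larger groups sharpen the group-relative advantage estimate at the cost of more samples per prompt; a higher `kl_coef` trades reward for staying closer to the teacher.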
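How the app applies VAE filtering isn't specified here either; one plausible reading of "latent space compression to improve distillation quality" is to score candidate samples by VAE reconstruction error and drop outliers. A minimal sketch (the names `TinyVAE` and `vae_filter`, and the 80% keep rate, are assumptions):

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Minimal VAE: compress representations into a small latent space."""
    def __init__(self, dim, latent_dim=16):
        super().__init__()
        self.enc = nn.Linear(dim, 2 * latent_dim)  # outputs (mu, logvar)
        self.dec = nn.Linear(latent_dim, dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def vae_filter(vae, feats, keep_frac=0.8):
    """Keep the samples the VAE reconstructs best; poorly reconstructed
    (out-of-distribution) samples are dropped from the distillation batch."""
    with torch.no_grad():
        recon, _, _ = vae(feats)
        err = (recon - feats).pow(2).mean(dim=-1)  # per-sample MSE
    k = max(1, int(keep_frac * feats.size(0)))
    keep = err.topk(k, largest=False).indices      # lowest error first
    return feats[keep]
```

The VAE itself would be trained beforehand with the usual reconstruction-plus-KL objective.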
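The sandbox mechanism is likewise undocumented; a minimal interpretation of "safe Python execution with feedback masking" is to run code in an isolated subprocess and return only masked pass/fail feedback. A sketch (the function `run_sandboxed` and the masking policy are assumptions, and a subprocess alone is not a true security boundary):

```python
import subprocess
import sys

def run_sandboxed(code: str, timeout: float = 5.0):
    """Run untrusted Python in a separate isolated interpreter (-I skips
    site-packages and environment variables) with a hard timeout.

    NOTE: for real isolation you would add OS-level sandboxing
    (containers, seccomp, resource limits); this is only a sketch.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    ok = proc.returncode == 0
    # "Feedback masking": pass stdout through on success, but hide raw
    # tracebacks on failure so only pass/fail is fed back.
    return ok, proc.stdout if ok else "execution failed (details masked)"
```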
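If the exported model is a standard transformers checkpoint (the model tree below lists openai-community/gpt2 as the base, but the export format isn't confirmed here), inference would look like:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("webxos/microd_v1")
model = AutoModelForCausalLM.from_pretrained("webxos/microd_v1")

# Generate a short continuation from a prompt.
inputs = tok("Hello", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```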
Base model: openai-community/gpt2