Innovation.AI.X – Recursive Language Model (RLM)

Innovation.AI.X is a Recursive Language Model with a Hybrid Mind architecture. It is not a traditional Large Language Model (LLM) but a single monolithic tensor shell where multiple Self‑Automated (S.A.) subsystems operate simultaneously on a shared latent state, performing recursive state evolution rather than simple next‑token prediction.

Model Overview

Parameters: 110,997,082 (~111M)
Context Window: 64,000 tokens (NTK‑aware Rotary Position Embeddings)
Training Regime: Tabula Rasa – all weights randomly initialised, no pretrained components
Multimodal Support: Text, Image, Audio, Video – all projected into a shared latent space
KV Cache: Incremental key‑value cache for efficient autoregressive generation
Mixed Precision: BF16 with Accelerate for Dual T4 GPU deployment

Architecture: The Hybrid Mind

The model operates as a unified cognitive state. 20 Self‑Automated (S.A.) subsystems participate in every recursive cycle, all reading from and writing to the same shared_latent_state. No subsystem is a post‑processing step; each contributes to the evolving internal representation.

Subsystem Implementations

Subsystem	Description	Implementation
S.A. Meta Learning	Task adaptation via hypernetwork‑generated FiLM layers	Hypernetwork → per‑layer scale & shift modulation
S.A. Reinforcement Learning	Actor‑Critic with reward prediction	Separate heads: actor, critic, reward_pred
S.A. Continual Learning	Elastic Weight Consolidation (EWC)	Buffered Fisher information & optimal parameters
S.A. Adaptive Learning	Context‑conditioned dynamic gating	Per‑layer sigmoid gate conditioned on latent mean
S.A. Rewriting Learning	Residual correction of latent representations	Bottleneck MLP applied with scaling factor
S.A. NLP	Semantic compression and language understanding	Bottleneck compress‑expand network
S.A. Problem Solving	Multi‑step reasoning with hidden scratchpad	GRU‑based recurrent workspace (no visible chain‑of‑thought)
S.A. Innovation	Controlled perturbation for novel ideas	Learnable Gaussian noise injection during training
S.A. Debugging	Consistency detection and anomaly repair	Confidence gate + corrective vector
S.A. Long/Short Term Memory	Differentiable read/write memory	DNC‑style memory with cosine‑based addressing
S.A. Recursive Seed Learning	Concept compression into compact latent seeds	Bottleneck encoder‑decoder (reconstruction loss)
S.A. Self Evaluation & Reward	Confidence and quality estimation	Twin heads: confidence, quality
S.A. Goal & Constraint Engine	Goal embedding maintenance	Learnable goal embeddings, injected via mean pool
S.A. Memory Consolidation	Transfer active memory to stable memory	Linear consolidation projection
S.A. Introspection Interface	Self‑observation and uncertainty estimation	Uncertainty head + Tanh‑activated observation layer
S.A. Recursive Outer Loop	Adaptive computation time / dynamic halting	Halt probability via sigmoid gate
S.A. Conversational Intelligence	Chatbot session memory and dialogue tracking	Persistent memory token + gated fusion
S.A. Tabula Rasa	Fresh random initialisation enforcement	Learnable freshness scale factor (active gradient path)
S.A. Emotional Intelligence	Emotion modelling and influence	6 emotion prototypes + attention‑based mixing
S.A. Common Sense	Generalisable pattern extraction	Bottleneck projection that forces abstraction
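
The cosine-based addressing listed for the memory subsystem can be illustrated with a small content-based read. This is a minimal sketch in plain Python, not the repository's code: the function names and the `sharpness` factor are assumptions, and the toy memory is far smaller than the 64 slots of width 256 given in the configuration.

```python
import math

def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_read(memory, key, sharpness=10.0):
    """Content-based read: weight slots by softmax(sharpness * cosine(key, slot))."""
    weights = softmax([sharpness * cosine_similarity(key, slot) for slot in memory])
    dim = len(memory[0])
    read_vector = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(dim)]
    return read_vector, weights

# Toy memory with 3 slots of width 2 (hypothetical values).
memory = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
read_vector, weights = memory_read(memory, key=[0.9, 0.1])
```

The slot most aligned with the key receives most of the read weight, and a larger `sharpness` makes the read closer to a hard lookup.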

Recursive Processing

The forward pass executes a loop of num_recursion_steps (default 3).
At each step:

· All S.A. modules read the pooled latent mean and produce modifications.
· Modifications are accumulated (delta_sum) and applied to the shared state.
· The shared state passes through all transformer layers, modulated by FiLM parameters and adaptive gates.
· Memory is read and written, the problem‑solving state is updated, and debugging corrections are applied.

After the loop, the final state is layer‑normalised and projected to vocabulary logits.

This design allows the model to refine its internal representation recursively, using all cognitive subsystems in every iteration.
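
The control flow of the loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the model's implementation: `mean_pool`, `recursive_forward`, and the lambda stand-ins for S.A. modules and transformer layers are hypothetical, and broadcasting the accumulated delta to every token is an assumption about how `delta_sum` is applied.

```python
def mean_pool(state):
    """Average the token vectors of the shared state into one pooled vector."""
    n = len(state)
    return [sum(tok[i] for tok in state) / n for i in range(len(state[0]))]

def recursive_forward(state, modules, layers, num_recursion_steps=3):
    """Accumulate module deltas, apply them, then run the layer stack, per step."""
    for _ in range(num_recursion_steps):
        pooled = mean_pool(state)
        # Every module reads the same pooled vector and proposes a delta.
        delta_sum = [0.0] * len(pooled)
        for module in modules:
            delta = module(pooled)
            delta_sum = [d + m for d, m in zip(delta_sum, delta)]
        # Assumption: the accumulated delta is added to every token.
        state = [[x + d for x, d in zip(tok, delta_sum)] for tok in state]
        # Stand-in for the transformer layers.
        for layer in layers:
            state = [layer(tok) for tok in state]
    return state

# Hypothetical stand-ins for two S.A. modules and one transformer layer.
modules = [lambda p: [0.1 * x for x in p], lambda p: [-0.05 * x for x in p]]
layers = [lambda tok: [0.5 * x for x in tok]]
state = recursive_forward([[1.0, 2.0], [3.0, 4.0]], modules, layers)
```

Because each step pools the state produced by the previous one, the module deltas in step two already depend on what the layers did in step one.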

Multimodal Fusion

The model accepts four modalities:

Text: Tokenised input → embedding + text modality tag
Image: RGB image → convolutional patch embedding + image modality tag
Audio: Mel‑spectrogram → 1D convolutional embedding + audio modality tag
Video: Video clip → 3D convolutional tubelet embedding + video modality tag

All embeddings are concatenated into a single sequence and processed jointly.
Modality‑type embeddings are learned and added to distinguish input sources.
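
A toy version of the fusion step may make the data flow concrete. The sketch below assumes each modality has already been embedded into vectors of the same width. The helper names, the concatenation order, and the 2-dimensional values are illustrative assumptions, not taken from the model source.

```python
def tag_tokens(tokens, modality_embedding):
    """Add a modality-type vector to every token embedding of one modality."""
    return [[x + m for x, m in zip(tok, modality_embedding)] for tok in tokens]

def fuse_modalities(inputs, modality_embeddings,
                    order=("text", "image", "audio", "video")):
    """Concatenate the tagged token sequences of all modalities that are present."""
    sequence = []
    for name in order:
        tokens = inputs.get(name)
        if tokens:  # absent or empty modalities are skipped
            sequence.extend(tag_tokens(tokens, modality_embeddings[name]))
    return sequence

# Hypothetical modality-type embeddings; in the model these would be learned.
modality_embeddings = {
    "text": [0.1, 0.0],
    "image": [0.0, 0.1],
    "audio": [-0.1, 0.0],
    "video": [0.0, -0.1],
}
inputs = {"text": [[1.0, 1.0], [2.0, 2.0]], "image": [[3.0, 3.0]]}
fused = fuse_modalities(inputs, modality_embeddings)
```

Here `fused` holds three tokens: two text tokens followed by one image token, each shifted by its modality tag.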

Context Window: 64K Tokens with NTK‑RoPE

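Plain RoPE degrades on sequences much longer than those seen in training, and compensating by uniformly interpolating positions blurs the high‑frequency components that encode local token order.
Innovation.AI.X instead uses NTK‑aware scaling (α=4.0), which adjusts the rotary base frequency: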


scaled_theta = theta * (alpha ** (dim / (dim - 2)))

Raising the base leaves the highest‑frequency dimensions nearly unchanged and stretches the lowest‑frequency ones, which is intended to preserve both local and global attention quality up to 64,000 tokens.
An incremental KV cache is implemented to support efficient autoregressive generation.
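
The effect of the scaling formula can be checked numerically. The sketch below assumes that `dim` is the per-head dimension (640 / 10 = 64 with the listed configuration) and that the unscaled base is 10,000. The card states neither, and the function names are hypothetical.

```python
def ntk_scaled_theta(theta, alpha, head_dim):
    """Apply the NTK-aware base adjustment from the formula above."""
    return theta * (alpha ** (head_dim / (head_dim - 2)))

def rope_inverse_frequencies(theta, head_dim):
    """Per-pair rotation frequencies theta^(-2i/d) for i = 0 .. d/2 - 1."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]

theta = 10000.0        # assumed base; not stated in the card
alpha = 4.0            # ntk_alpha from the configuration
head_dim = 640 // 10   # assumed: d_model / num_heads

scaled_theta = ntk_scaled_theta(theta, alpha, head_dim)
plain = rope_inverse_frequencies(theta, head_dim)
scaled = rope_inverse_frequencies(scaled_theta, head_dim)

print(f"fastest frequency ratio: {scaled[0] / plain[0]:.4f}")    # 1.0000
print(f"slowest frequency ratio: {scaled[-1] / plain[-1]:.4f}")  # 0.2500
```

The exponent `dim / (dim - 2)` is chosen so that the slowest rotary pair is slowed by exactly a factor of α while the fastest pair is left untouched.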

Training Loss Function

The composite loss drives the entire Hybrid Mind:

loss_total = loss_lm + 0.2*loss_seed + 0.1*loss_rl + 100.0*loss_ewc

· loss_lm: Standard causal language modelling cross‑entropy
· loss_seed: MSE between the seed reconstruction and the latent mean (encourages concept compression)
· loss_rl: MSE between predicted reward and a bootstrap target (currently 1.0 for bootstrapping)
· loss_ewc: Elastic weight consolidation penalty (Fisher‑weighted deviation from optimal parameters)

All components are computed from a single forward pass, enabling end‑to‑end training.
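
As a worked example of the weighting, the sketch below combines four scalar loss terms with the coefficients from the formula above. The quadratic form of `ewc_penalty` is the standard EWC penalty and is assumed here rather than taken from the model source. The numbers are made up for illustration.

```python
def ewc_penalty(params, optimal_params, fisher):
    """Fisher-weighted squared deviation from the stored optimal parameters."""
    return sum(
        f * (p - p_star) ** 2
        for p, p_star, f in zip(params, optimal_params, fisher)
    )

def total_loss(loss_lm, loss_seed, loss_rl, loss_ewc):
    """Weighted sum using the coefficients given in the card."""
    return loss_lm + 0.2 * loss_seed + 0.1 * loss_rl + 100.0 * loss_ewc

# Made-up values purely for illustration.
loss_ewc = ewc_penalty(params=[0.5, -0.2], optimal_params=[0.4, -0.2], fisher=[0.1, 0.3])
loss = total_loss(loss_lm=2.0, loss_seed=0.5, loss_rl=0.1, loss_ewc=loss_ewc)
print(f"{loss:.2f}")  # 2.21
```

With these numbers the EWC term contributes 0.1 to the total despite a raw penalty of only 0.001, which shows how strongly the 100.0 coefficient weights parameter drift.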

Tabula Rasa Initialisation

Every weight in the model is initialised from a normal distribution (std=0.02), and every bias is initialised to zero. The SA_TabulaRasa module contains a learnable parameter fresh (initialised to 1.0) that is always part of the computational graph. The released weights come from random initialisation alone – no pretrained checkpoints, no inherited biases.
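
The initialisation rule can be sketched without any framework. The example below is a plain-Python illustration with a hypothetical `init_linear` helper. It is not the repository's initialisation code, and the layer size is arbitrary.

```python
import random
import statistics

def init_linear(in_features, out_features, std=0.02, rng=random):
    """Draw weights from N(0, std^2) and set every bias to zero."""
    weight = [
        [rng.gauss(0.0, std) for _ in range(in_features)]
        for _ in range(out_features)
    ]
    bias = [0.0] * out_features
    return weight, bias

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
weight, bias = init_linear(in_features=128, out_features=128, rng=rng)

# The empirical spread of the weights should sit close to the requested std.
flat = [w for row in weight for w in row]
print(f"empirical std: {statistics.pstdev(flat):.4f}")
```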

Usage

Loading the model

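import torch
from safetensors.torch import load_file
from modeling_innovation_ai_x import InnovationAIX, InnovationAIXConfig

config = InnovationAIXConfig()  # default hyperparameters
model = InnovationAIX(config)
# .safetensors files cannot be read with torch.load; use the safetensors loader
model.load_state_dict(load_file('model.safetensors'))
model.eval()  # disable dropout for inference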

Text generation

tokenizer = ...  # load your custom tokenizer
input_ids = tokenizer.encode("Hello, world!").ids
input_ids = torch.tensor([input_ids])

with torch.no_grad():
    outputs = model(input_ids=input_ids)
    logits = outputs['logits']
    # sample next token ...
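
One way to fill in the sampling step is temperature scaling with optional top-k filtering. The helper below is a hedged, framework-free sketch: `sample_next_token` is a hypothetical name, and it expects one row of logits as a plain list, for example `logits[0, -1].tolist()` if the model returns a `[batch, seq, vocab]` tensor (an assumption about the output shape).

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Sample a token id from one row of logits."""
    if temperature <= 0:
        # Treat a non-positive temperature as greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Rank token ids by logit and keep the top_k best if requested.
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        candidates = candidates[:top_k]
    # Softmax weights over the kept candidates (unnormalised is fine for choices).
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Toy logits over a 4-token vocabulary.
next_id = sample_next_token([0.1, 2.5, -1.0, 0.3], temperature=0.8, top_k=2)
greedy_id = sample_next_token([0.1, 2.5, -1.0, 0.3], temperature=0)
```

The sampled id would then be appended to `input_ids` and the model called again. With randomly initialised weights the output will not be coherent text.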

Multimodal input

outputs = model(
    input_ids=text_ids,
    pixel_values=image_tensor,
    audio_features=audio_tensor,
    video_frames=video_tensor
)

The model accepts any combination of modalities; empty inputs are simply omitted.

Configuration

Parameter	Value
vocab_size	32,000
d_model	640
num_heads	10
num_layers	12
intermediate_size	3,583 (auto‑adjusted)
num_recursion_steps	3
max_seq_len	64,000
memory_slots	64
memory_dim	256
num_goals	8
dropout	0.1
ntk_alpha (NTK factor)	4.0
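
A few derived quantities follow directly from these values. The dataclass below is a hypothetical mirror of the table for illustration only. It is not the repository's `InnovationAIXConfig`, whose actual fields and defaults may differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConfigSketch:
    """Hypothetical mirror of the configuration table above."""
    vocab_size: int = 32_000
    d_model: int = 640
    num_heads: int = 10
    num_layers: int = 12
    intermediate_size: int = 3_583
    num_recursion_steps: int = 3
    max_seq_len: int = 64_000
    memory_slots: int = 64
    memory_dim: int = 256
    num_goals: int = 8
    dropout: float = 0.1
    ntk_alpha: float = 4.0

    @property
    def head_dim(self) -> int:
        # Attention heads must split d_model evenly.
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads

config = ConfigSketch()
print(config.head_dim)                     # 64
print(config.vocab_size * config.d_model)  # 20480000 token-embedding weights
```

The token-embedding matrix alone accounts for roughly 20.5M of the ~111M parameters listed in the overview.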

Repository Structure

Innovation.AI.X/
├── model.safetensors            # Model weights
├── config.json                  # Model hyperparameters
├── generation_config.json       # Default generation settings
├── tokenizer.json               # Tokenizer data
├── tokenizer_config.json        # Tokenizer configuration
├── special_tokens_map.json      # Special token mapping
├── modeling_innovation_ai_x.py  # Full model source
└── README.md                    # This file

Intended Use & Limitations

Innovation.AI.X is a research artifact demonstrating the feasibility of a Recursive Language Model with a Hybrid Mind architecture. It is not instruction‑tuned and has not been trained on any corpus yet – the released weights are purely randomly initialised (Tabula Rasa). Researchers can use this model as a foundation for:

· Exploring recursive cognitive architectures
· Multimodal representation learning from scratch
· Reinforcement learning from self‑generated rewards
· Continual / lifelong learning experiments

As with any randomly initialised network, do not expect coherent language generation until it has been properly trained on a large dataset.

Citation

If you use Innovation.AI.X in your work, please cite:

@misc{InnovationAIX2026,
  author = {GODsStrongestSoldier},
  title = {Innovation.AI.X: A Recursive Language Model with Hybrid Mind Architecture},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/GODsStrongestSoldier/Innovation.AI.X}},
}

Acknowledgments

This model was built with PyTorch, Hugging Face tokenizers & huggingface_hub, safetensors, and accelerate. Special thanks to the Kaggle platform for providing Dual T4 GPU resources.

Innovation.AI.X – Not a Large Language Model. A Recursive Language Model.
