Innovation.AI.X – Recursive Language Model (RLM)

Innovation.AI.X is a Recursive Language Model with a Hybrid Mind architecture. It is not a traditional Large Language Model (LLM) but a single monolithic tensor shell where multiple Self‑Automated (S.A.) subsystems operate simultaneously on a shared latent state, performing recursive state evolution rather than simple next‑token prediction.


Model Overview

  • Parameters: 110,997,082 (~111M)
  • Context Window: 64,000 tokens (NTK‑aware Rotary Position Embeddings)
  • Training Regime: Tabula Rasa – all weights randomly initialised, no pretrained components
  • Multimodal Support: Text, Image, Audio, Video – all projected into a shared latent space
  • KV Cache: Incremental key‑value cache for efficient autoregressive generation
  • Mixed Precision: BF16 with Accelerate for Dual T4 GPU deployment

Architecture: The Hybrid Mind

The model operates as a unified cognitive state. Twenty Self‑Automated (S.A.) subsystems participate in every recursive cycle, all reading from and writing to the same shared_latent_state. No subsystem is a post‑processing step; each contributes to the evolving internal representation.

Subsystem Implementations

| Subsystem | Description | Implementation |
|---|---|---|
| S.A. Meta Learning | Task adaptation via hypernetwork‑generated FiLM layers | Hypernetwork → per‑layer scale & shift modulation |
| S.A. Reinforcement Learning | Actor‑Critic with reward prediction | Separate heads: actor, critic, reward_pred |
| S.A. Continual Learning | Elastic Weight Consolidation (EWC) | Buffered Fisher information & optimal parameters |
| S.A. Adaptive Learning | Context‑conditioned dynamic gating | Per‑layer sigmoid gate conditioned on latent mean |
| S.A. Rewriting Learning | Residual correction of latent representations | Bottleneck MLP applied with scaling factor |
| S.A. NLP | Semantic compression and language understanding | Bottleneck compress‑expand network |
| S.A. Problem Solving | Multi‑step reasoning with hidden scratchpad | GRU‑based recurrent workspace (no visible chain‑of‑thought) |
| S.A. Innovation | Controlled perturbation for novel ideas | Learnable Gaussian noise injection during training |
| S.A. Debugging | Consistency detection and anomaly repair | Confidence gate + corrective vector |
| S.A. Long/Short Term Memory | Differentiable read/write memory | DNC‑style memory with cosine‑based addressing |
| S.A. Recursive Seed Learning | Concept compression into compact latent seeds | Bottleneck encoder‑decoder (reconstruction loss) |
| S.A. Self Evaluation & Reward | Confidence and quality estimation | Twin heads: confidence, quality |
| S.A. Goal & Constraint Engine | Goal embedding maintenance | Learnable goal embeddings, injected via mean pool |
| S.A. Memory Consolidation | Transfer active memory to stable memory | Linear consolidation projection |
| S.A. Introspection Interface | Self‑observation and uncertainty estimation | Uncertainty head + Tanh‑activated observation layer |
| S.A. Recursive Outer Loop | Adaptive computation time / dynamic halting | Halt probability via sigmoid gate |
| S.A. Conversational Intelligence | Chatbot session memory and dialogue tracking | Persistent memory token + gated fusion |
| S.A. Tabula Rasa | Fresh random initialisation enforcement | Learnable freshness scale factor (active gradient path) |
| S.A. Emotional Intelligence | Emotion modelling and influence | 6 emotion prototypes + attention‑based mixing |
| S.A. Common Sense | Generalisable pattern extraction | Bottleneck projection that forces abstraction |
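
To make one row of the table concrete, the cosine‑based addressing listed for S.A. Long/Short Term Memory can be sketched in plain Python. This is a minimal illustration, not the code in modeling_innovation_ai_x.py: the function names, the sharpening parameter beta, and the toy memory contents are all assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)  # epsilon avoids division by zero

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_read(memory, key, beta=5.0):
    """Content-based read: weight each slot by softmax(beta * cosine(key, slot)).

    `beta` is a hypothetical sharpening factor. Returns (read_vector, weights).
    """
    weights = softmax([beta * cosine_similarity(key, slot) for slot in memory])
    dim = len(memory[0])
    read = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(dim)]
    return read, weights

# Toy memory: 3 slots of dimension 2 (the configuration lists 64 slots of dimension 256).
memory = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
read, weights = memory_read(memory, key=[0.9, 0.1])
```

The read vector is a weighted average of the slots, so every operation is differentiable with respect to both the key and the memory contents.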

Recursive Processing

The forward pass runs a loop of num_recursion_steps iterations (default 3).
At each step:

  1. All S.A. modules read the pooled latent mean and produce modifications.
  2. Modifications are accumulated (delta_sum) and applied to the shared state.
  3. The shared state passes through all transformer layers, modulated by FiLM parameters and adaptive gates.
  4. Memory is read/written, problem‑solving state updated, debugging corrections applied.

After the loop, the final state is layer‑normalised and projected to vocabulary logits.

This design allows the model to refine its internal representation recursively, using all cognitive subsystems in every iteration.
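
The control flow described in this section can be sketched as a toy loop in plain Python. This is an illustration under stated assumptions rather than the model's forward pass: the subsystems and the layer are hypothetical stand‑ins, the memory, problem‑solving and debugging updates are omitted, and only the names delta_sum and num_recursion_steps come from the description above.

```python
def mean_pool(state):
    """Average the token vectors of `state` (a list of equal-length lists)."""
    n = len(state)
    return [sum(tok[i] for tok in state) / n for i in range(len(state[0]))]

def recursive_forward(state, subsystems, layers, num_recursion_steps=3):
    """Run the read -> accumulate -> apply -> transform cycle a fixed number of times."""
    for _ in range(num_recursion_steps):
        pooled = mean_pool(state)                  # shared summary read by every module
        delta_sum = [0.0] * len(pooled)
        for subsystem in subsystems:               # accumulate each module's proposal
            delta = subsystem(pooled)
            delta_sum = [d + x for d, x in zip(delta_sum, delta)]
        # apply the accumulated modification to every position of the shared state
        state = [[t + d for t, d in zip(tok, delta_sum)] for tok in state]
        for layer in layers:                       # stand-in for the transformer stack
            state = layer(state)
    return state

# Hypothetical toy subsystems and layer, for illustration only.
def shrink(pooled):
    return [-0.1 * p for p in pooled]              # pulls the state toward zero

def nudge(pooled):
    return [0.01 for _ in pooled]                  # constant offset

def identity_layer(state):
    return state

out = recursive_forward([[1.0, 2.0], [3.0, 4.0]], [shrink, nudge], [identity_layer])
```

Because every subsystem sees the same pooled vector and writes into the same delta_sum, no module runs as a separate post‑processing stage.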


Multimodal Fusion

The model accepts four modalities:

  • Text: Tokenised input → embedding + text modality tag
  • Image: RGB image → convolutional patch embedding + image modality tag
  • Audio: Mel‑spectrogram → 1D convolutional embedding + audio modality tag
  • Video: Video clip → 3D convolutional tubelet embedding + video modality tag

All embeddings are concatenated into a single sequence and processed jointly.
Modality‑type embeddings are learned and added to distinguish input sources.
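
As a rough sketch of this fusion step, the snippet below adds a modality tag to each embedding and concatenates the results into one sequence. The tag values, the fixed text/image/audio/video order, and the function name fuse are assumptions for illustration; the real embeddings come from the learned projections listed above.

```python
# Hypothetical modality-type embeddings (dimension 2 here; learned in the real model).
MODALITY_TAGS = {
    "text": [0.1, 0.0],
    "image": [0.0, 0.1],
    "audio": [-0.1, 0.0],
    "video": [0.0, -0.1],
}

def fuse(inputs):
    """Add each modality's tag to its embeddings and concatenate along the sequence axis.

    `inputs` maps a modality name to a list of embedding vectors, or to None when absent.
    """
    sequence = []
    for name in ("text", "image", "audio", "video"):   # assumed concatenation order
        embeddings = inputs.get(name)
        if not embeddings:
            continue                                    # missing modalities are skipped
        tag = MODALITY_TAGS[name]
        sequence.extend([e + t for e, t in zip(vec, tag)] for vec in embeddings)
    return sequence

fused = fuse({"text": [[1.0, 1.0], [2.0, 2.0]], "image": [[0.5, 0.5]], "audio": None})
```

Here two text positions and one image position produce a joint sequence of length three, which is what the transformer layers would then process together.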


Context Window: 64K Tokens with NTK‑RoPE

Standard RoPE degrades on sequences longer than those seen in training, and plain position interpolation compresses the high‑frequency components that encode local token order.
Innovation.AI.X instead uses NTK‑aware scaling (α=4.0), which adjusts the rotary base frequency:


scaled_theta = theta * (alpha ** (dim / (dim - 2)))

This stretches the low‑frequency end of the spectrum while leaving the highest frequencies almost unchanged, with the aim of preserving both local and global attention quality up to 64,000 tokens.
An incremental KV cache is implemented to support efficient autoregressive generation.
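
The effect of the formula can be checked numerically. The sketch below assumes a base theta of 10000 (the model card does not state it) and takes dim to be the per‑head dimension d_model / num_heads = 64; both are assumptions, as are the helper names.

```python
def ntk_scaled_theta(theta, alpha, dim):
    """Apply the base-frequency scaling quoted above."""
    return theta * (alpha ** (dim / (dim - 2)))

def rope_inverse_frequencies(theta, dim):
    """Per-pair rotary frequencies: theta ** (-2i / dim) for i = 0 .. dim/2 - 1."""
    return [theta ** (-2 * i / dim) for i in range(dim // 2)]

head_dim = 640 // 10      # d_model / num_heads from the configuration
base_theta = 10000.0      # assumed base; not stated in the model card
scaled = ntk_scaled_theta(base_theta, alpha=4.0, dim=head_dim)

plain = rope_inverse_frequencies(base_theta, head_dim)
stretched = rope_inverse_frequencies(scaled, head_dim)

# The highest frequency (i = 0) is untouched; the lowest is slowed by a factor of alpha.
ratio_highest = stretched[0] / plain[0]
ratio_lowest = stretched[-1] / plain[-1]
```

The exponent dim / (dim - 2) is what makes the lowest frequency scale by exactly 1/alpha, so the longest wavelength covers alpha times more positions while neighbouring tokens remain distinguishable.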


Training Loss Function

The composite loss drives the entire Hybrid Mind:

loss_total = loss_lm + 0.2*loss_seed + 0.1*loss_rl + 100.0*loss_ewc

  • loss_lm: Standard causal language modelling cross‑entropy
  • loss_seed: MSE between the seed reconstruction and the latent mean (encourages concept compression)
  • loss_rl: MSE between predicted reward and a bootstrap target (currently 1.0 for bootstrapping)
  • loss_ewc: Elastic weight consolidation penalty (Fisher‑weighted deviation from optimal parameters)

All components are computed from a single forward pass, enabling end‑to‑end training.
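
A small worked example of the weighting, in plain Python. The coefficients come from the formula above; the EWC penalty is written as a plain Fisher‑weighted squared deviation, and whether the model applies an extra factor such as 1/2 is not stated, so treat that form, the function names, and the input numbers as assumptions.

```python
def ewc_penalty(params, optimal_params, fisher):
    """Fisher-weighted squared deviation from the stored optimal parameters."""
    return sum(f * (p - p_star) ** 2
               for p, p_star, f in zip(params, optimal_params, fisher))

def total_loss(loss_lm, loss_seed, loss_rl, loss_ewc):
    """Weighted sum using the coefficients quoted above."""
    return loss_lm + 0.2 * loss_seed + 0.1 * loss_rl + 100.0 * loss_ewc

# Made-up values, chosen only to show the relative scale of each term.
loss_ewc = ewc_penalty(params=[0.5, -0.2], optimal_params=[0.4, -0.2], fisher=[0.01, 0.03])
loss = total_loss(loss_lm=10.4, loss_seed=0.5, loss_rl=0.25, loss_ewc=loss_ewc)
```

With these inputs the EWC term is about 1e-4 before weighting and 0.01 after, which shows why a coefficient as large as 100.0 is needed for it to register next to the language modelling loss.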


Tabula Rasa Initialisation

Every weight in the model is initialised from a normal distribution (std=0.02), and every bias is initialised to zero. The SA_TabulaRasa module contains a learnable parameter fresh (initialised to 1.0) that is always part of the computational graph. The model starts from randomness alone: no pretrained checkpoints, no inherited biases.
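
The initialisation rule can be illustrated without any framework. The helper below is a hypothetical stand‑in (the real model presumably uses PyTorch initialisers); it draws weights from N(0, 0.02²) and sets biases to zero for a single dense layer.

```python
import random
import statistics

def init_linear(in_features, out_features, std=0.02, seed=None):
    """Return (weight, bias) for a dense layer: weights ~ N(0, std^2), biases = 0."""
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, std) for _ in range(in_features)]
              for _ in range(out_features)]
    bias = [0.0] * out_features
    return weight, bias

weight, bias = init_linear(in_features=64, out_features=64, seed=0)
flat = [w for row in weight for w in row]
empirical_std = statistics.pstdev(flat)   # should land close to 0.02
```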


Usage

Loading the model

import torch
from safetensors.torch import load_file

from modeling_innovation_ai_x import InnovationAIX, InnovationAIXConfig

config = InnovationAIXConfig()
model = InnovationAIX(config)
# model.safetensors is a safetensors file, so read it with load_file rather than torch.load
model.load_state_dict(load_file('model.safetensors'))
model.eval()

Text generation

tokenizer = ...  # load your custom tokenizer
input_ids = tokenizer.encode("Hello, world!").ids
input_ids = torch.tensor([input_ids])

with torch.no_grad():
    outputs = model(input_ids=input_ids)
    logits = outputs['logits']
    # sample next token ...
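
One way the sampling step might look is temperature‑scaled softmax sampling over the logits of the last position. The sketch below uses only the standard library so it runs without the model; the function name, the temperature values, and the assumption that logits has shape (batch, sequence, vocab) are illustrative and not taken from the model source.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=random):
    """Sample a token id from one position's logits using temperature-scaled softmax."""
    if temperature <= 0:
        # Greedy fallback: pick the index of the largest logit.
        return max(range(len(logits)), key=logits.__getitem__)
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    weights = [math.exp(x - m) for x in scaled]   # unnormalised probabilities
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Stand-in for logits[0, -1].tolist(), assuming a (batch, sequence, vocab) layout.
last_position_logits = [0.1, 2.5, -1.0, 0.3]
greedy_id = sample_next_token(last_position_logits, temperature=0.0)
sampled_id = sample_next_token(last_position_logits, temperature=0.8, rng=random.Random(0))
```

The chosen id would then be appended to input_ids and the forward pass repeated, with the KV cache avoiding recomputation of earlier positions.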

Multimodal input

outputs = model(
    input_ids=text_ids,
    pixel_values=image_tensor,
    audio_features=audio_tensor,
    video_frames=video_tensor
)

The model accepts any combination of modalities; empty inputs are simply omitted.


Configuration

| Parameter | Value |
|---|---|
| vocab_size | 32,000 |
| d_model | 640 |
| num_heads | 10 |
| num_layers | 12 |
| intermediate_size | 3,583 (auto‑adjusted) |
| num_recursion_steps | 3 |
| max_seq_len | 64,000 |
| memory_slots | 64 |
| memory_dim | 256 |
| num_goals | 8 |
| dropout | 0.1 |
| ntk_alpha (NTK factor) | 4.0 |
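
For quick sanity checks, the values above can be mirrored in a small dataclass. HybridMindConfigSketch is a hypothetical name used only here; the actual InnovationAIXConfig may use different field names or defaults.

```python
from dataclasses import dataclass

@dataclass
class HybridMindConfigSketch:
    """Illustrative mirror of the configuration values; not the real config class."""
    vocab_size: int = 32_000
    d_model: int = 640
    num_heads: int = 10
    num_layers: int = 12
    intermediate_size: int = 3_583
    num_recursion_steps: int = 3
    max_seq_len: int = 64_000
    memory_slots: int = 64
    memory_dim: int = 256
    num_goals: int = 8
    dropout: float = 0.1
    ntk_alpha: float = 4.0

    @property
    def head_dim(self) -> int:
        """Per-head dimension; d_model must divide evenly across the heads."""
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads

cfg = HybridMindConfigSketch()
head_dim = cfg.head_dim
```

With d_model = 640 and num_heads = 10, each attention head works on 64 dimensions.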


Repository Structure

Innovation.AI.X/
├── model.safetensors            # Model weights
├── config.json                  # Model hyperparameters
├── generation_config.json       # Default generation settings
├── tokenizer.json               # Tokenizer data
├── tokenizer_config.json        # Tokenizer configuration
├── special_tokens_map.json      # Special token mapping
├── modeling_innovation_ai_x.py  # Full model source
└── README.md                    # This file

Intended Use & Limitations

Innovation.AI.X is a research artifact demonstrating the feasibility of a Recursive Language Model with a Hybrid Mind architecture. It is not instruction‑tuned and has not been trained on any corpus yet – the released weights are purely randomly initialised (Tabula Rasa). Researchers can use this model as a foundation for:

  • Exploring recursive cognitive architectures
  • Multimodal representation learning from scratch
  • Reinforcement learning from self‑generated rewards
  • Continual / lifelong learning experiments

As with any randomly initialised network, do not expect coherent language generation until it has been properly trained on a large dataset.


Citation

If you use Innovation.AI.X in your work, please cite:

@misc{InnovationAIX2026,
  author = {GODsStrongestSoldier},
  title = {Innovation.AI.X: A Recursive Language Model with Hybrid Mind Architecture},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/GODsStrongestSoldier/Innovation.AI.X}},
}

Acknowledgments

This model was built with PyTorch, Hugging Face tokenizers & huggingface_hub, safetensors, and accelerate. Special thanks to the Kaggle platform for providing Dual T4 GPU resources.


Innovation.AI.X – Not a Large Language Model. A Recursive Language Model.

