ROOT: Robust Orthogonalized Optimizer for Neural Network Training Paper • 2511.20626 • Published Nov 25 • 42
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting Paper • 2404.18911 • Published Apr 29, 2024 • 30
DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models Paper • 2403.00818 • Published Feb 26, 2024 • 19
Model Memory Utility: Calculate vRAM needed for model training and inference • Running on CPU Upgrade • Featured • 992