NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models
Paper • 2602.06694 • Published • 15