arxiv:2605.21171

FTerViT: Fully Ternary Vision Transformer

Published on May 20

Abstract

AI-generated summary

Full ternarization of Vision Transformers achieves significant model compression with enhanced deployment efficiency on resource-constrained devices through novel operators and quantization-aware training.

Ternary Vision Transformers offer substantial model compression; however, state-of-the-art methods ternarize only the encoder layers, leaving patch embeddings, LayerNorm parameters, and classifier heads in full precision. In compact models targeting resource-constrained processors, such as microcontrollers, these remaining full-precision components determine the total memory footprint, severely limiting deployment efficiency and on-device feasibility. In this work, we introduce FTerViT, a fully ternarized Vision Transformer in which all weight matrices and normalization parameters are ternarized. To this end, we introduce two novel operators: TernaryBitConv2d, with per-channel scaling, for patch embedding, and TernaryLayerNorm. FTerViT is trained using knowledge distillation, followed by a lightweight quantization-aware recovery phase. Our ternary W2A8 DeiT-III-S at 384×384 resolution achieves 82.43% ImageNet-1K top-1 accuracy at 6.09 MB (~15× compression, -2.42 pp vs. FP32), outperforming prior ternary ViT methods by up to 8 pp. Finally, we demonstrate the first implementation of ternary vision transformers on the dual-core Xtensa LX7 microcontroller inside the ESP32-S3 system-on-chip. By deploying FTerViT-Small (based on DeiT-III-Small at 224×224 resolution, 5.81 MB), we achieve 79.64% ImageNet-1K top-1 accuracy.
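To make the idea of ternary weights with per-channel scaling concrete, here is a minimal sketch in plain Python. It is not the paper's TernaryBitConv2d. It assumes an absmean scaling rule (one scale per output channel, equal to the mean absolute weight of that channel), which is a common choice in ternary networks but is not stated in the abstract. The function names ternarize_per_channel, dequantize, and packed_size_bytes are hypothetical, and the 2-bit packing estimate simply follows from the W2 in "W2A8".

```python
from typing import List, Tuple


def ternarize_per_channel(
    weights: List[List[float]], eps: float = 1e-8
) -> Tuple[List[List[int]], List[float]]:
    """Map each row (one output channel) to codes in {-1, 0, +1} plus one scale.

    ASSUMPTION: absmean scaling, i.e. scale = mean(|w|) over the row.
    The paper's actual quantizer may use a different rule.
    """
    codes: List[List[int]] = []
    scales: List[float] = []
    for row in weights:
        # One scale per output channel; eps guards against all-zero rows.
        scale = max(sum(abs(w) for w in row) / len(row), eps)
        # Round to the nearest integer, then clamp into the ternary range.
        codes.append([max(-1, min(1, round(w / scale))) for w in row])
        scales.append(scale)
    return codes, scales


def dequantize(codes: List[List[int]], scales: List[float]) -> List[List[float]]:
    """Reconstruct approximate weights: code * per-channel scale."""
    return [[c * s for c in row] for row, s in zip(codes, scales)]


def packed_size_bytes(num_weights: int, bits_per_weight: int = 2) -> int:
    """Storage for the codes alone, assuming dense 2-bit packing (scales excluded)."""
    return (num_weights * bits_per_weight + 7) // 8
```

Calling ternarize_per_channel on a small matrix returns the integer codes and one floating-point scale per row; dequantize shows how much precision is lost. Because each code needs at most 2 bits instead of 32, the codes alone shrink by 16×, which is in the same range as the roughly 15× compression reported above once scales and other overhead are included.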
