Dual PatchNorm from Google – Two LayerNorms before and after the patch embedding layer in Vision Transformers

We propose Dual PatchNorm: two Layer Normalization layers (LayerNorms), before and after the patch embedding layer in Vision Transformers. We demonstrate that Dual PatchNorm outperforms the result of an exhaustive search over alternative LayerNorm placement strategies in the Transformer block itself. In our experiments, incorporating this trivial modification often leads to improved accuracy over well-tuned Vision Transformers and never hurts.
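To make the idea concrete, here is a minimal PyTorch sketch of a patch-embedding stem with the two extra LayerNorms: one applied to the flattened raw patches before the linear projection, and one applied to the resulting patch embeddings. The class name `DualPatchNormEmbed` and all hyperparameters are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class DualPatchNormEmbed(nn.Module):
    """Patch embedding with a LayerNorm before and after the projection.

    A sketch of the Dual PatchNorm idea; naming and defaults are
    illustrative, not the paper's reference code.
    """
    def __init__(self, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        patch_dim = patch_size * patch_size * in_chans
        self.patch_size = patch_size
        self.norm_pre = nn.LayerNorm(patch_dim)      # LayerNorm on raw patch pixels
        self.proj = nn.Linear(patch_dim, embed_dim)  # the patch embedding layer
        self.norm_post = nn.LayerNorm(embed_dim)     # LayerNorm on patch embeddings

    def forward(self, x):
        # x: (batch, channels, height, width) -> non-overlapping patches
        B, C, H, W = x.shape
        p = self.patch_size
        x = x.unfold(2, p, p).unfold(3, p, p)          # (B, C, H/p, W/p, p, p)
        x = x.permute(0, 2, 3, 4, 5, 1).reshape(B, -1, p * p * C)
        x = self.norm_pre(x)    # first LayerNorm: before patch embedding
        x = self.proj(x)        # linear patch projection
        x = self.norm_post(x)   # second LayerNorm: after patch embedding
        return x

tokens = DualPatchNormEmbed()(torch.randn(2, 3, 224, 224))
print(tokens.shape)  # torch.Size([2, 196, 768])
```

The rest of the Transformer block is left unchanged; the two normalizations touch only the stem, which is why the modification is cheap to try on an existing ViT.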

Pricing: Free
Trial available? No
dualpatchnorm

