Elastic Attention Cores for Scalable Vision Transformers [R]

By AI Maestro May 13, 2026 1 min read


  • A British research team has introduced a new architectural approach to Vision Transformers (ViTs): an alternative backbone built on a core-periphery block-sparse attention structure. This design mitigates the O(N²) cost of dense self-attention, reducing it to roughly O(2NC + C²) for C core tokens.
  • The proposed method, named Elastic Attention Cores (EAC), employs nested dropout during training so that the number of cores can be adjusted dynamically at test time. This flexibility allows a trade-off between inference cost and model performance across resolutions from 256×256 up to 1024×1024.

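To make the core-periphery idea concrete, here is a minimal sketch of one plausible block-sparse attention mask: core tokens attend to every token, while periphery tokens attend only to the cores (plus themselves). The exact masking pattern and complexity accounting in the paper may differ; this is only an illustration of why the cost grows like O(NC) rather than O(N²).

```python
import numpy as np

def core_periphery_mask(n_tokens: int, n_cores: int) -> np.ndarray:
    """Boolean attention mask where the first `n_cores` tokens are cores.

    Cores attend to all N tokens; periphery tokens attend only to the
    C cores (and to themselves, so every row stays non-empty).
    """
    mask = np.zeros((n_tokens, n_tokens), dtype=bool)
    mask[:n_cores, :] = True      # core rows: attend to every token
    mask[:, :n_cores] = True      # every row: attend to the C cores
    np.fill_diagonal(mask, True)  # periphery self-attention
    return mask

N, C = 256, 16
mask = core_periphery_mask(N, C)
dense_pairs = N * N                    # dense self-attention: N^2 pairs
sparse_pairs = int(mask.sum())         # block-sparse: ~2NC pairs
print(f"dense: {dense_pairs}, core-periphery: {sparse_pairs}")
```

For this particular mask the attended-pair count is 2NC − C² + (N − C), i.e. roughly an 8× reduction at N = 256, C = 16, and the gap widens as the resolution (and hence N) grows.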

### Takeaways:
– **Efficiency Gain**: The EAC architecture reduces computational complexity to roughly O(2NC + C²) in the number of core tokens C, instead of the O(N²) cost of traditional dense self-attention.
– **Scalability**: The model maintains accuracy across resolutions from 256×256 up to 1024×1024 without significant degradation.
– **Dynamic Adjustments**: The use of nested dropout allows for flexible control over the number of core tokens during inference, enabling efficient trade-offs between computation and performance.
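The nested-dropout mechanism in the last takeaway can be sketched as follows. During training, a truncation point k is sampled and only the first k cores are kept; because dropping core i always drops every core after it, the model learns to order cores by importance, so at test time any prefix of cores yields a valid (cheaper) model. The sampling distribution and training details here are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

def nested_dropout_keep(n_cores: int) -> np.ndarray:
    """Sample a nested-dropout mask over an ordered set of core tokens.

    Draws k uniformly from {1, ..., n_cores} and keeps cores 0..k-1.
    A prefix mask (never a random subset) is what forces an
    importance ordering over the cores.
    """
    k = int(rng.integers(1, n_cores + 1))
    keep = np.zeros(n_cores, dtype=bool)
    keep[:k] = True
    return keep

def truncate_cores(core_tokens: np.ndarray, k: int) -> np.ndarray:
    """Test-time elastic inference: keep only the first k cores."""
    return core_tokens[:k]

cores = rng.normal(size=(16, 64))     # 16 core tokens, embedding dim 64
train_mask = nested_dropout_keep(16)  # prefix mask applied during training
cheap = truncate_cores(cores, 4)      # low-cost inference with 4 cores
print(train_mask.sum(), cheap.shape)
```

Because any prefix of cores was seen during training, the same weights serve the whole cost/accuracy curve: pick a small k for cheap inference at 256×256, a larger k at 1024×1024.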


Originally published at reddit.com. Curated by AI Maestro.
