Attention is not all you need: pure attention loses rank doubly exponentially with depth