ALiBi

ALiBi, short for Attention with Linear Biases, is a positional encoding technique introduced by Press, Smith, and Lewis in the 2021 paper "Train Short, Test Long". Instead of adding position embeddings to the input or modifying queries and keys, ALiBi adds a penalty directly to the attention scores that grows linearly with the distance between the query and key positions, scaled by a fixed per-head slope coefficient. Its most notable property is strong length extrapolation: a model trained on 1024-token sequences can produce coherent output at 16K or 32K tokens with no fine-tuning, far better than models using learned absolute position embeddings, which degrade sharply beyond their training length. The technique is used in BLOOM (BigScience's 176B multilingual model), MPT (MosaicML), and Replit Code. ALiBi has been somewhat supplanted by RoPE in newer frontier models, because RoPE combined with context-extension techniques such as YaRN achieves longer effective context lengths, but ALiBi remains in active use, particularly in research and specialized deployments. AI governance teams document the positional encoding choice in model architecture lineage because it affects long-context behavior.
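A minimal sketch of the mechanism in PyTorch is below. The function names are illustrative rather than taken from the paper's reference code, the slope formula matches the paper's geometric scheme only for power-of-two head counts, and q, k, v are assumed to be standard (batch, heads, seq, head_dim) tensors:

import math

import torch


def alibi_slopes(num_heads: int) -> torch.Tensor:
    # Per-head slopes form a geometric sequence; for 8 heads this gives
    # 1/2, 1/4, ..., 1/256. (The paper's exact scheme assumes a power-of-two
    # head count; other counts use an interpolation rule not shown here.)
    start = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([start ** (i + 1) for i in range(num_heads)])


def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # relative[i, j] = j - i, so a key j positions behind query i receives
    # the value -(i - j); multiplying by each head's slope gives a penalty
    # that grows linearly with query-key distance.
    positions = torch.arange(seq_len)
    relative = positions[None, :] - positions[:, None]        # (seq, seq)
    slopes = alibi_slopes(num_heads)                          # (heads,)
    return slopes[:, None, None] * relative[None, :, :]       # (heads, seq, seq)


def causal_attention_with_alibi(q, k, v):
    # q, k, v: (batch, heads, seq, head_dim). No positional embeddings are
    # added anywhere else; the bias on the scores is the only position signal.
    _, heads, seq, dim = q.shape
    scores = q @ k.transpose(-2, -1) / math.sqrt(dim)         # (batch, heads, seq, seq)
    scores = scores + alibi_bias(heads, seq).to(scores.dtype)
    mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

Because the bias depends only on relative distance, the same computation applies unchanged to any sequence length at inference, which is what gives ALiBi its extrapolation behavior.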

ALiBi-based models with Centralpoint: Centralpoint operates as a model-agnostic platform above both ALiBi-based and RoPE-based models. Tokens are metered per skill, prompts stay local, and chatbots deploy through one line of JavaScript on any portal with audit-ready governance.

