Adapter Layers
Adapter layers are a parameter-efficient fine-tuning (PEFT) technique introduced by Houlsby et al. in 2019 that inserts small bottleneck feed-forward modules into each layer of a pretrained transformer (typically after the attention and feed-forward sublayers), training only the adapters while keeping the base model frozen. Each adapter uses a down-projection, nonlinearity, and up-projection structure that typically adds roughly 1% to 5% of the base model's parameters per task. The original adapter formulation predates LoRA by two years and helped establish the PEFT paradigm.

Modern adapter variants include AdapterFusion (which combines multiple adapters at inference), Compacter (a more parameter-efficient adapter formulation), and IA3 (which uses learned scaling vectors instead of bottleneck modules). Tools including AdapterHub and the Hugging Face PEFT library support adapter layers as one option among many PEFT methods. AI governance teams favor adapters for multi-task scenarios where many task-specific adapters share one base model, enabling fast task switching at inference. LoRA has largely displaced classical adapters in 2023-2025 production workflows, but adapters remain in active use in research and specialized deployments.
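To make the bottleneck structure concrete, here is a minimal PyTorch sketch of a single Houlsby-style adapter. The class name, dimension choices, and zero-initialization are illustrative assumptions, not taken from any particular library:

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    # Down-project, nonlinearity, up-project, with a residual connection.
    # With hidden_dim=768 and bottleneck_dim=64 this is about 99k trainable
    # parameters, a small fraction of the frozen base model.
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        # Zero-init the up-projection so the adapter starts as an identity
        # map and training begins from the base model's behavior.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

In practice a module like this sits after the frozen attention and feed-forward sublayers, and only its parameters receive gradients during fine-tuning.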
Adapter-routed inference through Centralpoint: Centralpoint coordinates inference across multiple task-specific adapters sharing one base model, all in a model-agnostic stack. Tokens are metered per adapter and audience, prompts stay local, and adapter-aware chatbots embed across portals with one line of JavaScript.
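Centralpoint's internals are not public, but the fast task-switching pattern described above can be sketched generically: a frozen base shared across tasks, with a per-task adapter selected at inference time. Everything below (names, shapes, tasks) is a hypothetical illustration, not Centralpoint's actual API:

import torch
import torch.nn as nn

class RoutedBlock(nn.Module):
    # One frozen base layer shared by several tasks; each task owns a
    # small bottleneck adapter, so switching tasks is a dictionary lookup.
    def __init__(self, hidden_dim: int, tasks: list[str], bottleneck_dim: int = 64):
        super().__init__()
        self.base = nn.Linear(hidden_dim, hidden_dim)
        self.base.requires_grad_(False)  # shared base weights stay frozen
        self.adapters = nn.ModuleDict({
            task: nn.Sequential(
                nn.Linear(hidden_dim, bottleneck_dim),
                nn.GELU(),
                nn.Linear(bottleneck_dim, hidden_dim),
            )
            for task in tasks
        })

    def forward(self, x: torch.Tensor, task: str) -> torch.Tensor:
        h = self.base(x)
        return h + self.adapters[task](h)  # residual adapter for this task

block = RoutedBlock(hidden_dim=768, tasks=["summarize", "classify"])
out = block(torch.randn(1, 768), task="summarize")

Because only the small adapter differs between tasks, the expensive base weights are loaded once and switching tasks costs a lookup rather than a model reload.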