Prompt Tuning

Prompt tuning, also called soft prompt tuning, is a PEFT technique introduced by Lester, Al-Rfou, and Constant (2021) that prepends a short sequence of learned continuous vectors directly to the input embeddings of a frozen LLM. Unlike prefix tuning, which operates at every attention layer, prompt tuning operates only at the input layer, making it the simplest and most parameter-efficient PEFT method — typically training just a few thousand parameters in total.

The technique works surprisingly well at very large model scales (10B+ parameters), where the rich pretrained representations can absorb the soft prompt as effective task conditioning. At smaller scales, prompt tuning often underperforms LoRA and other higher-capacity PEFT methods. Prompt tuning is also distinct from hard prompts (natural-language instructions): the soft prompt vectors don't correspond to any token in the vocabulary and exist purely in continuous space.

AI governance teams document prompt-tuning artifacts as part of their adapter inventory. The technique remains useful for very large models and for constrained deployment scenarios where every trainable parameter matters.
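The mechanics above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' implementation: a trainable matrix of "virtual token" embeddings is prepended to a frozen embedding layer's output, so only the soft prompt receives gradient updates. The class name, sizes, and initialization-from-vocabulary choice are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):
    """Prepends trainable soft-prompt vectors to a frozen model's input embeddings.
    (Illustrative sketch; names and sizes are hypothetical.)"""

    def __init__(self, embed: nn.Embedding, num_virtual_tokens: int = 20):
        super().__init__()
        self.embed = embed
        for p in self.embed.parameters():
            p.requires_grad = False  # freeze the base model's embeddings

        # Common practice: initialize the soft prompt from random vocab embeddings
        init_ids = torch.randint(0, embed.num_embeddings, (num_virtual_tokens,))
        self.soft_prompt = nn.Parameter(embed(init_ids).detach().clone())

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.embed(input_ids)                       # (batch, seq, dim)
        prompt = self.soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        return torch.cat([prompt, tok], dim=1)            # (batch, prompt+seq, dim)

# Usage: with 8 virtual tokens of dimension 64, only 512 parameters are trainable
embed = nn.Embedding(1000, 64)
wrapper = SoftPromptWrapper(embed, num_virtual_tokens=8)
out = wrapper(torch.randint(0, 1000, (2, 10)))
print(out.shape)  # torch.Size([2, 18, 64])
trainable = sum(p.numel() for p in wrapper.parameters() if p.requires_grad)
print(trainable)  # 512
```

In a full setup the concatenated embeddings would be passed to the frozen transformer via its `inputs_embeds` path, and the optimizer would be given only `wrapper.soft_prompt`.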

Soft prompt tuning in Centralpoint: Centralpoint coordinates soft-prompt-tuned models alongside hard prompts from its Prompt Manager, all under one model-agnostic governance layer. Tokens are metered per skill and audience, prompts stay local, and tuned-model chatbots deploy through one line of JavaScript across portals.


Related Keywords:
Prompt Tuning