BOS Token
The BOS token (Beginning Of Sequence) is a special token that marks the start of model input, signaling to the model that what follows begins a new generation context rather than continuing an existing one. Many LLMs require or expect a BOS token at the start of every input; without it, model behavior can degrade subtly, producing lower-quality completions or treating the input as a mid-generation continuation. Different models use different BOS tokens: GPT-family models use <|endoftext|> or no explicit BOS, Llama uses <s> (Llama 3 uses <|begin_of_text|>), Claude uses internal markers, and BERT-family encoder models use [CLS]. Many chat-tuned models combine BOS with role markers to demarcate the start of a conversation. AI governance teams document BOS token usage as part of prompt template specifications, because missing or extra BOS tokens are a common cause of silent quality regressions when migrating between model versions. Most model SDKs handle BOS automatically through chat templates, hiding this complexity from application developers.
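The per-model differences and the missing/extra-BOS failure mode described above can be sketched as a small helper. This is an illustrative sketch, not any SDK's real API: the `BOS_TOKENS` table holds the token strings named in the text, and the hypothetical `apply_bos` function prepends exactly one BOS, guarding against the double-BOS case.

```python
# Hypothetical helper: prepend the correct BOS token for a model family
# and guard against the "extra BOS" regression described above.
BOS_TOKENS = {
    "gpt": "<|endoftext|>",   # GPT family (some variants use no explicit BOS)
    "llama": "<s>",           # Llama 1/2; Llama 3 uses <|begin_of_text|>
    "bert": "[CLS]",          # BERT-family encoder models
}

def apply_bos(prompt: str, family: str) -> str:
    """Return the prompt with exactly one leading BOS token for the family."""
    bos = BOS_TOKENS.get(family)
    if bos is None:
        return prompt          # unknown family: leave the prompt untouched
    if prompt.startswith(bos):
        return prompt          # already has BOS; avoid doubling it
    return bos + prompt

print(apply_bos("Hello", "llama"))      # <s>Hello
print(apply_bos("<s>Hello", "llama"))   # unchanged: <s>Hello
```

In real applications this bookkeeping is exactly what a model's chat template does; the sketch only makes the silent-failure modes (no BOS, or two) explicit.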
BOS handling in Centralpoint: Centralpoint's Prompt Manager applies the correct BOS token automatically for whichever model a skill routes to (OpenAI, Anthropic, Gemini, Llama, or embedded), eliminating manual template management. Prompts stay local, tokens are metered, and template-aware chatbots embed through one line of JavaScript across portals.
Related Keywords:
BOS Token