Chunk Size
Chunk size is the target length of each document chunk, measured in tokens, characters, or sentences depending on the chunker. Production RAG systems typically use chunk sizes in the 200 to 1,000 token range; 512 tokens is a common default because it fits comfortably within older embedding model context limits while preserving enough context for meaningful comparison. Smaller chunks (100-300 tokens) increase retrieval precision, since each chunk is highly focused, at the cost of context completeness: fragmented information may require retrieving multiple chunks. Larger chunks (500-1,500 tokens) preserve context but dilute retrieval precision and may exceed embedding model context windows.
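As a concrete illustration of how chunk size and overlap drive chunking, here is a minimal token-based chunker sketch. It assumes OpenAI's tiktoken tokenizer with the cl100k_base encoding; the function name chunk_text and the default values are illustrative, not a reference implementation.

```python
# Minimal sketch of fixed-size token chunking with overlap.
# Assumes the tiktoken library; chunk_text and its defaults are illustrative.
import tiktoken

def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 64) -> list[str]:
    """Split text into chunks of at most chunk_size tokens, repeating
    chunk_overlap tokens between consecutive chunks."""
    assert 0 <= chunk_overlap < chunk_size
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(enc.decode(window))
        if start + chunk_size >= len(tokens):
            break  # final window already covers the tail of the document
    return chunks
```

Lowering chunk_size to 256 makes each chunk more focused; raising it to 1,024 keeps more surrounding context per chunk but risks exceeding older models' context windows.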
Newer embedding models such as Jina v3, BGE-M3, and OpenAI's text-embedding-3-large support 8,192-token contexts, allowing chunks an order of magnitude larger than older models permit. AI governance teams document chunk size as a foundational pipeline parameter and validate retrieval quality after any change, because chunk size interacts subtly with chunk overlap, retrieval top-k, and the LLM context budget. Empirical optimization against Recall@k benchmarks is the standard validation approach.
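To make the validation step concrete, here is a minimal Recall@k sketch under assumed inputs: a mapping from each query to its ranked list of retrieved chunk IDs, and a mapping from each query to its gold-relevant chunk IDs. The function name and data shapes are illustrative, not a specific benchmark's API.

```python
# Minimal sketch of Recall@k for validating a chunk size change.
# The input shapes (query -> ranked retrieved IDs, query -> gold IDs)
# are assumptions for illustration.
def recall_at_k(
    retrieved: dict[str, list[str]],
    relevant: dict[str, set[str]],
    k: int = 5,
) -> float:
    """Average fraction of gold-relevant chunks found in the top-k results."""
    scores = []
    for query, gold in relevant.items():
        if not gold:
            continue  # skip queries with no labeled relevant chunks
        top_k = set(retrieved.get(query, [])[:k])
        scores.append(len(top_k & gold) / len(gold))
    return sum(scores) / len(scores) if scores else 0.0
```

Running the same benchmark before and after a chunk size change, with the same queries and labels, isolates the effect of the chunking parameter on retrieval quality.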
Chunk size tuning in Centralpoint: Centralpoint logs retrieval-plus-generation outcomes per skill so administrators can validate chunk size choices against actual production traffic. The model-agnostic platform routes to OpenAI, Anthropic, Gemini, or Llama models, meters tokens, keeps prompts local, and deploys retrieval-augmented chatbots through one line of JavaScript.