Sliding Window

Sliding window is a chunking strategy that produces a sequence of overlapping chunks by sliding a fixed-size window across the document with a fixed stride smaller than the window size. For example, a 500-token window with a 250-token stride produces chunks at positions 0-500, 250-750, 500-1000, and so on, with each consecutive pair of chunks overlapping by 250 tokens. Sliding window is simple, predictable, and guarantees coverage of every boundary region, making it a robust baseline for RAG chunking. The cost is duplication: each token appears in multiple chunks (roughly window/stride of them), multiplying storage and retrieval volume, which can be significant for long documents or large corpora. Variants include adaptive stride (smaller stride near important boundaries), language-specific stride (one sentence at a time), and code-specific stride (one function at a time). AI governance teams choose sliding window when worst-case context preservation matters more than storage efficiency, which is common in legal e-discovery, medical reference, and compliance documentation. The technique is also used in long-context LLM inference to handle inputs longer than the model's native context window.
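The mechanics above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation; the function name and parameters are chosen for this example, and `tokens` stands in for an already-tokenized document.

```python
def sliding_window_chunks(tokens, window=500, stride=250):
    """Split a token sequence into overlapping fixed-size chunks.

    Consecutive chunks overlap by (window - stride) tokens, so every
    boundary region is fully contained in at least one chunk.
    """
    if stride <= 0 or stride >= window:
        raise ValueError("stride must be positive and smaller than the window")
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):  # last window reached the end
            break
    return chunks

# A 1000-token document with the 500/250 settings from the text yields
# chunks starting at positions 0, 250, and 500.
doc = list(range(1000))
chunks = sliding_window_chunks(doc, window=500, stride=250)
```

Note the `window/stride` ratio directly controls the duplication factor: at stride = window/2 each token (away from the edges) is stored twice, which is the storage cost the paragraph above describes.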

Sliding window with Centralpoint: Centralpoint supports sliding-window chunking in its RAG pipeline, with administrators controlling window size and stride per skill. The model-agnostic platform routes generation to OpenAI, Anthropic, Gemini, or LLAMA, meters tokens, keeps prompts local, and deploys retrieval-augmented chatbots through one line of JavaScript.


Related Keywords:
Sliding Window