RAG Chunking Strategies
Good chunking boosts accuracy and lowers cost. Start with 300–800 tokens and 10–20% overlap, then evaluate.
Fixed window
Simple and fast; consistent chunk sizes.
Semantic splitting
Split by headings or boundaries to keep concepts intact.
Overlap
Keep 10–20% to preserve context across boundaries.
Metadata
Capture source URL, section, version/date, access level.
Evaluate and iterate
- Track answer correctness and citation coverage on a golden set.
- Monitor token usage and latency as you vary chunk sizes and k.
- Use re‑ranking for large corpora when recall drops.