RAG Chunking Strategies

Good chunking boosts accuracy and lowers cost. Start with 300–800 tokens and 10–20% overlap, then evaluate.

Fixed window
Simple and fast; consistent chunk sizes.
Semantic splitting
Split by headings or boundaries to keep concepts intact.
Overlap
Keep 10–20% to preserve context across boundaries.
Metadata
Capture source URL, section, version/date, access level.

Evaluate and iterate

  • Track answer correctness and citation coverage on a golden set.
  • Monitor token usage and latency as you vary chunk sizes and k.
  • Use re‑ranking for large corpora when recall drops.

Ship higher‑quality RAG

Join Now