Chunking
The process of breaking documents into pieces ("chunks") before indexing them for retrieval. It is one of the most underrated, and most decisive, factors in a RAG system's quality.
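Chunks that are too small lose semantic context; chunks that are too large dilute the relevant information and waste context-window space. Advanced strategies use semantic chunking (splitting at the boundaries between units of meaning) or recursive chunking (splitting on progressively finer separators, such as paragraphs, then lines, then sentences, until each piece fits a size limit), instead of blindly cutting every N characters.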
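
As a rough illustration of the difference between blind cutting and recursive chunking, here is a minimal Python sketch. The function names, the separator order, and the 60-character limit are assumptions chosen for the example, not part of any particular library. Semantic chunking is not shown, since it typically depends on an embedding model or other tooling.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Naive baseline: cut every `size` characters, ignoring meaning."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def recursive_chunks(
    text: str,
    max_chars: int,
    separators: tuple[str, ...] = ("\n\n", "\n", ". ", " "),  # assumed order
) -> list[str]:
    """Split on the coarsest separator first; recurse with finer separators
    only for pieces that are still longer than `max_chars`."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # Nothing left to split on: fall back to a hard cut.
        return fixed_size_chunks(text, max_chars)

    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    buffer = ""
    for piece in text.split(sep):
        # Try to merge the piece into the current chunk.
        candidate = piece if not buffer else buffer + sep + piece
        if len(candidate) <= max_chars:
            buffer = candidate
            continue
        # The piece does not fit: flush the current chunk first.
        if buffer.strip():
            chunks.append(buffer)
        if len(piece) <= max_chars:
            buffer = piece
        else:
            # The piece alone is too long: split it with finer separators.
            chunks.extend(recursive_chunks(piece, max_chars, finer))
            buffer = ""
    if buffer.strip():
        chunks.append(buffer)
    return chunks


doc = (
    "Chunking splits documents. It matters for retrieval.\n\n"
    "Small chunks lose context. Large chunks dilute relevance."
)
chunks = recursive_chunks(doc, max_chars=60)
for chunk in chunks:
    print(repr(chunk))
```

With this input, both paragraphs fit within the limit, so the recursive splitter keeps each paragraph whole, whereas `fixed_size_chunks(doc, 60)` would cut mid-sentence. Production splitters usually also add overlap between neighboring chunks and measure length in tokens rather than characters.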