Definition
Tokenization is the process of splitting text into smaller units (tokens) that an AI model can process. Different models use different strategies: some split on whole words, others on subwords. Understanding tokenization helps estimate API costs (which are typically billed per token) and optimize prompts to stay within a model's context window limits.
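As a minimal sketch of what this looks like in practice, the example below uses the open-source tiktoken library (an assumption; any other tokenizer could be substituted) to count the tokens in a prompt and estimate cost at a hypothetical per-token price. The encoding name and the price are illustrative, not tied to any specific model.

```python
# Sketch: count tokens and estimate cost, assuming the tiktoken library
# is installed (pip install tiktoken). The price below is a hypothetical
# example; actual per-token rates vary by provider and model.
import tiktoken

def estimate_cost(prompt: str, price_per_million_tokens: float = 0.50) -> None:
    # cl100k_base is one common subword encoding; other models use other encodings
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(prompt)
    cost = len(tokens) / 1_000_000 * price_per_million_tokens
    print(f"{len(tokens)} tokens, estimated cost ${cost:.6f}")
    # Decoding each token individually shows how words split into subwords
    print([enc.decode([t]) for t in tokens])

estimate_cost("Tokenization splits text into smaller units.")
```

Running the sketch on a short sentence shows that token counts rarely match word counts: common words map to a single token, while rarer words split into several subword tokens.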