RAG stands for retrieval-augmented generation. Instead of asking the model to answer from training alone, the system first retrieves relevant information from a knowledge source such as documentation, policies, product specs, tickets, CRM notes, or internal files.
Those retrieved passages are then inserted into the prompt, and the model generates an answer using both the user's question and that grounded context. This retrieval layer is a large part of why enterprise AI assistants are more reliable than a plain chatbot answering from training data alone.
The four-step flow
1. The user asks a question.
2. The system retrieves the most relevant passages from the knowledge source.
3. Those passages are inserted into the prompt alongside the question.
4. The model generates an answer grounded in that retrieved context.
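The retrieval flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the sample knowledge base, the naive keyword-overlap retriever, and the stubbed-out model call are all assumptions standing in for a real vector store and LLM API.

```python
# Minimal sketch of the four-step RAG flow.
# The knowledge base, the keyword-overlap retriever, and the stubbed
# model call are illustrative assumptions; a production system would
# use embeddings, a vector store, and a real model request.

KNOWLEDGE_BASE = [
    "Refunds are processed within 14 days of the return request.",
    "Enterprise plans include SSO and a dedicated support channel.",
    "The API rate limit is 100 requests per minute per key.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Step 2: rank documents by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Step 3: insert the retrieved passages into the prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def answer(question: str) -> str:
    """Steps 1-4 end to end; the model call itself is stubbed out."""
    passages = retrieve(question, KNOWLEDGE_BASE)  # step 2
    prompt = build_prompt(question, passages)      # step 3
    return prompt  # step 4 would send this prompt to the model

print(answer("How fast are refunds processed?"))
```

Swapping the toy retriever for an embedding search changes only `retrieve`; the rest of the flow stays the same, which is why RAG systems can upgrade retrieval quality without touching the generation step.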
Why it matters for business systems
In practical business terms, RAG is what turns a generic model into a company-aware assistant. It gives the system access to the right memory at the moment of use, without having to retrain the whole model every time a policy changes.
Many teams stop at prompt engineering and then wonder why the assistant hallucinates, misses edge cases, or forgets internal rules. Those failures are usually a system design problem, not a prompting problem: if the knowledge is external, dynamic, or permission-sensitive, it should generally be retrieved at runtime rather than baked into the prompt or the model.