How does the system work?

How RAG works

Vector Embeddings

Vector embeddings are a cornerstone technology in modern AI and machine learning. An embedding is a numerical representation of a piece of data (a list of numbers) in which items with similar meaning end up close to each other. That property is what makes embeddings useful in everything from search engines to AI assistants, and if you're exploring or developing applications in these fields, you'll inevitably encounter them. This section gives you an intuitive understanding of what vector embeddings are and how they are used in data processing and AI applications. For an in-depth technical exploration, check out our article on Vector calculations.
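To make the idea of "close to each other" concrete, here is a minimal sketch that compares embeddings with cosine similarity. The three-dimensional vectors and their labels are invented for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and nothing here reflects the specific model used by the system.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, hand-written for this example only.
toy_embeddings = {
    "tax deduction": [0.9, 0.1, 0.2],
    "tax allowance": [0.8, 0.2, 0.3],
    "football match": [0.1, 0.9, 0.1],
}

query = toy_embeddings["tax deduction"]

# Rank every phrase by how similar its vector is to the query vector.
ranked = sorted(
    toy_embeddings,
    key=lambda phrase: cosine_similarity(query, toy_embeddings[phrase]),
    reverse=True,
)

for phrase in ranked:
    score = cosine_similarity(query, toy_embeddings[phrase])
    print(f"{phrase}: {score:.3f}")
```

The two tax-related phrases score much higher against each other than either does against the unrelated phrase, which is the behaviour a search over embeddings relies on.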

Reducing Hallucinations

In short, we take data that we know is valid and include it in the prompt sent to the AI. The AI can then draw on this extra information, which increases the likelihood of a correct response and reduces the risk of hallucinations.
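As a rough illustration of what "including valid data in the prompt" can look like, the sketch below assembles a prompt from a question and a list of trusted passages. The function name `build_prompt`, the instruction wording, and the sample passage are assumptions made for this example, not the system's actual prompt.

```python
def build_prompt(question, trusted_passages):
    """Combine trusted passages and the user's question into one prompt string."""
    # Number each passage so the model (and the reader) can refer back to it.
    sources = "\n\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(trusted_passages, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical passage and question, invented for illustration.
passages = ["Employees may deduct travel expenses from taxable income."]
prompt = build_prompt("Can I deduct travel expenses?", passages)
print(prompt)
```

Asking the model to rely only on the supplied sources, and to say so when they are insufficient, is one common way to discourage it from inventing an answer.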

1. Chunking into smaller pieces: Legal information, court databases, tax guidance, and other sources are divided into smaller chunks. Each chunk is embedded, meaning it is converted into a numerical representation (an embedding) and scaled to unit length, so that it lies on a unit circle or, with more dimensions, on a unit sphere. This lets the system compare and search the data efficiently.

2. Reformulating questions: When a user asks a question, a large language model (LLM) reformulates it into three different variations, each of which is used to search the vector database. This increases the chance that relevant information is found even when it is phrased differently from the original question. (This step is not mandatory, but it can improve the quality of the answer.)

3. Retrieving relevant data: The most relevant chunks are retrieved from the vector database and re-embedded to ensure that the most precise and useful data is used to answer the question.

4. Generating the answer: The language model (LLM) combines the user’s question with the retrieved information and generates an answer that is sent back to the user. This answer is based on both the original query and the relevant data found in the vector database, increasing the accuracy and relevance of the response.
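The retrieval side of the steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the system's implementation: `embed` is a toy word-count stand-in for a real embedding model, the list `index` stands in for a vector database, the three query variants are written by hand instead of being produced by an LLM, and the re-embedding and answer generation steps are left out. All names and sample texts are hypothetical.

```python
import math
import re


def chunk_text(text, max_words=40):
    """Step 1: split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text, vocabulary):
    """Toy stand-in for an embedding model: word counts scaled to unit length."""
    counts = [0.0] * len(vocabulary)
    for token in tokenize(text):
        if token in vocabulary:
            counts[vocabulary[token]] += 1.0
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm else counts


def retrieve(query_variants, index, vocabulary, top_k=2):
    """Steps 2-3: search with every variant, keep each chunk's best score."""
    best = {}
    for variant in query_variants:
        query_vec = embed(variant, vocabulary)
        for chunk, chunk_vec in index:
            # For unit-length vectors the dot product equals cosine similarity.
            score = sum(a * b for a, b in zip(query_vec, chunk_vec))
            best[chunk] = max(best.get(chunk, 0.0), score)
    return sorted(best, key=best.get, reverse=True)[:top_k]


# Hypothetical source documents, invented for illustration.
documents = [
    "Value added tax must be reported every quarter by registered companies.",
    "The court ruled that the lease agreement was invalid.",
    "Employees may deduct travel expenses from taxable income.",
]

chunks = [c for doc in documents for c in chunk_text(doc)]
vocabulary = {
    token: i
    for i, token in enumerate(sorted({t for c in chunks for t in tokenize(c)}))
}
index = [(chunk, embed(chunk, vocabulary)) for chunk in chunks]

# Hand-written variants; in the described system an LLM would produce these.
variants = [
    "Can I deduct travel expenses?",
    "Are travel expenses deductible from income?",
    "Which expenses reduce taxable income?",
]

top_chunks = retrieve(variants, index, vocabulary, top_k=1)
print(top_chunks)
```

Keeping the best score per chunk across all variants is one simple way to merge the results of several reformulated questions; the retrieved chunks would then be combined with the user's question and passed to the language model in step 4.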
