RDFox Blog

What is Knowledge-based RAG and why do enterprises need it?

Thomas Vout

In this demo we'll be showing you how to enrich RAG with Knowledge-based AI, providing faster answers to more complex questions, all while lowering the cost.

We'll be doing this in the context of a recipes database, asking questions about our specific recipes and their ingredients.

But first, what is RAG and why do enterprises need it?

RAG, or Retrieval-Augmented Generation, provides an LLM interface to a database. This allows users to take advantage of the benefits of LLMs while anchoring answers in company-owned data, bringing accuracy, reliability, and trust to a powerful, flexible interface. To achieve this, the user talks directly to the LLM, which then retrieves data from the data store. There are many ways to do this.

One is vector RAG, where information is stored as chunks in a vector database. The chunks most relevant to the user's question are returned to the LLM so that it can answer. This approach can handle simple questions, but it can be slow, large context windows can be expensive, and it fails on more complex tasks that require an understanding of the domain or information about the entire data store.
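To make the retrieval step concrete, here is a minimal sketch of vector-RAG retrieval using toy, hand-written 3-dimensional embeddings. The chunk texts and vectors are hypothetical; a real system would use an embedding model and a vector database rather than a Python list.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical recipe chunks with pre-computed toy embeddings.
chunks = [
    ("Grilled salmon with lemon butter ...", [0.9, 0.1, 0.2]),
    ("Walnut brownie with dark chocolate ...", [0.1, 0.8, 0.3]),
    ("Lemon sole with capers ...", [0.8, 0.2, 0.1]),
]

def retrieve(query_embedding, k=2):
    """Return the k chunk texts most similar to the query embedding."""
    ranked = sorted(chunks, key=lambda c: cosine(query_embedding, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# A query like "fish and lemon" might embed close to the first axis,
# so the two fish chunks come back and the brownie does not.
top = retrieve([1.0, 0.0, 0.1])
```

Note that only the top k chunks ever reach the LLM, which is exactly why a question like "how many recipes ...?" is out of reach: the model never sees the whole database.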

Graph RAG addresses this problem: data is stored in a knowledge graph, which returns highly specific information to the LLM to answer the user's question. This too has its limits in performance and complexity, at which point Knowledge-based RAG is required.

Knowledge-based RAG uses Knowledge-based AI to enrich the knowledge graph, leveraging rules written by domain experts to expose powerful insights to the LLM. With this enrichment, the RAG setup can perform advanced analytics, offer explainable results, and deliver fast query times even for the most complex questions.

So, let's take a look at this in action in our recipes demo that shows vector RAG and Knowledge-based RAG side by side.

Here you can see that, alongside our traditional faceted search, we also have the option to ask our chatbot for some culinary information. For our purposes here, we have generated two answers using two different methods: on the left we have vector RAG, and on the right we have Knowledge-based RAG.

Our first question is simple: ‘recommend me a recipe with fish and lemon’. We can see that both approaches served us an appropriate answer.

We can quickly push the limits of vector RAG by asking something more interesting, like: ‘how many nut-free recipes are there in this database?’ We can see that vector RAG gives up, as it lacks information about the entire database, whereas Knowledge-based RAG gets the right answer without any trouble.

This particular question highlights another critical power of Knowledge-based RAG: explainable answers. With this approach, we can verify that each of these recipes really is nut-free, meaning we don’t have to gamble on trusting a black box when the consequences could be disastrous.

We can push the system even further still by asking our final question: ‘of the nut free recipes which is the fastest to cook?’ Notice that this time vector RAG does in fact return an answer, but it is incorrect, with no way for us to know, whereas Knowledge-based RAG again returns the correct answer.

Let's have a quick look at some of the reasoning that has helped us answer these questions. With an ontology supporting our data, we have encoded domain knowledge into our system, enabling it to understand details of our specific context and answer the user's questions more accurately.

If we then look at a recipe that has been tagged as nut-free, we can explain this fact, exposing the rule and the explicit information that produced it. In this case the rule is as simple as checking that a recipe contains no nut products, but crucially this is determined with logic, not just a statistical best guess.
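The nut-free check and the "fastest to cook" question above can be sketched together: the classification is negation over ingredients (no ingredient may be a nut product), and the final answer is an aggregation over the classified set. All recipe data below is hypothetical, and the explanation is just a list of the checks performed rather than a real provenance trace.

```python
# Hypothetical ingredient ontology and recipe data.
NUT_PRODUCTS = {"peanut", "almond", "walnut"}

recipes = {
    "lemon_sole": {"ingredients": ["sole", "lemon", "butter"], "cook_minutes": 25},
    "pad_thai": {"ingredients": ["rice_noodles", "peanut", "lime"], "cook_minutes": 20},
    "tomato_soup": {"ingredients": ["tomato", "basil", "cream"], "cook_minutes": 35},
}

def is_nut_free(name):
    """Logical check with an explanation: every ingredient must fail the nut test.

    Returns (verdict, checked), where `checked` records each ingredient and
    whether it matched a nut product, so the answer can be justified.
    """
    checked = [(i, i in NUT_PRODUCTS) for i in recipes[name]["ingredients"]]
    return all(not is_nut for _, is_nut in checked), checked

# Negation: keep only recipes where no ingredient is a nut product.
nut_free = [r for r in recipes if is_nut_free(r)[0]]

# Aggregation: of the nut-free recipes, which is fastest to cook?
fastest = min(nut_free, key=lambda r: recipes[r]["cook_minutes"])
```

Because the verdict is computed by rule rather than by similarity, every answer carries its justification with it, which is the property the demo's explanation view exposes.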

Beyond accuracy, reliability, and performance, cost is also a factor. Here is a side-by-side comparison of some values taken from the queries generated by our demo. As you can see, Knowledge-based RAG used approximately 10 times fewer tokens to generate answers than vector RAG, not to mention that vector RAG often spent this money returning the wrong answer.

So why do enterprises need Knowledge-based RAG? It comes down to speed, cost and above all the ability to accurately answer complex questions.

With simple questions, vector RAG, graph RAG, and Knowledge-based RAG will all provide a suitable answer, although vector RAG can be far more expensive.

For questions that require total oversight of the data, vector RAG fails, leaving only graph RAG and Knowledge-based RAG, although even here graph RAG alone can be slow.

Finally, when we ask more complex, more valuable questions that require an expert’s understanding, involving concepts like calculation, negation, and aggregation, Knowledge-based RAG is the only choice.

So, if you'd like to accelerate your RAG journey, get in touch with us today to kick-start your POC or to see a more advanced demo.

Take your first steps towards a solution.

Start with a free RDFox demo!


Team and Resources

The team behind Oxford Semantic Technologies started working on RDFox in 2011 at the Computer Science Department of the University of Oxford with the conviction that flexible and high-performance reasoning was a possibility for data-intensive applications without jeopardising the correctness of the results. RDFox is the first market-ready knowledge graph designed from the ground up with reasoning in mind. Oxford Semantic Technologies is a spin-out of the University of Oxford and is backed by leading investors including Samsung Venture Investment Corporation (SVIC), Oxford Sciences Enterprises (OSE) and Oxford University Innovation (OUI).