Hello and welcome to another episode of the RDFox introductory series.
In this episode we’ll be covering the foundations of RDFox, starting from the very basics by asking ‘what is a database?’, but we're going to quickly get into the detail of triples, knowledge graphs, and knowledge-based AI, finally pulling it all together with RDFox.
So what is a database? Well, in practical terms it is just a collection of information, and what's critical is that we can retrieve this information on demand.
Now RDFox is a type of database called a knowledge graph.
So what is a knowledge graph?
Well, it’s a way to store data and knowledge and, importantly, it stores that information in what we call a graph structure, something similar to what we see here on the right, where information is represented as nodes and the relationships that connect them.
Why do we do this? Well, there are some vital benefits that we can gain over traditional methods such as relational databases, where data is stored in tables, or document stores, where data is stored as documents.
Unlike relational databases, there is no fixed schema, making graphs much more flexible.
This means they're much better equipped to represent real-world highly interconnected data, particularly when this data is malleable, evolving over time as data is added, removed or changed in some way.
Perhaps the most important benefit is that of semantic reasoning, or knowledge-based AI, which enables us to encode, automate, and scale expertise.
If we look one layer deeper we can ask ‘what is a triple?’
A triple, also called a fact, is a single unit of information within a knowledge graph, hence the name fact.
On the right we have two examples of a triple, or fact, and the facts they represent are that OST is a company, and that OST has the full name Oxford Semantic Technologies.
As you might have guessed, we call it a triple because it consists of three parts: two nodes and a relationship that connects one to the other.
Within the triple, the position of these nodes is important. The first node we call the subject, the second node we call the object, and the relationship connecting them is called the predicate.
Notice the predicate is an arrow and always points from the subject to the object.
Any given entity, like OST in our example here, can be the subject of several triples and the object of several more. In fact this is very common and how the structure of our knowledge graph is able to represent complex, real-world relationships.
In the top example we’ve shown an entity in both the subject and object positions of the triple but, while the subject must always be an entity, the object can also be a value, such as a string, number, or date-time, as we’ve shown below.
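To make the subject, predicate and object roles concrete, here is a minimal sketch in plain Python. It is not how RDFox stores data; the names Entity, Triple and triples_about are made up for illustration, and the two facts are the ones from the example above.

```python
from typing import NamedTuple, Union


class Entity(NamedTuple):
    """A node that stands for a thing, such as OST (hypothetical helper type)."""
    name: str


# Plain Python values stand in for literal values such as strings and numbers.
Value = Union[str, int, float]


class Triple(NamedTuple):
    subject: Entity             # always an entity
    predicate: str              # the relationship, pointing subject -> object
    obj: Union[Entity, Value]   # an entity or a literal value


OST = Entity("OST")
COMPANY = Entity("Company")

graph = {
    Triple(OST, "is a", COMPANY),                                  # object is an entity
    Triple(OST, "has full name", "Oxford Semantic Technologies"),  # object is a value
}


def triples_about(graph, entity):
    """Split the triples mentioning an entity by the position it appears in."""
    as_subject = {t for t in graph if t.subject == entity}
    as_object = {t for t in graph if t.obj == entity}
    return as_subject, as_object
```

Calling triples_about(graph, OST) returns both facts in the first set, because OST is the subject of each, and an empty second set, because nothing in this tiny graph points at OST.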
All of this leads us to the question of ‘how do we query a knowledge graph?’
What you'll see here is our data. This is the same thing we've seen before, where OST is a company. We might have several companies like this in our database.
If we want to find all of the companies we know about, we have to describe the pattern of these triples.
So what we've done above is describe the pattern ‘something is a company’ using the variable ?x (question mark x): ?x, is a, company.
This query will now search for this pattern within our data.
You can see here the company maps to our company in the data, as does the ‘is a’ relationship, and all that’s left is to plug in the variable in the subject position, which in this case is OST.
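The matching step just described can be sketched in a few lines of Python. This is only an illustration of the idea of a pattern with a variable, not RDFox's query engine or its query language; the match function is hypothetical, and the second company, "Acme", is invented so that there is more than one answer.

```python
def match(pattern, triples):
    """Return one variable binding per triple that fits the pattern.

    Pattern terms that start with '?' are variables; everything else
    must equal the corresponding part of the triple exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # No mismatch found, so this triple is an answer.
            results.append(binding)
    return results


data = [
    ("OST", "is a", "Company"),
    ("OST", "has full name", "Oxford Semantic Technologies"),
    ("Acme", "is a", "Company"),  # invented example company
]

companies = match(("?x", "is a", "Company"), data)
# companies == [{"?x": "OST"}, {"?x": "Acme"}]
```

The full-name triple is skipped because its predicate does not match ‘is a’, which is exactly the behaviour we want from the pattern.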
The final piece of the RDFox puzzle is, of course, semantic reasoning, otherwise known as knowledge-based AI.
Knowledge based AI can be used to enrich a graph by adding new information.
The idea here is that we can write rules that capture and encode expertise - real domain knowledge - and what RDFox will do is take these rules to scale, automate, and accelerate that knowledge.
Now there are a few reasons that we want to do this, but the simplest is that it drastically improves query performance, with the potential for shortening query times by orders of magnitude.
Beyond performance it also simplifies the logic at query time, leading to faster development, and reasoning even offers functionality that goes beyond what is possible with querying alone, creating huge potential for even the most complex solutions.
So, let’s have a look at a simple example of a rule. On the right our rule counts the number of founders per company. Here we’ll just look at the effects on OST, but with the way we’ve written it, this would apply to all companies in our database.
With this rule RDFox will find our founders, count them (1, 2, 3, 4, 5), and actually add this new fact to the datastore. It’s crucial to understand that the fact ‘OST has number founders 5’ is actually added to the store at the point that we add the rule or add data. This matters because whenever we come to query for this fact, the computer doesn't have to do any of the work to re-calculate this value as it’s already been done, so the result can be returned near instantly.
So if you have a very large amount of data or particularly complex questions to ask of your data, you can move that work ahead of time so when you come looking for the answer you get it straight away.
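As a rough sketch of what ‘moving the work ahead of time’ means, the Python below mimics the founder-counting rule. RDFox rules are written in its own rule language, not Python, so treat this purely as an illustration: the ‘has founder’ predicate, the placeholder founder names and both function names are assumptions made for the example.

```python
from collections import Counter


def count_founders_rule(facts):
    """Derive one 'has number founders' fact for every company with founders."""
    counts = Counter(s for (s, p, o) in facts if p == "has founder")
    return {(company, "has number founders", n) for company, n in counts.items()}


def lookup(store, subject, predicate):
    """A trivial query: read stored objects, with no counting at query time."""
    return [o for (s, p, o) in store if s == subject and p == predicate]


# Explicit data: OST and five placeholder founders.
store = {("OST", "is a", "Company")}
store |= {("OST", "has founder", f"founder{i}") for i in range(1, 6)}

# Apply the rule once, when the data is loaded, and keep the inferred
# fact in the store alongside the explicit ones.
store |= count_founders_rule(store)

print(lookup(store, "OST", "has number founders"))  # prints [5]
```

The point of the design is that lookup only reads a stored fact. The counting happened once, up front, rather than every time someone asks the question.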
Another fantastic benefit to rules is incremental reasoning.
This means the answers created by rules always remain consistent with the data in the knowledge graph. In practical terms this means the results will be updated automatically as new data is added or as old data is removed.
For example, if I had missed out a founder in our data over here, I could add them at a later date, and this inferred fact would automatically update to 6, without me having to do anything. I wouldn't have to manually check for updates or re-trigger the rule in any way; it is fully managed behind the scenes.
So finally, what is RDFox? Well, RDFox is all of these things: it is a knowledge graph and semantic reasoning engine, focused on incremental reasoning, speed, and scalability.
We have deployed it on everything from devices and mobiles all the way up to massive cloud instances, covering use cases from semantic search in publishing and e-commerce, through configuration management in retail and manufacturing, to autonomous decision making on board vehicles and in medical devices.
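The behaviour described here can be imitated with a small Python class that adjusts the inferred count whenever a founder fact is added or removed. This only shows what the user observes; it says nothing about how RDFox implements incremental reasoning internally, and the class, its methods and the founder names are all hypothetical.

```python
class FounderStore:
    """Toy store that keeps a per-company founder count up to date."""

    def __init__(self):
        self.explicit = set()  # facts the user has added
        self.counts = {}       # inferred: company -> number of founders

    def add(self, triple):
        if triple in self.explicit:
            return  # adding the same fact twice changes nothing
        self.explicit.add(triple)
        subject, predicate, _ = triple
        if predicate == "has founder":
            # Adjust only the affected count instead of recounting everything.
            self.counts[subject] = self.counts.get(subject, 0) + 1

    def remove(self, triple):
        if triple not in self.explicit:
            return
        self.explicit.remove(triple)
        subject, predicate, _ = triple
        if predicate == "has founder":
            self.counts[subject] -= 1
            if self.counts[subject] == 0:
                del self.counts[subject]

    def number_of_founders(self, company):
        return self.counts.get(company)


founders = FounderStore()
for i in range(1, 6):
    founders.add(("OST", "has founder", f"founder{i}"))
count_before = founders.number_of_founders("OST")  # 5

# The founder we missed is added later; nothing is re-triggered by hand.
founders.add(("OST", "has founder", "founder6"))
count_after = founders.number_of_founders("OST")   # 6
```

Removing a founder fact works the same way in reverse, so the inferred count stays consistent with whatever explicit data is currently in the store.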
Continue watching the series to learn more about Knowledge Graphs, querying, and knowledge-based AI, and get in touch today if you’d like to discuss your use case with an RDFox expert.
The team behind Oxford Semantic Technologies started working on RDFox in 2011 at the Computer Science Department of the University of Oxford with the conviction that flexible and high-performance reasoning was a possibility for data-intensive applications without jeopardising the correctness of the results. RDFox is the first market-ready knowledge graph designed from the ground up with reasoning in mind. Oxford Semantic Technologies is a spin-out of the University of Oxford and is backed by leading investors including Samsung Venture Investment Corporation (SVIC), Oxford Sciences Enterprises (OSE) and Oxford University Innovation (OUI).