Hello and welcome to another episode of the intro series of RDFox.
In this episode we’ll be covering the foundations of knowledge-based AI, otherwise known as semantic reasoning, including why reasoning is needed, the difference between OWL and Datalog, and what each can do for you.
So, why do we need reasoning?
Well, it typically boils down to the fact that our data cannot answer the questions we want, or cannot answer them in the time we need.
This is usually caused by one of two reasons: either the data is incomplete, or it is in the wrong shape to answer the questions we’re interested in. The answers are there, but they require context and human understanding to be extracted.
We can see some simple examples of this in the picture to our right, where a family of Formula One drivers is represented.
The two people at the top of this diagram are Emerson and Wilson Fittipaldi.
We can see from our data that Wilson has sibling Emerson, but what if we asked the question: who is Emerson's sibling?
Well, the system would say that Emerson has no siblings, because our data is incomplete: this relationship only points in one direction.
Let’s also ask who Wilson's brother is. Again, the system would say Wilson has no brothers, because the system does not have this context.
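To make the problem concrete, here is a minimal sketch in plain Python. It is not RDFox's API; the triple layout, the `hasSibling` and `hasBrother` names, and the `objects` helper are all assumptions made for illustration. It stores the single fact from the diagram and asks the three questions above.

```python
# Facts stored as (subject, predicate, object) triples.
# The predicate names are illustrative, not taken from a real dataset.
facts = {("Wilson", "hasSibling", "Emerson")}


def objects(facts, subject, predicate):
    """Return every object o such that (subject, predicate, o) is a stored fact."""
    return {o for (s, p, o) in facts if s == subject and p == predicate}


wilson_siblings = objects(facts, "Wilson", "hasSibling")    # the stored direction
emerson_siblings = objects(facts, "Emerson", "hasSibling")  # the missing direction
wilson_brothers = objects(facts, "Wilson", "hasBrother")    # never stated at all
```

Only the first question finds an answer. The other two come back empty, even though a person reading the diagram would answer both without hesitation.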
So, to answer these questions, we have to encode our knowledge and understanding and give it to the system, which will then scale and automate our human expertise.
We do this with reasoning.
In RDFox, there are two primary methods to perform reasoning.
The first is the standards-based OWL 2 RL, which exclusively provides ontological reasoning, and the second is the more expressive and powerful Datalog, which more closely resembles SPARQL and supports powerful functions like filters, aggregates, negation, and binds that our users find extremely helpful in solving their real-world problems.
First let's have a look at OWL, the web ontology language.
OWL is a standard created by the W3C, in large part thanks to the contributions of our founder Professor Ian Horrocks, who won the Lovelace Medal for his contributions in standardising this language.
So whilst we and many of our clients believe Datalog is better suited to solving the more complex, real-world problems, with its advanced functionality and syntax, we certainly have an appreciation for OWL, and in practice, we often use both together.
RDFox supports a subset of OWL called OWL 2 RL, the rule language profile.
This is used to express axioms which, at their core, are simply statements of truth: assertions about classes, properties and individuals.
Just as facts are true statements, so are axioms, except axioms are more general.
Where facts mention a specific entity and a single relation within the graph such as a driver and their given forename, axioms describe a more general piece of information, perhaps about all entities in a class, like saying all drivers are people.
RDFox takes our ontology and uses it to add that information to the graph.
Thinking about the Fittipaldis, this could be ensuring that the sibling relationship appears in both directions, actually adding the new data to the graph.
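As a rough illustration of what adding that information to the graph means, the sketch below applies two axioms in plain Python: the sibling relationship is symmetric, and every driver is a person. This is not how RDFox is implemented and not OWL syntax; the `materialise` function and the predicate and class names are assumptions for the example.

```python
def materialise(facts):
    """Apply two illustrative axioms repeatedly until no new facts appear."""
    facts = set(facts)
    while True:
        derived = set()
        for (s, p, o) in facts:
            # Axiom 1: hasSibling is symmetric.
            if p == "hasSibling":
                derived.add((o, "hasSibling", s))
            # Axiom 2: every Driver is a Person.
            if p == "type" and o == "Driver":
                derived.add((s, "type", "Person"))
        if derived <= facts:  # nothing new was derived, so we are done
            return facts
        facts |= derived


explicit = {
    ("Wilson", "hasSibling", "Emerson"),
    ("Wilson", "type", "Driver"),
}
graph = materialise(explicit)
emerson_siblings = {o for (s, p, o) in graph if s == "Emerson" and p == "hasSibling"}
```

After materialisation the graph holds the original facts plus the derived ones, so asking for Emerson's siblings now returns Wilson.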
However, OWL doesn’t support everything our clients need to capture the full extent of their expertise, some of the most useful missing features being functions like filtering, aggregation, negation, and binds.
For that, we have Datalog.
Its syntax is much simpler, and the language much more expressive, than OWL 2 RL. Its similarities to SPARQL reduce the barrier to entry and allow more complex patterns to be captured simply, binding variables as is done in queries.
With each of these pieces we can truly encapsulate domain expertise.
We’ve already mentioned that we could add symmetry to our sibling relationship with OWL, but we could also do this simply with Datalog. But let’s look at some more interesting rules.
We could tag a person as being an only child if we found they had no siblings.
We could count the relatives each individual has.
We could even go as far as to compute the transitive closure of a family tree, finding everyone who is related to each individual by some arbitrarily long chain of relationships.
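The three rules just described (negation, aggregation, and transitive closure) can be sketched in plain Python to show what each would compute. This is an illustration of the logic only, not RDFox Datalog syntax. `PersonC` and `PersonD` are hypothetical placeholders, and the `related_to` pairs are invented sample data.

```python
from collections import Counter

# Hypothetical sample data: PersonC and PersonD are placeholders.
people = {"Emerson", "Wilson", "PersonC", "PersonD"}
sibling = {("Wilson", "Emerson"), ("Emerson", "Wilson")}
related_to = {("Emerson", "Wilson"), ("Wilson", "PersonC")}  # direct links only


def only_children(people, sibling):
    """Negation: a person is an only child if no sibling fact starts from them."""
    with_siblings = {a for (a, _) in sibling}
    return people - with_siblings


def transitive_closure(pairs):
    """Recursion: keep joining chains of links until nothing new is found."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


def relative_counts(pairs):
    """Aggregation: count how many relatives each person has."""
    return Counter(a for (a, _) in pairs)


tagged = only_children(people, sibling)
all_related = transitive_closure(related_to)
counts = relative_counts(all_related)
```

With this sample data, the closure adds the indirect link from Emerson to PersonC, so Emerson ends up with two relatives and Wilson with one, while the two placeholders are tagged as only children.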
Whether with OWL or with Datalog, the results of encoding your domain expertise are actually added to the graph, doing the hard work ahead of time so that at query time, the results can be returned in an instant, accelerating queries by orders of magnitude.
Reasoning is also incremental, meaning that as new data is added, or as old data is removed, the consequences of our rules and axioms are updated automatically and live. This means our system is always self-consistent. We don’t have to reload the datastore or manually re-trigger rules; RDFox takes care of all of it for us.
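To give a feel for what keeping consequences up to date involves, here is a toy Python store that maintains one symmetric relationship as facts are added and removed. It is a sketch under strong assumptions: a single non-recursive rule, a hypothetical `SymmetricStore` class, and none of RDFox's actual algorithms or API.

```python
class SymmetricStore:
    """Toy store that materialises a symmetric rule and keeps it up to date."""

    def __init__(self):
        self.explicit = set()      # facts asserted by the user
        self.materialised = set()  # asserted facts plus derived ones

    @staticmethod
    def _reverse(fact):
        s, p, o = fact
        return (o, p, s)

    def _refresh(self, fact):
        # A fact holds if it was asserted directly or if its reverse was asserted.
        if fact in self.explicit or self._reverse(fact) in self.explicit:
            self.materialised.add(fact)
        else:
            self.materialised.discard(fact)

    def add(self, fact):
        self.explicit.add(fact)
        # Only the fact and its reverse can be affected, so only those are rechecked.
        self._refresh(fact)
        self._refresh(self._reverse(fact))

    def remove(self, fact):
        self.explicit.discard(fact)
        self._refresh(fact)
        self._refresh(self._reverse(fact))

    def objects(self, subject, predicate):
        """Query the materialised facts; no rule is evaluated at query time."""
        return {o for (s, p, o) in self.materialised if s == subject and p == predicate}


store = SymmetricStore()
store.add(("Wilson", "hasSibling", "Emerson"))
after_add = store.objects("Emerson", "hasSibling")

store.remove(("Wilson", "hasSibling", "Emerson"))
after_remove = store.objects("Emerson", "hasSibling")
```

Adding the fact makes the reverse direction available immediately, and removing it retracts the derived fact as well, without rebuilding the whole store.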
If you’d like to learn more about how to write OWL and Datalog in RDFox, check out the other videos in this series on reasoning.
The team behind Oxford Semantic Technologies started working on RDFox in 2011 at the Computer Science Department of the University of Oxford with the conviction that flexible and high-performance reasoning was a possibility for data-intensive applications without jeopardising the correctness of the results. RDFox is the first market-ready knowledge graph designed from the ground up with reasoning in mind. Oxford Semantic Technologies is a spin-out of the University of Oxford and is backed by leading investors including Samsung Venture Investment Corporation (SVIC), Oxford Sciences Enterprises (OSE) and Oxford University Innovation (OUI).