Reasoning is probably the most powerful feature and main selling point of RDF Graph Databases.
RDF Graph Databases with advanced reasoning capabilities are believed to be the future of AI, as Mike Tung put it in his Forbes article Knowledge Graphs Will Lead To Trustworthy AI:
Semantic reasoning is the ability of a system to infer new facts from existing data based on inference rules or ontologies. In simple terms, rules add new information to the existing dataset, adding context, knowledge, and valuable insights — Oxford Semantic Technologies
This is the first in a series of articles on reasoning with the Northwind sample database. In this article we are going to create inference rules to simplify and optimise queries and data management.
We are going to use RDFox, a high-performance in-memory knowledge graph and semantic reasoning engine. RDFox uses the Datalog rule language to express rules.
This is a learn-by-example walkthrough, so little theory is covered here.
Rules define conditions to be matched in the data in order to infer new triples that become available to queries. They provide a mechanism that allows tailor-made performance improvements to specific queries.
In this section we are going to introduce three practical examples (use cases) to explain how rules work.
Each use case contains an original query, a rule, and a modified version of the query that uses the rule and produces the same result.
Original query
The original SPARQL query returns the list of customers who bought product-61.
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
# Customers who bought product-61
SELECT DISTINCT # eliminates duplicates in case the same customer bought a product more than once
?customer
?companyName
?contactName
WHERE {
GRAPH kggraph:dataGraph {
?customer a :Customer ;
:companyName ?companyName ;
:contactName ?contactName .
?order a :Order ;
:hasCustomer ?customer .
?orderDetail a :OrderDetail ;
:hasProduct :product-61 ;
:belongsToOrder ?order .
}
}
ORDER BY ?customer
Original query result
By using Property Path we can easily demonstrate the path that needs to be traversed to answer the question.
# Path: customer → order → orderDetail → product
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
SELECT DISTINCT
?customer
WHERE {
GRAPH ?graph {
?customer ^:hasCustomer/^:belongsToOrder/:hasProduct :product-61 .
}
}
ORDER BY ?customer
That’s quite a long way to go to answer such a typical question. We want to create a shortcut that not only speeds things up but also makes the query more intuitive and easier to maintain. This is achieved by rule 01 below.
Rule 01 — boughtProduct
A rule that defines which product was bought by a customer.
Rule definition
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
[?customer, :boughtProduct, ?product] :-
[?customer, a, :Customer],
[?order, a, :Order],
[?orderDetail, a, :OrderDetail],
[?product, a, :Product],
[?orderDetail, :hasProduct, ?product],
[?orderDetail, :belongsToOrder, ?order],
[?order, :hasCustomer, ?customer] .
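For readers more at home in SPARQL, the rule above can be read roughly as the CONSTRUCT query sketched below: the rule head corresponds to the CONSTRUCT template and the rule body to the WHERE clause. This is only an analogy and not a description of how RDFox evaluates rules. A CONSTRUCT query returns its triples once, whereas a rule keeps the derived triples available in the store. The sketch assumes the data lives in the dataGraph used throughout this article.

```sparql
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Rough SPARQL reading of rule 01: head in CONSTRUCT, body in WHERE
CONSTRUCT {
  ?customer :boughtProduct ?product .
}
WHERE {
  GRAPH kggraph:dataGraph {
    ?customer    a :Customer .
    ?order       a :Order ;
                 :hasCustomer ?customer .
    ?orderDetail a :OrderDetail ;
                 :belongsToOrder ?order ;
                 :hasProduct ?product .
    ?product     a :Product .
  }
}
```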
Add rule to the data store
There are many ways of adding rules to an RDFox data store. The following example uses curl to send the rule file to the REST API.
curl -X POST -G --data-urlencode "default-graph-name=http://www.mysparql.com/resource/northwind/graph/dataGraph" -H "Content-Type:" -T "rules/01-customer-bought-product.dlog" "localhost:12110/datastores/Northwind/content"
Modified query
The original query was modified to consume the new rule we have just created. The modified query produces the same result.
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
# Customers who bought product-61
SELECT
?customer
?companyName
?contactName
WHERE {
GRAPH kggraph:dataGraph {
?customer a :Customer ;
:boughtProduct :product-61 ;
:companyName ?companyName ;
:contactName ?contactName .
}
}
ORDER BY ?customer
Modified query result
Since version 5.6, it’s possible to highlight reasoning on the RDFox web console. The following shows the new derived fact, which is materialised in RDFox as a new triple in the graph.
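The derived triples can also be inspected with a plain query. The following is a minimal sketch that assumes rule 01 has already been added and that the derived triples are visible in dataGraph, as the modified query suggests.

```sparql
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Products that rule 01 derives as bought by each customer (first 10 rows)
SELECT ?customer ?product
WHERE {
  GRAPH kggraph:dataGraph {
    ?customer :boughtProduct ?product .
  }
}
ORDER BY ?customer ?product
LIMIT 10
```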
Original query
Lists the top 5 customers by product count.
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
# Top 5 customers by product count
SELECT
?customer
?companyName
?contactName
(COUNT(?product) as ?count)
WHERE {
GRAPH kggraph:dataGraph {
?orderDetail :hasProduct ?product ;
:belongsToOrder ?order .
?order :hasCustomer ?customer .
?customer :companyName ?companyName ;
:contactName ?contactName .
}
}
GROUP BY ?customer ?companyName ?contactName
ORDER BY DESC(?count)
LIMIT 5
Original query result
Create Rule 02 — hasProductCount
The following rule defines relations based on the result of an aggregate calculation.
Rule definition
PREFIX : <http://www.mysparql.com/resource/northwind/>
[?customer, :hasProductCount, ?productCount] :-
AGGREGATE (
[?customer, a, :Customer],
[?order, a, :Order],
[?orderDetail, a, :OrderDetail],
[?product, a, :Product],
[?orderDetail, :hasProduct, ?product],
[?orderDetail, :belongsToOrder, ?order],
[?order, :hasCustomer, ?customer]
ON ?customer
BIND COUNT(?product) AS ?productCount
) .
Add rule to the data store
curl -X POST -G --data-urlencode "default-graph-name=http://www.mysparql.com/resource/northwind/graph/dataGraph" -H "Content-Type:" -T "rules/02-customer-has-product-count.dlog" "localhost:12110/datastores/Northwind/content"
Modified query
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
# Top 5 customers by product count
SELECT
?customer
?companyName
?contactName
?productCount
WHERE {
GRAPH kggraph:dataGraph {
?customer :hasProductCount ?productCount ;
:companyName ?companyName ;
:contactName ?contactName .
}
}
ORDER BY DESC(?productCount)
LIMIT 5
Modified query result
The following illustration highlights the inferred facts (in cyan) as a result of rules 01 and 02.
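To check that the aggregate rule agrees with the original query, the materialised count can be compared with a count computed on the fly. The query below is a sketch of such a check and returns only the customers for which the two numbers differ. Note that rule 02 also requires the type triples (:Customer, :Order, :OrderDetail, :Product), which the original query does not, so any mismatch would most likely point at untyped resources.

```sparql
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Customers whose materialised count differs from the count computed on the fly
SELECT ?customer ?productCount ?count
WHERE {
  GRAPH kggraph:dataGraph {
    ?customer :hasProductCount ?productCount .
  }
  {
    # Same aggregation as the original query, grouped by customer only
    SELECT ?customer (COUNT(?product) AS ?count)
    WHERE {
      GRAPH kggraph:dataGraph {
        ?orderDetail :hasProduct ?product ;
                     :belongsToOrder ?order .
        ?order :hasCustomer ?customer .
      }
    }
    GROUP BY ?customer
  }
  FILTER (?productCount != ?count)
}
```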
Use Case 03 — customers who never placed an order
Original query
Lists the customers who never placed an order.
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
# Customers who never placed an order
SELECT DISTINCT
?customer
?companyName
?postalCode
?city
?country
WHERE {
GRAPH ?graph {
?customer a :Customer ;
:customerID ?customerID ;
:companyName ?companyName ;
:city ?city ;
:country ?country .
OPTIONAL { ?customer :postalCode ?postalCode } .
OPTIONAL {
?order a :Order .
?customer ^:hasCustomer ?order .
}
FILTER (!BOUND(?order))
}
}
ORDER BY ?customer
Original query result
Create Rule 03 — CustomerWithoutOrder
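Negation as failure is a powerful feature of rules in RDFox: a rule can fire when a pattern has no match in the data. Rule 03 uses it to classify the customers for whom no order exists.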
Rule definition
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
[?customer, a, :CustomerWithoutOrder] :-
[?customer, a, :Customer], # All customers
NOT EXISTS ?order IN (
[?order, a, :Order],
[?order, :hasCustomer, ?customer] # Only customers who placed orders
) .
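The NOT EXISTS atom in the rule has a close counterpart in SPARQL 1.1. As a sketch, the OPTIONAL and !BOUND idiom of the original query could be written with FILTER NOT EXISTS instead, which makes the correspondence with the rule easier to see. Only two of the projected columns are kept here for brevity, and the graph is assumed to be dataGraph.

```sparql
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Customers who never placed an order, using FILTER NOT EXISTS
SELECT ?customer ?companyName
WHERE {
  GRAPH kggraph:dataGraph {
    ?customer a :Customer ;
              :companyName ?companyName .
    # Keep the customer only if no order points at it
    FILTER NOT EXISTS {
      ?order a :Order ;
             :hasCustomer ?customer .
    }
  }
}
ORDER BY ?customer
```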
Add rule to the data store
curl -X POST -G --data-urlencode "default-graph-name=http://www.mysparql.com/resource/northwind/graph/dataGraph" -H "Content-Type:" -T "rules/03-customer-without-order.dlog" "localhost:12110/datastores/Northwind/content"
Modified query
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>
# Customers who never placed an order
SELECT DISTINCT
?customer
?companyName
?postalCode
?city
?country
WHERE {
GRAPH ?graph {
?customer a :CustomerWithoutOrder ;
:customerID ?customerID ;
:companyName ?companyName ;
:city ?city ;
:country ?country .
OPTIONAL {?customer :postalCode ?postalCode} .
}
}
ORDER BY ?customer
Modified query result
The following illustration highlights the derived facts (in cyan) as a result of rule 03.
And, finally, the following highlights (in cyan) the derived facts as a result of all previous rules created so far.
Let’s see what happens if a CustomerWithoutOrder places an order.
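One way to try this is a SPARQL update that adds an order for such a customer. The sketch below uses :customer-FISSA and a made-up order IRI, :order-99999, which is purely illustrative and not part of the sample data as far as this article shows. Only the two triples that rule 03 looks at are inserted.

```sparql
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Hypothetical order for a customer currently classified as CustomerWithoutOrder.
# :order-99999 is a made-up IRI used only for this illustration.
INSERT DATA {
  GRAPH kggraph:dataGraph {
    :order-99999 a :Order ;
                 :hasCustomer :customer-FISSA .
  }
}
```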
When we execute the modified query a second time, :customer-FISSA is not returned. That’s because the derived fact CustomerWithoutOrder was retracted when that customer placed an order.
And what if we delete rule 03 from the Northwind data store altogether?
In that case, the modified query 03 no longer returns any results.
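The :CustomerWithoutOrder triples exist only as consequences of the rule, so they disappear together with it. A quick way to confirm this is an ASK query such as the sketch below, which would be expected to return false once the rule has been removed, assuming no data asserts :CustomerWithoutOrder explicitly.

```sparql
PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Is any customer still classified as CustomerWithoutOrder?
ASK {
  GRAPH kggraph:dataGraph {
    ?customer a :CustomerWithoutOrder .
  }
}
```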
We started our journey with a simple demonstration of how inference rules can enrich an existing triplestore. We are planning to extend the reasoning capabilities of the Northwind sample database by adding axioms, an ontology and additional rules to answer more complex questions. Stay tuned!
If you choose to run the queries in this demonstration, please follow the steps below to set up the demo environment.
The following GitHub repository contains the sample data, queries and rules used in this demonstration.
Request an RDFox license here. You will need a commercial or academic email.
Download the appropriate version of RDFox onto your machine.
Copy the license file RDFox.lic to the directory where the RDFox executable is located.
In a terminal, from the same directory as above, execute ./RDFox sandbox on macOS/Linux or RDFox.exe sandbox on Windows to launch RDFox.
macOS only
If you get a warning message saying that RDFox is not from an identified developer, click Cancel.
Go to System Preferences > Security and Privacy > General tab, click Allow Anyway, as illustrated below, and then run the sandbox command again.
If you get another warning message, choose Open to start the RDFox shell.
If everything goes fine, you should get the following message in the terminal:
In the Shell, execute the following to expose the RDFox REST API, which includes a SPARQL over HTTP endpoint.
macOS only
If you get the following message, choose Allow.
You should get the following message: The REST endpoint was successfully started at port number/service name 12110 with XX threads.
At this point you should be able to navigate to the RDFox web console at http://localhost:12110/console/
On the Console UI, click on + Create data store and name it “Northwind”.
Cancel the Import Content popup as we need to create a graph before importing the data.
Execute the following query on the RDFox web console to create the dataGraph where we are going to store the data and rules.
From … Menu, choose Add content
Select dataGraph from the drop-down, then select the northwind.nt file under the northwind/data directory in your local copy of the repository, or download it from the GitHub repo.
You should get a confirmation message saying that 30780 facts were added to the data store.
Now, go to the beginning of this article for the instructions on how to create rules and run the SPARQL queries.
Once you are done with this demonstration, you can stop the RDFox Server by executing the command quit in the original terminal window.
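The graph-creation step in the list above can be carried out with a standard SPARQL 1.1 Update. The following is a sketch only, assuming the graph IRI is the one that kggraph:dataGraph resolves to in the queries of this article.

```sparql
PREFIX kggraph: <http://www.mysparql.com/resource/northwind/graph/>

# Create the named graph that will hold the Northwind data and rules
CREATE GRAPH kggraph:dataGraph
```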
References:
Exploring an RDF Graph Database