This article aims to provide a basic introduction to SPARQL with RDFox. It explains what SPARQL is, why RDFox uses SPARQL, how to write SPARQL queries, how to monitor SPARQL queries with RDFox, and where to find other SPARQL-related resources.
SPARQL originally stood for ‘Simple Protocol and RDF Query Language’. However, SPARQL turned out not to be so simple, so the acronym was changed, and the new recursive definition is ‘SPARQL Protocol and RDF Query Language’.
SPARQL is designed and maintained by the W3C and is the standard query language for RDF graph databases. SPARQL does for RDF graph databases what SQL does for relational databases: it enables the information within the RDF graph to be queried.
RDFox is an in-memory, high-performance knowledge graph and semantic reasoning engine. As the name suggests, RDFox is at its core an RDF graph database. It is optimised for speed and reasoning. RDFox supports flexible, incremental addition and retraction of data, and incremental reasoning. It was designed and mathematically validated at the University of Oxford.
SPARQL is the standard query language for RDF graph databases. Because RDFox uses a standard language, its users are not locked into a proprietary one, which gives them more flexibility. Additionally, the large amount of documentation and support available for the language makes it widely accessible and lowers the barriers to applying RDFox.
RDFox supports most SPARQL 1.1 query language features, with the exceptions listed here, as well as the most commonly needed parts of SPARQL 1.1 Update. This includes the INSERT and DELETE operations, which add triples to and remove triples from the data store. RDFox also implements a few proprietary built-in functions that are not in SPARQL 1.1.
RDFox supports all four output formats for SPARQL results: XML, JSON, CSV and TSV.
A SPARQL query is a request for data or information from a knowledge graph. Queries retrieve and model data stored within RDFox. SPARQL queries can also contain subqueries and filters, which let the user make a query more specific and so retrieve exactly the information they need from the RDF graph database.
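To illustrate how a subquery and a filter narrow a query, here is a minimal sketch. The prefix IRI and the property names `:forename` and `:hasParent` are assumptions about the example data and may differ in your own data store.

```sparql
# Assumed prefix for the example data; adjust to match your data store.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# Return the forename of every parent who has more than one child.
SELECT ?person ?name ?children
WHERE {
    ?person :forename ?name .
    # Subquery: count the children recorded for each parent.
    {
        SELECT ?person (COUNT(?child) AS ?children)
        WHERE { ?child :hasParent ?person }
        GROUP BY ?person
    }
    # Filter: keep only parents with more than one child.
    FILTER(?children > 1)
}
```

The subquery is evaluated first and its result is joined with the outer pattern on `?person`; the filter then discards any solution that does not meet the condition.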
You can interact with RDFox through the command line interface, referred to as the shell, or through a web interface, named the RDFox console. The shell can also be accessed through an integrated development environment (IDE), for example Visual Studio Code.
Queries can be typed or pasted directly into the shell or in the RDFox console. The RDFox console is a user interface which was launched with Version 3 and has seen many improvements since! It is optimised for query writing and provides graph visualisation. At present, the RDFox console can be used for SELECT queries only, whilst the shell can be used for all query types outlined in this article.
The following images show, first, the RDFox console with a SELECT query that asks RDFox to return all triples in the graph database and, second, a graph visualisation of the query result, which contains all triples from the Family Guy data store. The data can be found in the Getting Started Guide, webinar and documentation.
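A query that returns every triple in the store can be written as follows. This is standard SPARQL and does not depend on any names in the data; the variable names are arbitrary.

```sparql
# Match every triple: subject, predicate and object are all variables.
SELECT ?s ?p ?o
WHERE {
    ?s ?p ?o
}
```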
There are different types of SPARQL queries, for example: SELECT, INSERT, DELETE, CONSTRUCT and ASK.
SELECT, CONSTRUCT and ASK queries do not modify the graph database; they only return information to the user. INSERT and DELETE queries, by contrast, modify the graph database.
SELECT queries return all data that matches the WHERE part of the query. For example, this SELECT query will bring back each person’s forename from the data store.
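A query of this kind might look like the sketch below. The prefix IRI and the names `:Person` and `:forename` are assumptions about the Family Guy example data, so check them against your own data store.

```sparql
# Assumed prefix for the example data; adjust to match your data store.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# Return every resource typed as a person, together with its forename.
SELECT ?p ?n
WHERE {
    ?p a :Person .      # "a" is shorthand for rdf:type
    ?p :forename ?n
}
```

Only resources that match both triple patterns are returned, so anything without the type `:Person` is left out.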
This returns the following answers:
Note that Brian is not returned: he is a dog, so he is not given the type :Person.
Like a SELECT query, a CONSTRUCT query finds all data that matches the WHERE part of the query. However, rather than merely returning the matched information, it constructs new triples for each match. These triples are returned to the user but are not added to the graph database.
For example, in the following query, if the child has a parent, and that parent has a sibling, then that parent’s sibling will be determined by RDFox to be the child’s aunt or uncle, depending on their gender.
For uncle, the construct query would be:
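A minimal sketch of such a query is shown below. The prefix IRI and the property names `:hasParent`, `:hasSibling`, `:gender` and `:hasUncle` are hypothetical and chosen for illustration; substitute the names used in your own data.

```sparql
# Assumed prefix; the property names below are hypothetical.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# Build an uncle triple for every child whose parent has a male sibling.
CONSTRUCT { ?child :hasUncle ?uncle }
WHERE {
    ?child  :hasParent  ?parent .
    ?parent :hasSibling ?uncle .
    ?uncle  :gender     "male"
}
```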
And for aunt it would be the following:
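A minimal sketch of the aunt version follows. The prefix IRI and the property names `:hasParent`, `:hasSibling`, `:gender` and `:hasAunt` are hypothetical and chosen for illustration; substitute the names used in your own data.

```sparql
# Assumed prefix; the property names below are hypothetical.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# Build an aunt triple for every child whose parent has a female sibling.
CONSTRUCT { ?child :hasAunt ?aunt }
WHERE {
    ?child  :hasParent  ?parent .
    ?parent :hasSibling ?aunt .
    ?aunt   :gender     "female"
}
```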
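The ASK query asks whether or not a pattern is found within the data store. In the SPARQL specification, the result of an ASK query is a boolean: true if the pattern matches at least once and false otherwise. RDFox also prints out the total number of matched solutions.
For example, the following query asks whether Lois has the child Stewie.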
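Such a query could be sketched as follows. The prefix IRI, the resource names `:lois` and `:stewie`, and the property `:hasChild` are assumptions about the example data.

```sparql
# Assumed prefix and names; adjust to match your data store.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# Is there a triple stating that Lois has the child Stewie?
ASK {
    :lois :hasChild :stewie
}
```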
This ASK query will show that the pattern has been matched, as the triple which states that Lois has a child called Stewie is within our data store.
If we ran the query:
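A sketch of that query, asking about Brian instead, is given below. The prefix IRI, the resource names `:lois` and `:brian`, and the property `:hasChild` are assumptions about the example data.

```sparql
# Assumed prefix and names; adjust to match your data store.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# Is there a triple stating that Lois has the child Brian?
ASK {
    :lois :hasChild :brian
}
```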
There would be no match, because the pattern is not stored within the data. Lois does not have a child called Brian; Brian is their dog.
The INSERT query adds facts to the data store. For example, we can modify the data store to make the married statement symmetric. The original data store contains a triple stating that Peter is married to Lois, but not one stating that Lois is married to Peter. The following query inserts the missing triple, stating that Lois is also married to Peter.
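One way to write such an update is sketched below. The prefix IRI and the property name `:marriedTo` are assumptions about the example data.

```sparql
# Assumed prefix and property name; adjust to match your data store.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# For every "x is married to y" triple, insert the reverse triple.
INSERT { ?y :marriedTo ?x }
WHERE  { ?x :marriedTo ?y }
```

Because the update is written with variables rather than with Peter and Lois named explicitly, it makes every marriage in the data store symmetric, not just this one.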
Now both triples are stored in the data store: one stating that Peter is married to Lois, and one stating that Lois is married to Peter.
A DELETE query can be used to remove facts from the data store. For example, we can delete the triple that says Stewie has the parent Lois from the data store with the following:
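A minimal sketch of such an update is shown here. The prefix IRI, the resource names `:stewie` and `:lois`, and the property `:hasParent` are assumptions about the example data.

```sparql
# Assumed prefix and names; adjust to match your data store.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# Remove the triple stating that Stewie has the parent Lois, if it exists.
DELETE { :stewie :hasParent :lois }
WHERE  { :stewie :hasParent :lois }
```

If the triple is not in the data store, the WHERE part has no match and nothing is deleted.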
For tips and best practice on writing queries, see our article. To watch queries in action, check out our Getting Started Webinar, which runs step by step through how to implement SELECT, INSERT and DELETE queries using the Family Guy dataset.
Built-in functions specific to RDFox expand the functions available to RDFox users. These include:
For more information on what these built-in functions do, read the documentation.
RDFox allows the user to monitor the execution of queries and, through access to query plans, provides useful statistics about that execution. Suppose that we initialise a data store with the example data in our Getting Started guide. The following shell command provides access to the query plans produced by RDFox:
Now, let’s issue the following SPARQL query against the store:
This returns the following answers:
The shell now also displays the query plan that was actually executed:
The query plan is executed top-down in a depth-first manner, and we can think of solution variable bindings as being generated one at a time. It is useful to walk through the execution of the plan in more detail for a given solution binding.
For more complex queries, including those that use the OPTIONAL operator in SPARQL, read the documentation.
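As a brief illustration of OPTIONAL, the sketch below returns every forename and, where one is recorded, the spouse. The prefix IRI and the property names `:forename` and `:marriedTo` are assumptions about the example data.

```sparql
# Assumed prefix and property names; adjust to match your data store.
PREFIX : <https://oxfordsemantic.tech/RDFox/getting-started/>

# Return each forename, plus the spouse if a marriage triple exists.
SELECT ?p ?n ?spouse
WHERE {
    ?p :forename ?n .
    # Solutions are kept even when this pattern has no match;
    # ?spouse is simply left unbound in that case.
    OPTIONAL { ?p :marriedTo ?spouse }
}
```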
Although RDFox can execute SPARQL queries very quickly, the most accomplished users of RDFox also use rules written in Datalog. Rules allow facts to be materialised before query time, which makes SPARQL queries in RDFox even faster! To learn how to write rules in Datalog, read our tips and tricks article.
There are numerous resources online to help you write SPARQL queries and understand when to use them. For example, the W3C provides information on SPARQL endpoints, common SPARQL extensions, the SPARQL specification and a working group on SPARQL.
If you, or anyone within your organisation, are interested in participating in a SPARQL tutorial with Oxford University Professors and Oxford Semantic Technologies’ founders, email email@example.com.
We regularly run free open classes, so check out our events page to see any that are coming up!