Using the shell is the simplest way to get started with RDFox, but having to manually input each command every time you want to use it can become a hassle. Thankfully, the RDFox shell can process text files with pre-written commands and execute those commands just as if you typed them one by one. This can save you a lot of time and make the process of restarting your work much simpler.
In this article we will talk about what scripts are and how you can use them to improve your RDFox workflow. You can find the accompanying GitHub repository here.
RDFox is a highly optimised knowledge graph and rules engine, designed from the ground up with performance and reasoning in mind. Its unique in-memory approach gives it unmatched speed and its advanced semantic reasoning capabilities make it the perfect fit for any demanding production environment.
A script is a text file containing RDFox shell commands. For example, it might look a bit like this:
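A minimal sketch of such a script (the data store, file, and query names here are invented for illustration):

```
# create and activate a data store
dstore create family
active family

# import some data and rules
import data.ttl
import rules.dlog

# run a query over the result
answer "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"
```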
Each line is a separate shell command and comments are introduced using the hash (#) symbol.
If our script is located in <working_directory>, we can run it on startup of RDFox by passing the script name as an argument to the executable, both on Mac/Linux and on Windows.
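Assuming the executable is called RDFox (RDFox.exe on Windows) and our script is saved as start.rdfox, the invocations might look like:

```
./RDFox sandbox <working_directory> start
```

on Mac/Linux, or

```
RDFox.exe sandbox <working_directory> start
```

on Windows.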
Notice that the path for the start script is relative to the <working_directory> and that we can omit the .rdfox file extension. Your script can be any plain-text file, but extensions other than .rdfox must be written out in full.
Alternatively, we can run any script at any point after starting the RDFox shell by simply typing in the path to it (again, relative to the <working_directory>). Note that this means we can execute scripts within scripts.
With scripts you can easily maintain order in your files and simultaneously retain the ease of jumping back into your workflow. For example, if your <working_directory> looks like this:
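For instance, a layout of this shape (a made-up example; all names are placeholders):

```
<working_directory>
├── start.rdfox
├── scripts/
│   ├── settings.rdfox
│   ├── load-data.rdfox
│   └── query.rdfox
├── queries/
│   └── query.rq
├── data/
│   └── data.ttl
├── rules/
│   └── rules.dlog
└── output/
```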
you can have one main script you run each time you restart RDFox:
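Such a main script might be sketched as follows, where each line runs one of the sub-scripts (the sub-script names are assumptions):

```
# start.rdfox — run once after each restart
settings     # set shell variables (runs settings.rdfox)
load-data    # create the data store, import data and rules
query        # run our standard queries
```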
Notice we do not specify where our script files are located — the secret to that is modifying the settings:
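A sketch of such a settings script. The dir.* shell variables shown here follow the naming RDFox uses to resolve relative paths, but do check the documentation of your RDFox version for the exact set of variable names:

```
# settings.rdfox
set dir.scripts "<working_directory>/scripts/"
set dir.queries "<working_directory>/queries/"
set dir.facts   "<working_directory>/data/"
set dir.output  "<working_directory>/output/"
```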
These commands change the directories where RDFox searches for given types of files.
It is often beneficial to split your rules and scripts into multiple files, each with a very specific purpose, so that you can easily modify and reuse them when needed. In particular, setting up a data source can require quite a long script, so it might be better to separate it from other script files.
Just like in the shell, there are two ways to run queries in scripts. One is to use the answer (for read queries), update (for write queries) or evaluate (for any type of query) command and pass a query file to it. This is usually easier to write as we do not have to worry about line breaks.
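For example, assuming queries saved as query.rq and update.rq in our queries directory:

```
answer query.rq    # a read (SELECT/ASK/CONSTRUCT) query
update update.rq   # a write (INSERT/DELETE) query
```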
The other way is to simply insert your query directly into your script:
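A short sketch (the prefix and property names are invented):

```
answer SELECT ?person ?name \
WHERE {                     \
    ?person :hasName ?name  \
}
```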
Notice though that we had to add backslashes (\) at the end of our lines to tell RDFox we were not finished with our input yet. This can become a hassle for long and complex queries when we need to make adjustments.
If we want to save our results, we can write our query answers directly into a file by modifying our output settings, for example:
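A sketch along these lines (the variable names match current RDFox versions, but are worth verifying against the documentation):

```
set output "output/results.csv"       # write answers to this file
set query.answer-format "text/csv"    # serialise answers as CSV
answer query.rq
set output out                        # print answers in the shell again
```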
Here, we first set our desired output file and format, then query the data store and return to the previous settings. Note that although we do not have to create the output file beforehand, the directory it is placed in (in this case <working_directory>/output, as set in settings.rdfox) must already exist.
Suppose we want to create a more universal script for writing query results to csv files. It could look something like this:
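One possible sketch, saved for instance as write-csv.rdfox (a hypothetical name):

```
# write-csv.rdfox
# $(1): query file to evaluate, $(2): name of the output file
set output "output/$(2).csv"
set query.answer-format "text/csv"
answer $(1)
set output out
```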
The $(<n>) notation used here indicates script parameters. When running the script, we can pass two values that will replace them during execution, for example:
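For instance, if such a script were saved as write-csv.rdfox (a hypothetical name), we might call it with a query file and an output name:

```
write-csv query.rq results
```

This would run query.rq and write its answers to output/results.csv.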
This can be very useful if we reuse similar yet not quite identical code at many points in our scripts.
With RDFox, you can control the way it handles your requests. Instead of importing files or answering queries one after another, it can do these things in parallel. In order to achieve that, we pass multiple files to the same import or answer command:
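For example (file names invented):

```
# import three data files in parallel
import data-1.ttl data-2.ttl data-3.ttl

# answer two queries in parallel
answer query-1.rq query-2.rq
```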
How many threads RDFox uses is controlled by the threads <n> command. By default, this is set to the number of logical processors on your machine.
Now, suppose we would like to ensure that some operations are either executed together or not at all. The concept that helps us with that is RDFox transactions.
A transaction is a sequence of commands that starts with the begin keyword. It has an optional parameter specifying the type of transaction to be started — either “read/write” (write, the default), “read-only” (read) or “interruptible read-only” (interruptible-read). You can end a transaction using commit (if you want to keep the changes made in it) or rollback (if you want to discard them).
Inside a transaction, unlike in normal operation, rules are not materialised immediately. Although materialisation happens automatically before any query is answered, it can sometimes be beneficial to trigger it manually. That is where the mat command comes in: when used inside a transaction, it materialises any new rules and reports exactly what has changed.
This can be useful for troubleshooting.
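Sketching this as a transaction (the file names are placeholders):

```
begin               # start a read/write transaction
import new-rules.dlog
mat                 # materialise now and report what changed
answer check.rq     # inspect the effect inside the transaction
commit              # keep the changes (or rollback to discard them)
```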
Scripts can be a great help in the development phase of your project because when going through multiple iterations of rules, it is often easier to start from scratch than remove their old versions each time.
Scripts are also useful when setting up a production environment with RDFox running in daemon mode, as we can use a single command to start RDFox in shell mode and execute a script (that ends with switching to daemon mode) on startup of our machine:
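As a sketch, the machine's startup task might invoke (executable path and script name are assumptions):

```
./RDFox sandbox <working_directory> boot
```

where boot.rdfox ends by starting the REST endpoint and switching the process to daemon mode — the exact command names below are those of recent RDFox versions and worth verifying for yours:

```
# ...end of boot.rdfox
endpoint start
daemon
```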
We can, of course, run RDFox with persistence and have our data available automatically after each restart. That said, running in-memory only (with persistence turned off) can offer a performance benefit. Even if we do turn persistence on, shell variables (such as prefixes, endpoint and directory settings, or the active data store) are not persisted, so it is still recommended that we create a restart script to set these automatically.
You can request a free 30-day evaluation license here.
Read more about RDFox here or over at our blog. If you have any questions or would like to schedule a consultation session with one of our Knowledge Engineers, you can always get in touch with us or email us at firstname.lastname@example.org.