AWS (Amazon Web Services) has used RDFox to help in the fight against financial crime. By combining the performance of the RDFox reasoning engine with the resources of the AWS cloud, Amazon has created a solution for detecting chains of fraudulent transactions that is both scalable and fast enough to combat financial crime.
The premise of the solution is to detect chains of malicious transactions that connect the accounts of two known suspicious parties and pass through multiple intermediary accounts, or middle-men (known as mules), that would otherwise seem perfectly legitimate. We have previously described a similar solution in which RDFox alone isolates fraudulent chains in near real time; we encourage you to read it both for a technical overview and to see the performance such a system can achieve.
Graph databases such as RDFox have a distinct advantage over their relational counterparts in use cases such as this. Amazon chose a graph in this instance because SQL databases are notoriously cumbersome for deeply interconnected data such as the financial transactions of a population. For example, replicating our solution in SQL would require multiple queries, one for each link of the detected chain. Each of those queries would require a number of joins equal to at least the number of transactions in the chain. Iterated over many chains with many transactions, this is clearly not an appropriate approach. Graphs, on the other hand, represent this sort of data very well, storing relations much more naturally (and efficiently) than tabular databases. Semantic reasoning can then efficiently find the chains, unravelling the fraud hidden in the data.
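To make the join problem concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`tx`, `suspicious`), the columns and the sample rows are assumptions made for illustration and are not taken from the AWS solution. With this toy single-table schema, a chain of k transactions needs k - 1 self-joins, and a separate query has to be generated for every chain length; a richer schema with separate account tables would need more joins still.

```python
import sqlite3

def chain_query(length):
    """Build SQL that finds chains of exactly `length` transactions
    running from one suspicious account to another (hypothetical schema)."""
    joins = "".join(
        f" JOIN tx AS t{i} ON t{i}.sender = t{i - 1}.receiver"
        for i in range(2, length + 1)
    )
    columns = ", ".join(f"t{i}.id" for i in range(1, length + 1))
    return (
        f"SELECT {columns} FROM tx AS t1{joins} "
        f"WHERE t1.sender IN (SELECT account FROM suspicious) "
        f"AND t{length}.receiver IN (SELECT account FROM suspicious)"
    )

def chains_up_to(conn, max_length):
    """Run one query per chain length and collect every chain found."""
    results = []
    for length in range(1, max_length + 1):
        results.extend(conn.execute(chain_query(length)).fetchall())
    return results

# Toy data: S1 -> A -> B -> S2, where S1 and S2 are known suspicious parties.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tx (id TEXT, sender TEXT, receiver TEXT);
    CREATE TABLE suspicious (account TEXT);
    INSERT INTO tx VALUES ('t1', 'S1', 'A'), ('t2', 'A', 'B'), ('t3', 'B', 'S2');
    INSERT INTO suspicious VALUES ('S1'), ('S2');
""")

print(chain_query(3))
print(chains_up_to(conn, 4))
```

Note that the longest chain the search can find has to be fixed in advance (`max_length`), and every extra hop adds another self-join to the generated query.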
“The combination of Amazon Web Services and the RDFox engine results in an automated, scalable, and cost-effective [solution].”
- Zahi Ben Shabat, Amazon Neptune
Why did AWS choose RDFox? There are of course other graph databases available, but no other option can provide the performance or the stability that RDFox can. The real answer, however, is reasoning. The semantic reasoning intrinsic to RDFox is truly unique; reasoning in this form does not exist in any other system. The efficient and scalable inference allowed AWS to transform the massive dataset into useable information that could then be queried and analysed effectively.
Two sets of Datalog rules were used to reason over the data.
The first identifies suspicious transactions as those involving a suspicious party, each of which seeds a chain. The chain is then extended step by step to include subsequent transactions sent by the receiving party of the previous transaction. Finally, if the chain terminates at another suspicious party, it is tagged as a suspicious chain.
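The seed, extend and tag steps can be sketched procedurally in Python. This is an illustration of the logic only, not the Datalog rules used in the solution and not RDFox's API. The transaction tuples, the account names and the `max_length` cap are assumptions; the sketch also assumes a seed is a transaction sent by a suspicious party and ignores transaction timestamps.

```python
from collections import defaultdict

# Hypothetical toy data: (transaction id, sender, receiver).
TRANSACTIONS = [
    ("t1", "S1", "A"),
    ("t2", "A", "B"),
    ("t3", "B", "S2"),
    ("t4", "A", "C"),
    ("t5", "D", "E"),
]
SUSPICIOUS = {"S1", "S2"}

def find_suspicious_chains(transactions, suspicious, max_length=6):
    """Return chains (lists of transaction ids) that start and end
    at a suspicious party."""
    outgoing = defaultdict(list)
    for tx in transactions:
        outgoing[tx[1]].append(tx)

    # Seed: every transaction sent by a suspicious party starts a chain.
    frontier = [[tx] for tx in transactions if tx[1] in suspicious]
    found = []
    while frontier:
        chain = frontier.pop()
        last_receiver = chain[-1][2]
        # Tag: the chain has reached another suspicious party.
        if last_receiver in suspicious:
            found.append([tx[0] for tx in chain])
            continue
        if len(chain) >= max_length:
            continue
        # Extend: follow transactions sent by the last receiving party,
        # never reusing a transaction already in the chain.
        used = {tx[0] for tx in chain}
        for nxt in outgoing[last_receiver]:
            if nxt[0] not in used:
                frontier.append(chain + [nxt])
    return found

print(find_suspicious_chains(TRANSACTIONS, SUSPICIOUS))
```

In a rule-based system the same three steps are stated declaratively and the engine decides how to evaluate them; the explicit frontier here only stands in for that evaluation.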
The second set builds upon the first, tagging a chain as high-value when the smallest amount transferred in any of its transactions is above a specified threshold.
Together, these rule sets highlight chains that transfer a significant sum from one fraudster to another, capturing all of the relevant information in the process. The rules form a simple model but can be made far more complex as required.
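The high-value condition amounts to taking a minimum over the amounts in a chain and comparing it with a threshold. A minimal Python sketch follows; the chain identifiers, the amounts and the threshold are made up for illustration.

```python
def tag_high_value(chains, threshold):
    """Return the ids of chains whose smallest transfer exceeds `threshold`.

    `chains` maps a (hypothetical) chain id to the list of amounts moved
    by each transaction in that chain.
    """
    return [
        chain_id
        for chain_id, amounts in chains.items()
        if amounts and min(amounts) > threshold
    ]

# Illustrative data only.
CHAINS = {
    "chain-1": [12_000, 11_500, 11_000],
    "chain-2": [50_000, 900, 48_000],
}

print(tag_high_value(CHAINS, 10_000))
```

Using the minimum rather than the sum means a chain only qualifies if a significant amount could have travelled the whole way from one end to the other: `chain-2` contains large transfers, but its middle hop moves only 900.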
RDFox ran as a container inside a cluster equipped with an EKS Autoscaler, while the data was stored in Amazon S3 buckets. The intention of this configuration was to provide the following workflow for a Financial Investigator:
1. A set of rules is defined by the Investigator and transactional data is fed into RDFox as it becomes available.
2. Once the data has been preprocessed using the semantic reasoning of RDFox, the Investigator specifies the system requirements and any other configurations. The appropriate computing resources will be assigned automatically, and the detection task will begin.
3. The Investigator is then notified by email when the results are finalized. The lag between input and results will vary with the size of the dataset, but thanks to the high-performance reasoning of RDFox it is greatly reduced: likely on the order of minutes or hours for a real-world system, for an infrequent process that can take days with other systems.
4. The Investigator can add additional data or rules at any time without requiring a reboot. The incremental reasoning of RDFox will ensure the system retains consistency and the output is updated accordingly.
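The idea behind the incremental update in step 4 can be illustrated with a small Python sketch. It handles additions only and is not RDFox's algorithm or API; the `TaintTracker` class and the account names are hypothetical. When a transaction arrives, only the consequences of that one transaction are computed, rather than re-deriving everything from scratch.

```python
from collections import defaultdict

class TaintTracker:
    """Keeps the set of accounts reachable from suspicious parties up to date."""

    def __init__(self, suspicious):
        self.outgoing = defaultdict(set)
        self.reached = set(suspicious)

    def add_transaction(self, sender, receiver):
        """Record a transfer and return the accounts newly reached because of it."""
        self.outgoing[sender].add(receiver)
        # Nothing new can follow unless the sender was already reached
        # and the receiver was not.
        if sender not in self.reached or receiver in self.reached:
            return set()
        newly_reached = set()
        stack = [receiver]
        while stack:
            account = stack.pop()
            if account in self.reached:
                continue
            self.reached.add(account)
            newly_reached.add(account)
            # Propagate only along edges leaving the newly reached account.
            stack.extend(self.outgoing[account])
        return newly_reached

tracker = TaintTracker({"S1"})
print(tracker.add_transaction("A", "B"))   # A is not yet reachable from S1
print(tracker.add_transaction("S1", "A"))  # now both A and B become reachable
```

Removing data or rules is harder, because facts that lose their support must be retracted; this sketch does not attempt that.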
The true cost of financial crime doesn’t just come from the transfer of wealth; its effects ripple through the economy and impact us all. Financial institutions are frequently tasked with bringing an end to the crime and are subject to hefty fines when fraudsters slip through the cracks.
With RDFox, AWS has engineered a safety net to catch those who may otherwise make it through undetected, offering a way for banks to avoid millions in fines and provide one more layer of protection to the public.
Head over to the AWS blog to learn more about our joint solution, or stick around to dive into the details of our solo venture, How to Detect Money Laundering with Knowledge Graphs and Reasoning.
The team behind Oxford Semantic Technologies started working on RDFox in 2011 at the Computer Science Department of the University of Oxford with the conviction that flexible and high-performance reasoning was a possibility for data-intensive applications without jeopardising the correctness of the results. RDFox is the first market-ready knowledge graph designed from the ground up with reasoning in mind. Oxford Semantic Technologies is a spin-out of the University of Oxford and is backed by leading investors including Samsung Venture Investment Corporation (SVIC), Oxford Sciences Enterprises (OSE) and Oxford University Innovation (OUI).