How can we utilise graphs and machine learning technologies to combat money laundering?

This article concludes a three-part series on money laundering, following part 1 and part 2.

Al Capone, also known as “Scarface”, was the original large-scale money launderer. He hid the equivalent of 1.4 billion of today’s dollars through a complex, orchestrated chain of tricks, chief among them cash-based laundromats. When he was convicted in October 1931, the charge was merely tax evasion. His conviction led to the creation of anti-money laundering regulation.

Money laundering, colloquially known as “washing money”, is just what it sounds like. You have dirty, illegally acquired money that needs to enter the clean economy with every link to its criminal past erased. The IMF estimates that 2.5-5% of the world’s GDP is dirty money, roughly equivalent to the gross domestic product of Australia. Money launderers have been linked to terrorism financing, form the bedrock on which corruption thrives and are a favourite whisper among criminals.

Corruption, for example, generates large proceeds that need to be laundered. The receiving party cannot deposit the earnings directly into a legitimate bank account, and so obscures the money flow through a three-step process. The first step is placement, getting the cash into the global financial system. Then comes layering, which adds obscurity. The last step is integration of the earnings into a legitimate business.

For example, a former Guinean minister was convicted in May 2017 of laundering approximately $8.5 million in bribes received for awarding a mining contract. The minister moved the money to Hong Kong via an intermediary company in a transaction backed by fake consulting invoices. He then invested some of it in real estate and luxury goods in the US, concealing that he was a politically exposed person, and went on to spend more of the money as if it had been legally earned.

Why is Combating Money Laundering Hard?

Financial institutions are tasked with identifying the ultimate beneficial owner behind transactions. A beneficial owner is defined as a party that holds 25% or more of the ownership interest, voting rights or earnings of a legal entity. Institutions have to run lengthy KYC (Know Your Customer) processes in which compliance departments investigate ownership chains, whether direct or indirect, and check them against entities flagged for money laundering. Once a suspicious individual is found, investigators look for similar patterns in that individual’s network.

Complying with global regulations is a painful process for financial institutions. Licences can be revoked and major fines incurred for failing to meet global risk and compliance requirements. Banks in Australia face fines of up to $21 million for each offence, raising the risk of massive settlements with governments.

Tracing beneficial owners is tedious because those hiding money go to extreme and clever lengths, creating long chains of ownership by layering transactions and splitting large transactions across time and amounts. Shell companies are closed doors that hide the real beneficiaries behind multiple layers of corporate structure. The Panama Papers gave a glimpse into this complex world: it took hundreds of journalists months to piece together single chains, and many more to identify ultimate beneficiaries.

The rapid development of financial technology, such as mobile money transfers and the distributed ledgers backing cryptocurrencies, adds to the many possible layers money can flow through. It is paramount that anti-money laundering efforts do not hinder the development of these technologies.

The scale at which most financial institutions operate makes traditional paper-based compliance hard and presents an opportunity for advanced technology to make repetitive, hard tasks easy.

Enter the Graph

Graphs are a special way to represent data as a set of nodes and edges. Nodes are things in the real world, and edges capture the connections between nodes and the nature of those connections. Graphs allow us to represent connected real-world data in a way that captures the meaningful relationships between entities.

Graphs and anti-money laundering are a match made in heaven. Graphs allow us to find indirect relationships between financial entities, which is very useful in finding the ultimate beneficial owner. A financial institution can load master data records of current and past politically exposed persons (PEPs) and check its transaction history for links between its customers and the PEPs. Clever institutions will include other data sources such as IP addresses, co-owned companies, phone numbers, transactions and addresses to find weakly connected entities and highlight similar profiles.
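To make the indirect-ownership idea concrete, here is a minimal Python sketch of how an effective stake could be computed by walking an ownership graph. The names, entities and percentages are hypothetical, and the approach (multiplying stakes along each chain and summing over all chains) is one common simplification rather than a regulatory definition.

```python
# Hypothetical ownership records: (owner, owned_entity, fraction_held).
OWNERSHIP = [
    ("Alice", "HoldCo A", 0.60),
    ("Bob", "HoldCo A", 0.40),
    ("HoldCo A", "Shell B", 0.50),
    ("Alice", "Shell B", 0.10),
    ("Carol", "Shell B", 0.40),
    ("Shell B", "Target Ltd", 0.80),
    ("Dave", "Target Ltd", 0.20),
]

BENEFICIAL_OWNER_THRESHOLD = 0.25  # the 25% rule described in the article


def effective_ownership(records, target):
    """Return {owner: effective stake in target}, summed over every ownership chain."""
    owners_of = {}
    for owner, entity, fraction in records:
        owners_of.setdefault(entity, []).append((owner, fraction))

    stakes = {}

    def walk(entity, stake, path):
        for owner, fraction in owners_of.get(entity, []):
            if owner in path:  # ignore circular holdings
                continue
            held = stake * fraction
            if owner in owners_of:
                # The owner is itself an owned entity: keep walking up the chain.
                walk(owner, held, path | {owner})
            else:
                stakes[owner] = stakes.get(owner, 0.0) + held

    walk(target, 1.0, {target})
    return stakes


stakes = effective_ownership(OWNERSHIP, "Target Ltd")
beneficial_owners = sorted(
    name for name, stake in stakes.items() if stake >= BENEFICIAL_OWNER_THRESHOLD
)
print(stakes)
print(beneficial_owners)
```

In this toy data Alice holds no direct stake in Target Ltd, yet her two indirect chains add up to roughly 32%, which puts her above the 25% threshold alongside Carol.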

Once a suspicious pattern has been found, graphs make it easy to look up similar patterns across an institution’s transaction space and to resolve other entities using graph database capabilities. By applying algorithms such as PageRank and community detection, one can detect entities operating together, commonly referred to as money laundering rings.
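As a rough illustration of how linked entities can be grouped, the sketch below uses connected components (via union-find), which is a simpler stand-in for the community detection algorithms a graph database provides. The account names and shared identifiers are hypothetical.

```python
# Hypothetical links between accounts and identifiers they have used.
LINKS = [
    ("acc-1", "ip:192.0.2.5"),
    ("acc-2", "ip:192.0.2.5"),
    ("acc-2", "phone:555-0100"),
    ("acc-3", "phone:555-0100"),
    ("acc-4", "ip:192.0.2.9"),
    ("acc-5", "addr:12 High St"),
    ("acc-6", "addr:12 High St"),
]


def find_rings(links):
    """Group accounts connected, directly or indirectly, through shared identifiers."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps trees shallow
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    # Accounts and identifiers are both nodes; each link merges their components.
    for account, identifier in links:
        union(("account", account), ("id", identifier))

    groups = {}
    for node in list(parent):
        kind, name = node
        if kind == "account":
            groups.setdefault(find(node), set()).add(name)

    # Only groups with more than one account are interesting as candidate rings.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


rings = find_rings(LINKS)
print(rings)
```

Here acc-1 and acc-3 never share an identifier directly, but both are tied to acc-2, so all three land in the same candidate ring.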

Structuring aka Payment Chains

A common technique for adding obscurity and layers to financial transactions is structuring. Suppose someone made $1m via illegal means and wants to place it into a legitimate bank account. They recruit several associates who deposit smaller chunks of $10k into their own accounts and then pay into one account via fake invoices, fake charity donations or shell investments. The original owner eventually gets all the money back, but it is difficult for compliance departments to see the complete picture of where the money originated.

Luckily, graphs show all this information in a single view, which makes investigations easier to carry out. Once you find a suspicious pattern, you can look at the people connected to it to detect accomplices and find similar patterns. See the graph above for an illustration.
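The structuring pattern described above can be sketched as a simple query over transfers: look for accounts that collect near-threshold payments from several distinct senders. The threshold, the 10% band, the minimum sender count and all account names below are assumptions chosen for illustration, not regulatory values.

```python
from collections import defaultdict

CHUNK_LIMIT = 10_000  # assumed chunk size, taken from the $10k example above

# Hypothetical transfers: (sender, receiver, amount).
TRANSFERS = [
    ("mule-1", "collector", 9_500),
    ("mule-2", "collector", 9_800),
    ("mule-3", "collector", 10_000),
    ("mule-4", "collector", 9_900),
    ("shop", "supplier", 9_700),
    ("employer", "alice", 4_200),
    ("mule-1", "landlord", 1_200),
]


def funnel_accounts(transfers, limit=CHUNK_LIMIT, band=0.1, min_senders=3):
    """Flag receivers collecting near-limit payments from many distinct senders.

    A payment counts as near-limit when it lies within `band` (10% by default)
    below `limit`. Returns {receiver: total near-limit amount received}.
    """
    inbound = defaultdict(dict)
    for sender, receiver, amount in transfers:
        if limit * (1 - band) <= amount <= limit:
            per_sender = inbound[receiver]
            per_sender[sender] = per_sender.get(sender, 0) + amount
    return {
        receiver: sum(per_sender.values())
        for receiver, per_sender in inbound.items()
        if len(per_sender) >= min_senders
    }


flagged = funnel_accounts(TRANSFERS)
print(flagged)
```

A single near-limit payment, such as the one from the shop to its supplier, is not flagged. It is the fan-in from several senders that stands out.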

The Role of Machine Learning

Machine learning, a subset of artificial intelligence, is a technique for teaching computer programs to accomplish tasks from data. Rather than giving the computer a series of exact instructions, one chooses an architecture and feeds data into it in a learning process that extracts patterns from the data.

Pattern matching is the core activity in anti-money laundering. While graphs are great for investigations, machine learning enables the deployment of automated pattern matchers that flag matching transactions or individuals and attach a confidence score. This reduces the workload for a compliance department by letting legitimate transactions through and highlighting riskier ones for manual review. Institutions need to be careful that their machine learning models are not overeager, producing many false positives that block legitimate transactions. It is common, for example, for anti-fraud algorithms to reject legitimate credit card transactions, leading to inconvenience for the shopper and lost revenue for the bank.
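The trade-off between catching launderers and blocking legitimate customers can be illustrated with a small triage sketch. The scores and ground-truth labels below are made up. In practice the scores would come from a trained model and the true labels would only be known after investigation.

```python
# Hypothetical scored transactions: (transaction_id, model_score, is_laundering).
SCORED = [
    ("t1", 0.95, True),
    ("t2", 0.90, False),
    ("t3", 0.80, True),
    ("t4", 0.65, False),
    ("t5", 0.60, True),
    ("t6", 0.40, False),
    ("t7", 0.30, False),
    ("t8", 0.10, False),
]


def triage(scored, threshold):
    """Summarise what happens when everything scoring >= threshold goes to manual review."""
    flagged = [row for row in scored if row[1] >= threshold]
    return {
        "flagged": len(flagged),
        # Legitimate transactions that were held up for review.
        "false_positives": sum(1 for _, _, bad in flagged if not bad),
        # Laundering transactions that slipped through unflagged.
        "missed": sum(1 for _, score, bad in scored if bad and score < threshold),
    }


lenient = triage(SCORED, threshold=0.5)
strict = triage(SCORED, threshold=0.85)
print(lenient)
print(strict)
```

With this data, the lower threshold catches every laundering transaction at the cost of more legitimate ones being reviewed, while the higher threshold inconveniences fewer customers but misses two of the three laundering cases.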

Clustering, a class of unsupervised machine learning algorithms, is well suited to finding outliers in a data set. This is useful for high-level analysis of accounts and individuals that behave very differently from other users, and can be a great first step in looking for money laundering.
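As a much simpler stand-in for clustering, the sketch below scores accounts on a single feature using the modified z-score, which is based on the median absolute deviation. It is not a clustering algorithm, but it shows the same idea of surfacing accounts that sit far from the bulk of users. The monthly volumes are hypothetical, and the 3.5 cutoff is a commonly used rule of thumb rather than a tuned value.

```python
from statistics import median

# Hypothetical total monthly transfer volume per account.
MONTHLY_VOLUME = {
    "acc-a": 2_100,
    "acc-b": 1_900,
    "acc-c": 2_500,
    "acc-d": 2_300,
    "acc-e": 1_800,
    "acc-f": 2_200,
    "acc-g": 48_000,
}


def robust_outliers(values, cutoff=3.5):
    """Return the keys whose modified z-score exceeds the cutoff.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. Unlike the mean and standard deviation, the
    median and MAD are barely moved by the outliers themselves.
    """
    centre = median(values.values())
    mad = median(abs(v - centre) for v in values.values())
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    return sorted(
        key for key, v in values.items() if 0.6745 * abs(v - centre) / mad > cutoff
    )


outliers = robust_outliers(MONTHLY_VOLUME)
print(outliers)
```

A real deployment would use several features per account and a proper clustering or density-based method, with this kind of score serving only as a first filter.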

Once suspicious patterns have been found, a technique called label propagation can be deployed to find similar actors and help see the big picture. Using such algorithms one can see money laundering rings.
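A minimal sketch of label propagation, assuming a small undirected graph of accounts and two seed labels: one account already flagged as suspicious (score 1.0) and one verified as legitimate (score 0.0). Every unlabelled node repeatedly takes the average score of its neighbours while the seeds stay fixed. The node names and the graph are hypothetical.

```python
# Hypothetical undirected links between accounts (shared transfers, devices and so on).
EDGES = [
    ("flagged", "m1"),
    ("flagged", "m2"),
    ("m1", "m2"),
    ("m2", "n1"),
    ("n1", "verified"),
]

# Seed labels: 1.0 means known suspicious, 0.0 means known legitimate.
SEEDS = {"flagged": 1.0, "verified": 0.0}


def propagate(edges, seeds, rounds=50):
    """Spread seed scores through the graph by repeated neighbour averaging."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Unlabelled nodes start at a neutral 0.5.
    scores = {node: seeds.get(node, 0.5) for node in neighbours}
    for _ in range(rounds):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]  # seeds are clamped to their label
            else:
                updated[node] = sum(scores[m] for m in adjacent) / len(adjacent)
        scores = updated
    return scores


scores = propagate(EDGES, SEEDS)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

The accounts closest to the flagged seed (m1 and m2) end up with scores well above 0.5, while n1, which sits next to the verified account, ends up below it. An investigator could review accounts in descending score order.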

With the recent advancement in natural language processing, machine learning models can be set loose onto news outlets to detect negative press mentions of certain politically exposed persons and correlate that with their financial activities to catalogue evidence of money laundering. 

By combining graph-native techniques such as clustering, PageRank, and weakly and strongly connected components, organisations can carry out several analyses that are currently complex. PageRank can help find the more relevant players in fraud rings. Louvain modularity finds subnetworks, which can make or break an investigation. Paths can be checked for similarities, making it possible to follow links back up the chain and deal with structuring and layering complexities.
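To show what PageRank does on a payment graph, here is a small pure-Python power-iteration sketch. The edges point from payer to payee and the account names are hypothetical. A graph database would normally provide this algorithm out of the box, so the code below only illustrates the idea.

```python
# Hypothetical directed payments: (payer, payee).
PAYMENTS = [
    ("mule-1", "collector"),
    ("mule-2", "collector"),
    ("mule-3", "collector"),
    ("collector", "shell"),
    ("shell", "offshore"),
]


def pagerank(edges, damping=0.85, iterations=100):
    """Compute PageRank by power iteration over a list of directed edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {node: 1 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is shared evenly among all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


ranks = pagerank(PAYMENTS)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 3))
```

In this toy chain the highest-ranked accounts are the ones where money pools after passing through the mules, which is the kind of player an investigator would want to look at first.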

Machine learning models, unlike human-led compliance departments, do not get tired and work at the speed of light. However, they are not meant to replace compliance officers. Rather, a unity of mission between the compliance officer and a useful algorithm makes for a powerful defence against ever-evolving money washing tricks.


Money laundering is a burden on the world, felt most heavily by under-regulated developing economies. Corruption, a major contributor, robs young developing-world governments of billions, and money laundering hinders the detection and prosecution of offenders. In the world of the hypergraph, a graph of graphs, money laundering can be detected and stopped, and evidence against perpetrators gathered. Here relationships are first-class citizens that capture how people and things are related, allowing complex patterns to be seen. Machine learning working with human experts can allow these patterns to be detected at scale and significantly reduce this burden.

However, graphs thrive on data: data that we generate through billions of mobile money and bank transfers, first-class data that in the wrong hands can reveal our spending habits, our local social networks and our political affiliations. Institutions deploying such advanced technologies need to be wary of the many possible negative consequences. Imagine if the police knocked on your door tomorrow because an algorithm predicted you were laundering money when you were not, and you had to spend time and money fighting the case in court!

Author Bio

Peter Kariuki is a Kenyan software engineer and entrepreneur. He believes technology will be the greatest equalizer in the 21st century. 


Neo4j was used to create the illustration graphs above. The Cypher queries used to create them are listed below.

// Nodes: the person, the bank accounts, the transactions and an IP address
MERGE (p:Person {name: "Other Guinean Minister"})
MERGE (b:BankAccount {name: "Other Hong Kong"})
MERGE (b2:BankAccount {name: "USA"})
MERGE (b3:BankAccount {name: "USA Account 2"})
MERGE (tx:Transaction {amount: 1500000, from: "Guinea"})
MERGE (tx2:Transaction {amount: 100000, from: "Hong Kong"})
MERGE (tx3:Transaction {amount: 199000, from: "Paris"})
MERGE (ip:IpAddress {name: ""})

// Relationships: payments into accounts, and the IP address shared by the person and an account
MERGE (tx)-[:PAID_INTO]->(b)
MERGE (tx2)-[:PAID_INTO]->(b2)
MERGE (p)-[:USED_IP]->(ip)
MERGE (b3)-[:FREQUENT_IP]->(ip)
MERGE (tx3)-[:PAID_INTO]->(b3)
