Using Artificial Intelligence (AI) to detect fraud
Improving Governance using Benford's Law
Martin McCool - Bankhawk
Introduction
Improving governance can be a time-consuming, costly and labour-intensive process for any company. However, not employing a financial governance process leaves companies exposed to internal and external fraud, poor-quality reporting, and regulatory compliance issues. A comprehensive Forensic Data Analysis (FDA) project can provide companies with the governance they require, but it may prove expensive and labour-intensive and, if it is performed using spreadsheets, open to human error and data manipulation.
How can companies solve these problems and improve their financial governance? Machine Learning tools included in many ERP systems (SAP Predictive Analytics, Oracle Risk Management, etc.), standalone Machine Learning tools (R, for example) and mathematical approaches such as Benford’s Law can all be used by companies to improve their financial governance.
In this article we will focus on Benford’s Law and how it can be used to enhance a company’s financial governance.
Benford's Law
Benford’s Law, also known as the First Digit Law, was proposed by Frank Benford, an electrical engineer and physicist who worked for General Electric, in his 1938 paper “The Law of Anomalous Numbers”. In this paper, Benford notes that, in a set of numbers, the first digit is one far more often than would be expected. The law states, for example, that one will occur as the first digit around 30% of the time in a large dataset, with each succeeding digit occurring with diminishing probability (Weisstein, 2016).
Benford used numerous data sources in his paper, including: US populations, physical constants, and street addresses. Using these datasets, he defined the probabilities for each of the digits from one to nine. Tests since have shown the accuracy of Benford’s Law, with some caveats, including a test of the first 652,066 numbers in a Fibonacci sequence (Thornton & Long 2016).
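A test of this kind is easy to repeat on a smaller scale. The sketch below tallies the first digits of the first 1,000 Fibonacci numbers (a much smaller sample than the one cited above) and prints them next to the expected shares, assuming the standard first-digit form of Benford’s Law, P(d) = log10(1 + 1/d). The function name is illustrative only.

```python
from collections import Counter
from math import log10


def fibonacci_first_digit_shares(count):
    """Share of each leading digit (1-9) among the first `count` Fibonacci numbers."""
    tallies = Counter()
    a, b = 1, 1  # F(1) and F(2)
    for _ in range(count):
        tallies[int(str(a)[0])] += 1  # leading digit of the current term
        a, b = b, a + b
    return {d: tallies[d] / count for d in range(1, 10)}


shares = fibonacci_first_digit_shares(1000)
for d in range(1, 10):
    expected = log10(1 + 1 / d)
    print(f"{d}: observed {shares[d]:.3f}, Benford {expected:.3f}")
```

Python integers have arbitrary precision, so no special handling is needed for the very large terms later in the sequence.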
Benford's Law and Finance
The use of Benford’s Law for financial analysis has been proposed by Mark Nigrini, a professor of accounting at West Virginia University, in his publication “Benford’s Law: Applications for Forensic Accounting, Auditing, and Fraud Detection”. In this book he shows how Benford’s Law can be used to identify fraud, Ponzi schemes, tax evasion and more (Nigrini, 2012).
Indeed, research from Amiram, Bozanic and Rouen found that companies that significantly deviated from Benford’s Law were more likely to be investigated by the Securities and Exchange Commission (SEC) and that, once a restatement of their finances was made, they complied with Benford’s Law (Amiram et al., 2015).
Other research using Benford’s Law has shown that companies that do not comply with the law also underperform in the stock market (Saft, 2015).
Benford’s Law does not replace existing FDA processes, but it can be used in conjunction with other tools, to improve those processes or provide an independent analysis.
Benford's Law in Detail
In this section we will look at Benford’s Law in greater detail and how it can improve a company’s financial governance.
The Mathematics behind Benford's Law
When working with large datasets of real numeric data with a first digit in the range one to nine (1-9), one would instinctively expect each of these digits to appear as the first digit 11.1% of the time (1 in 9). However, Benford’s Law shows us, somewhat counterintuitively, that this is not the case. In fact, it shows that the lower-value digits occur with a much higher frequency than the higher-value digits. The digit one, for example, occurs approximately 30% of the time, and each digit after that occurs at a lower frequency than the one before it. A full list of the probabilities can be found in Table 1.0.
First Digit | Frequency of Occurrence (%) |
---|---|
1 | 30.10300% |
2 | 17.60913% |
3 | 12.49387% |
4 | 9.69100% |
5 | 7.91812% |
6 | 6.69468% |
7 | 5.79919% |
8 | 5.11525% |
9 | 4.57575% |
Table 1.0 – Benford’s Law Frequencies
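Benford’s Law can be stated as follows (when many powers of ten lie between the cut-offs):
$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, 2, \ldots, 9$$
where $P(d)$ is the probability that the first digit is $d$.
Formula 1.0 – A Benford’s Law Formula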
The formula (Formula 1.0) was used to generate the values in Table 1.0 (Weisstein, 2016). While Benford’s Law applies to scale-invariant data, it can also be applied to numbers chosen from a variety of non-scale-invariant sources, with certain caveats. For many years Benford’s Law was considered a phenomenological law with little understanding of how it worked. Indeed, some datasets appear to conform to a Benford’s distribution and some do not, which has made formulating a proof for Benford’s Law difficult. However, this may be due to deficiencies in the data rather than in the law. Hill notes omissions in the tables of universal constants (for example, the force constant one is omitted) and proposes that such a table might conform to a Benford’s distribution if it were “complete” (Hill, 1998).
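The frequencies in Table 1.0 can be reproduced in a few lines. This is a minimal sketch that assumes the standard first-digit form of the law, P(d) = log10(1 + 1/d); the names `benford_probability` and `benford_table` are illustrative only.

```python
from math import log10


def benford_probability(digit):
    """Expected share of values whose first digit is `digit` (1-9)."""
    if digit not in range(1, 10):
        raise ValueError("first digit must be between 1 and 9")
    return log10(1 + 1 / digit)


benford_table = {d: benford_probability(d) for d in range(1, 10)}
for d, p in benford_table.items():
    print(f"{d} | {p * 100:.5f}%")
```

The nine probabilities sum to one because the logarithms telescope to log10(10).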
Benford’s Law, whilst counterintuitive, becomes more intuitive when one thinks of the percentage change an integer needs in order to increment. For example, one needs to increase by 100% to change to two, whereas nine needs to increase by only 11.1% to increment (Kahn, 2016). A value that grows steadily therefore spends longer with a low first digit than with a high one. This also holds true when we think of numerical data in terms of the first digit, irrespective of the scale of the actual value (Table 2.0).
Digit | % of Change |
---|---|
1 | 100% |
2 | 50% |
3 | 33.3% |
4 | 25% |
5 | 20% |
6 | 16.7% |
7 | 14.3% |
8 | 12.5% |
9 | 11.1% |
Table 2.0 – The Percentage of Change to Increment Integers
If we take the concept a step further, we can calculate the sum of the percentages of change (x) and then express each x as a percentage of that sum. The result shows a distinctive correlation with a Benford’s distribution (Table 3.0 and Graph 1.0), which would appear to corroborate this hypothesis.
Digit | % of Change (x) | x as % of SUM(x) | Benford’s Law |
---|---|---|---|
1 | 100% | 35.34858% | 30.10300% |
2 | 50% | 17.67429% | 17.60913% |
3 | 33.3% | 11.78286% | 12.49387% |
4 | 25% | 8.83714% | 9.69100% |
5 | 20% | 7.06972% | 7.91812% |
6 | 16.7% | 5.89143% | 6.69468% |
7 | 14.3% | 5.04980% | 5.79919% |
8 | 12.5% | 4.41857% | 5.11525% |
9 | 11.1% | 3.92762% | 4.57575% |
Total | 282.9% | 100% | 100% |
Table 3.0 – Integer Increment Percentage vs. Benford’s Law
Graph 1.0 – Integer Increment Percentage vs. Benford’s Law (plotted series: % of Change to Increment Integers; Benford’s Law)
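The comparison in Table 3.0 can be recomputed as follows. This is a sketch only: it takes the percentage change for digit d to be 1/d, normalises by the sum, and sets the result beside the Benford expectation, assumed here to be log10(1 + 1/d). The variable names are illustrative.

```python
from math import log10

digits = range(1, 10)

# Percentage increase needed for first digit d to reach the next digit: 1/d.
change = {d: 1 / d for d in digits}
total_change = sum(change.values())

# Each digit's share of the total change, next to the Benford expectation.
share = {d: change[d] / total_change for d in digits}
benford = {d: log10(1 + 1 / d) for d in digits}

for d in digits:
    print(f"{d} | {change[d]:.1%} | {share[d]:.5%} | {benford[d]:.5%}")
print(f"Total | {total_change:.1%}")
```

The two columns follow the same downward curve but are not identical: the normalised shares overstate digit one and understate digits three to nine relative to Benford’s Law.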
Benford’s Law cannot be reliably applied to datasets with the following characteristics:
- Small sample sizes
- Truly random numbers
- Datasets with a defined minimum and maximum
- Artificially generated numbers
- Datasets influenced by human psychology (for example, prices ending in .99 in retail)
(Kahn, 2016)
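Two of these caveats can be illustrated with simulated data. The sketch below is an illustration under stated assumptions, not a test of real records: it compares random integers drawn uniformly between a fixed minimum and maximum (100 to 999) with values spread evenly on a logarithmic scale across six orders of magnitude. The sample size, ranges and seed are arbitrary choices.

```python
import random
from collections import Counter
from math import log10


def first_digit_shares(values):
    """Share of each leading digit (1-9) among the positive numbers in `values`."""
    # Scientific notation puts the first significant digit at position 0.
    tallies = Counter(int(f"{v:.15e}"[0]) for v in values if v > 0)
    n = sum(tallies.values())
    return {d: tallies[d] / n for d in range(1, 10)}


rng = random.Random(42)  # fixed seed so the run is repeatable

# Uniform random integers with a defined minimum and maximum.
bounded = [rng.randint(100, 999) for _ in range(20000)]
# Values spanning six orders of magnitude, uniform on a log scale.
wide = [10 ** rng.uniform(0, 6) for _ in range(20000)]

bounded_shares = first_digit_shares(bounded)
wide_shares = first_digit_shares(wide)

for d in range(1, 10):
    print(f"{d}: bounded {bounded_shares[d]:.3f}, "
          f"wide {wide_shares[d]:.3f}, Benford {log10(1 + 1 / d):.3f}")
```

In the bounded sample each first digit should appear roughly one time in nine, while the wide-ranging sample should track the Benford frequencies closely.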
Using Benford's Law to Improve Financial Governance
Improving financial governance and detecting fraud are growing concerns for corporate treasurers. An EY survey of over 650 executives from multiple industries and sectors showed that Internal Fraud is their highest priority when using FDA. This is higher than Mergers & Acquisition Risk, Money Laundering, Capital Projects Risk and Financial Statement Fraud. The survey also shows that companies are planning to increase budgets for FDA projects (EY, 2016).
Benford’s Law can provide a first step into the world of FDA or bolster existing FDA processes. Mark Nigrini, Professor of Accounting at West Virginia University, has championed the use of Benford’s Law in finance. His book Benford’s Law: Applications for Forensic Accounting, Auditing, and Fraud Detection shows numerous uses for the law in finance, including identifying fraud, Ponzi schemes, tax evasion and more (Nigrini, 2012).
Further research from Amiram, Bozanic and Rouen, in their paper Financial Statement Errors: Evidence from the Distributional Properties of Financial Statement Numbers, studied more than 40,000 company reports from 2001-201 and showed that those company records, compiled in aggregate or by industry, could be described by Benford’s Law; 86% of them complied with the law. When analysing data from companies that were successfully investigated by the Securities and Exchange Commission (SEC), they found that the data deviated significantly from Benford’s Law, whereas the later restatements of those finances complied with it (Amiram et al., 2015).
It is very hard for human beings to create fraudulent transactions that comply with a Benford’s distribution. Attempts to make data look “real” and “natural” in an intuitive manner will almost invariably fail to match the probabilities that Benford’s Law predicts.
Research by Deutsche Bank has also linked compliance with Benford’s Law to a company’s performance in the stock market. The study, which included Russell 3000 companies and companies in other global markets, found that the law could be applied to all companies’ financial reports, including balance sheets, income statements, and sales reports. It found that companies that did not comply with Benford’s Law also significantly underperformed in the stock market compared with those that did comply. “Stocks with potential accounting irregularities underperform the market significantly...” and “...more importantly, companies with accounting irregularities exhibit more severe drawdowns, higher volatility, and lower risk-adjusted returns compared to the market portfolio,” quantitative strategists at Deutsche Bank led by Yin Luo wrote in a March report. Deutsche Bank also looked at over 40 years of data from Enron, which went bankrupt in 2001, and found that Enron’s data had large divergences from Benford’s Law (Saft, 2015).
It must be noted that Benford’s Law is not a silver bullet for fraud detection and it, like most other methods and tools, can show numerous false positives; it does, however, provide a comparative view of a company’s finances. Outliers can then be identified, assessed and corrected. Greater visibility in turn provides greater governance, which may then make a company more profitable.
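As a rough illustration of how such a comparative view might be produced, the sketch below tallies the first digits of a list of amounts and reports how far each observed share sits from the Benford expectation, together with the mean absolute deviation (MAD) across the nine digits as one possible summary. It is not the method used in the studies cited above. The function names and the sample amounts are made up for illustration, and no threshold for a "significant" deviation is implied.

```python
from collections import Counter
from math import log10

# Expected first-digit shares, assuming P(d) = log10(1 + 1/d).
BENFORD = {d: log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(amount):
    """First significant digit of a non-zero amount, ignoring its sign."""
    return int(f"{abs(amount):.15e}"[0])


def first_digit_report(amounts):
    """Return observed shares, deviations from Benford, and the MAD."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    if not digits:
        raise ValueError("no non-zero amounts to analyse")
    counts = Counter(digits)
    n = len(digits)
    observed = {d: counts[d] / n for d in range(1, 10)}
    deviation = {d: observed[d] - BENFORD[d] for d in range(1, 10)}
    mad = sum(abs(x) for x in deviation.values()) / 9
    return observed, deviation, mad


# Made-up ledger amounts, for illustration only (far too few for a real test).
sample_amounts = [100.00, 150.25, 200.00, 3000.00, 0.00, -450.10]
observed, deviation, mad = first_digit_report(sample_amounts)
for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.3f}, deviation {deviation[d]:+.3f}")
print(f"MAD: {mad:.4f}")
```

Digits with a large positive deviation point to groups of transactions that may merit a closer look; given the false positives noted above, they are a starting point for review rather than evidence of fraud.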
Conclusion
Benford’s Law (also known as the First Digit Law) is a statistical law that defines a probability distribution for real numerical data with a first digit in the range of one to nine. The law runs counter to the intuitive assumption that each first digit in a dataset would occur one in nine times (11.1%). The law has been shown to hold for numerous data sources from varying fields and is applicable to financial data.
Research from Nigrini and from Amiram et al. has shown that Benford’s Law is applicable to companies’ financial data and that it can be used to identify fraud, Ponzi schemes, tax evasion and more. Their research found that publicly listed companies’ financial data complies with a Benford’s distribution around 86% of the time and that companies that diverged greatly from Benford’s Law were more likely to be investigated by the SEC; once restated, the data complied with Benford’s Law when reanalysed.
Research from Deutsche Bank has also shown that companies in the Russell 3000 and in other global markets that comply with Benford’s Law outperform, in the stock market, those that diverge significantly from the law.
Companies are increasingly concerned with Internal Fraud and are planning to increase Forensic Data Analytics (FDA) budgets.