Benford's Law
eBook - ePub

Benford's Law

Theory, the General Law of Relative Quantities, and Forensic Fraud Detection Applications

Alex Ely Kossovsky

  1. 672 pages
  2. English

About the book

Contrary to common intuition that all digits should occur randomly with equal chances in real data, empirical examinations consistently show that not all digits are created equal, but rather that low digits such as {1, 2, 3} occur much more frequently than high digits such as {7, 8, 9} in almost all data types, such as those relating to geology, chemistry, astronomy, physics, and engineering, as well as in accounting, financial, econometrics, and demographics data sets. This intriguing digital phenomenon is known as Benford's Law.

This book represents an attempt to give a comprehensive and in-depth account of all the theoretical aspects, results, causes and explanations of Benford's Law, with a strong emphasis on the connection to real-life data and the physical manifestation of the law. In addition to such a bird's eye view of the digital phenomenon, the conceptual distinctions between digits, numbers, and quantities are explored, leading to the key finding that the phenomenon is actually quantitative in nature. It originates from the fact that, in extreme generality, nature creates many small quantities but very few big quantities, corroborating the motto "small is beautiful". The phenomenon therefore applies just as well to data written in the numeral systems of the ancient Roman, Mayan, Egyptian, and other digit-less civilizations.

Fraudsters are typically not aware of this digital pattern and tend to invent numbers with approximately equal digital frequencies. The digital analyst can easily check reported data for compliance with this digital law, enabling the detection of tax evasion, Ponzi schemes, and other financial scams. The forensic fraud detection section in this book is written in a very concise and reader-friendly style. It gathers all known methods and standards in the accounting and auditing industry, summarizes and fuses them into a single coherent whole, and can be understood without deep knowledge of statistical theory or advanced mathematics. In addition, a digital algorithm is presented, enabling the auditor to detect fraud even when the sophisticated cheater is aware of the law and invents numbers accordingly. The algorithm employs a subtle inner digital pattern within the Benford pattern itself. This newly discovered pattern is deemed to be nearly universal, being even more prevalent than the Benford phenomenon, as it is found in all random data sets, Benford as well as non-Benford types.

Contents:

  • Benford's Law
  • Forensic Digital Analysis & Fraud Detection
  • Data Compliance Tests
  • Conceptual and Mathematical Foundations
  • Benford's Law in the Physical Sciences
  • Topics in Benford's Law
  • The Law of Relative Quantities


Readership: Researchers in probability and statistics, forensic data analysis.

Key Features:

  • The book is a concise account of all known aspects in practical applications of the phenomenon to fraud detection. It also corrects several errors committed in the field where mistaken applications are used.
  • The perceptive reader such as an accountant, an auditor or an official at any governmental tax authority worldwide, interested in knowing about the use of this digital law in fraud detection, would be able to learn about it with ease and with a minimal amount of effort and time, instead of searching through literally hundreds of various small articles on the topic
  • The book provides numerous new theoretical points of view of the phenomenon, new methods for testing data for compliance, and fuses many different aspects of the law into a singular explanation

Information

Publisher
WSPC
Year
2014
ISBN
9789814583701
Section 1
BENFORD’S LAW
DIGITS VERSUS NUMBERS
The typical statistician, during a typical day at the office, spends most of the time intensely staring at data charts and scatter plots, seeking real or imaginary patterns where perhaps none exist, summarizing data, calculating averages and standard deviations, regressing and correlating seemingly unrelated variables, analyzing subtle variances between related data sets to determine whether they are significantly or randomly different from each other, dissecting and bisecting those pesky numbers sent by clients, government agencies, companies, and research institutes.
Interestingly, the statistician has recently been taking on the role of a philosopher of sorts, and instead of examining the numbers themselves as is the standard practice, he or she is investigating the digital language utilized in writing those numbers. What letters are to words, digits are to numbers. Why should a poetry lover seek any patterns or beauty by looking into the letters in Shakespeare’s prose instead of the elegantly combined words? Yet the relative proportions of our ten digits 0 to 9 occurring within our typical everyday numbers are now being routinely recorded and investigated by statisticians and data analysts, who even theorize, by applying mathematical and statistical reasoning, as to how exactly the digits should be spread within any given data set. Moreover, the study of digit proportions is further subdivided by classifying them into different categories according to position. For example, the specific proportions of the leftmost digit, namely the first digit of numbers, are examined separately. Another separate analysis is performed on the second-leftmost digit, which indeed shows quite different digital proportions than those of the first digit. But aren’t all digits supposed to be occurring randomly and thus equally distributed? Why should the digit 4, for example, have a higher or lower chance of occurring within numbers than, say, the digit 5? One wonders whether the occurrences of digits themselves within numbers are just ‘too random’ for the statistician to even consider and analyze. Is there indeed a particular statistical law supposedly governing digital proportions? In addition, it seems doubtful that there would be any use or consequence in looking into these digital proportions in the first place. Are there any applications that can exploit the examination of these digital proportions?
The answers to the latter two questions are both decisively positive, as evidenced by the newly created role recently assigned to the statistician as a private detective, who utilizes known digital patterns in data to detect fraud, knowing that fake data probably lacks those particular digital patterns. Previously, the task of the statistician was merely to analyze data, but never to decide on the authenticity of the provided data. Data was traditionally always taken as a given without any ability to authenticate. For how could the unsuspecting, honest and naive statistician know that people were sending him or her fake data that was merely invented? One incentive to fake data and reduce reported revenues and income would naturally be to lower tax payments. Another incentive is the temptation to inflate revenues and profits in order to impress investors and present the company in a better light as being financially sound. Therefore there is a strong need on the part of tax authorities, governmental financial regulatory and supervisory agencies worldwide, as well as auditing and accounting companies and others, to obtain professional statistical advice as to how to detect fake data. By wearing that philosopher’s hat and examining the digital language used in writing the numbers in provided data sets, the statistician is then able to wear his or her other hat, namely the detective’s hat, and forensically analyze data for any possible fraud.
TO FIND FRAUD, SIMPLY EXAMINE ITS DIGITS!
As our civilization progresses, we are able to do things previously thought impossible. Our collective mathematical and technological abilities have reached fantastic heights. We literally perform magic with our computers and other gadgets. But can we perform the simple task of telling when a friend or a spouse lies? Perhaps not, but the truly sophisticated statistician, aware of the latest developments in the field, can nowadays detect straight-faced fraudsters when presented with their fake data. Underpinning this ability is the fact that to concoct authentic-looking data one must know something about the particular properties of their digital language, while most fraudsters haven’t got a clue about the topic, and mistakenly believe that digital equality rules the universe of numbers. Yet in fact, low digits such as 1, 2, and 3 actually occur with very high frequencies within the first-place position of typical everyday data, while high digits such as 7, 8, and 9 have very little overall proportion of occurrence. So much so that the proportion of everyday typical numbers starting with digit 1 is about seven times that of numbers starting with digit 9! About 30% of typical everyday numbers in use start with digit 1, while only about 4% start with digit 9.
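The rounded figures of about 30% and about 4% agree with the standard logarithmic statement of Benford's Law, P(d) = log10(1 + 1/d), which this excerpt does not spell out. The following is a minimal sketch that assumes that formula; the function name `benford_first_digit` is illustrative only.

```python
import math

def benford_first_digit():
    """Expected first-digit proportions for d = 1..9.

    Assumes the standard logarithmic form P(d) = log10(1 + 1/d);
    the excerpt itself quotes only the rounded values ~30% and ~4%.
    """
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

probs = benford_first_digit()
for d, p in probs.items():
    print(f"digit {d}: {p:.1%}")
print(f"ratio P(1)/P(9): {probs[1] / probs[9]:.2f}")
```

Under this formula digit 1 leads about 30.1% of numbers and digit 9 about 4.6%, a ratio of roughly 6.6 to 1, which is in line with the rounded figures in the text.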
In order to illustrate the ability of utilizing this peculiar digital phenomenon in fraud detection, we shall digitally analyze hypothetical accounting data from five different companies where amounts represent revenues. The table in Fig. 1.1 shows 25 dollar amounts from each company. Nothing seems unusual or suspicious if we merely focus on the numbers themselves. Yet, if we forensically investigate the digital language used in writing those numbers, namely the digits at the very beginning of each number (the leftmost ones), we can immediately reveal an abnormality with one particular data set. Figure 1.2 shows the proportions of the first digits for all five companies.
Clearly, MF Capital comes under strong suspicion in the eyes of the expert statistician, since typical accounting data rarely comes with anything near digital equality for the first position. First-digit proportions of the other four companies show an overall pattern of gradual decrease, consistent with the expected pattern in almost all types of accounting data. The set of the first digits for MF Capital revenue data (commas omitted) is {4736281255914389752766432}, which is distinctly different compared to say Alcoa’s {6111119321441128225618431}. Digits at the second and third positions are much more equal in proportions for all five companies and do not show any particular pattern; they also do not single out MF Capital in any way. Had the focus of the statistician been misplaced on those digits, there wouldn’t be any clue about MF Capital’s possible fraudulent activities.
Figure 1.1 Hypothetical Accounting Data for Five Companies
Figure 1.2 1st Digits Proportions of the Data of Five Companies
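Since the underlying table is not reproduced here, the comparison can be retraced from the two first-digit sequences quoted in the text. The sketch below simply tallies them; the helper name `first_digit_counts` is an assumption made for illustration.

```python
from collections import Counter

# First-digit sequences exactly as quoted in the text (25 amounts each).
MF_CAPITAL = "4736281255914389752766432"
ALCOA = "6111119321441128225618431"

def first_digit_counts(digits):
    """Return the number of occurrences of each digit 1..9 in the string."""
    tally = Counter(digits)
    return [tally[str(d)] for d in range(1, 10)]

mf_counts = first_digit_counts(MF_CAPITAL)
alcoa_counts = first_digit_counts(ALCOA)
print("digit     :", list(range(1, 10)))
print("MF Capital:", mf_counts)
print("Alcoa     :", alcoa_counts)
```

MF Capital's counts are nearly flat, with every digit leading between 2 and 4 of the 25 amounts, whereas in Alcoa's data digit 1 alone leads 10 of the 25.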
FIRST LEADING DIGITS
First Leading Digit (LD) or First Significant Digit is the first (non-zero) digit of a given number appearing on the leftmost side. For 567.34 the leading digit is 5. For 0.0367 the leading digit is 3, as we discard the zeros. For the lone integer 6 the leading digit is 6. For negative numbers we simply discard the sign, hence for -62.97 the leading digit is 6. Another way of defining the first digit of any number is by writing it in scientific notation as A × 10^N with N being an integer and A being a real number such that 1 ≤ |A| < 10. For such representation of numbers, the integral part of A (excluding the fractional part), and with the positive or negative sign ignored, is what we consider the first leading digit. For example, the number 311.75 is scientifically written as 3.1175 × 10^2 and digit 3 leads the number. Naturally, when digit d appears first in a number composed of several digits, we call d the ‘leader’, as it leads all the other digits trailing behind it to the right.
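The scientific-notation definition translates directly into code. The sketch below is one possible implementation, not the author's: it uses Python's `decimal` module to avoid floating-point rounding in the exponent, and the function name `leading_digit` is hypothetical.

```python
from decimal import Decimal

def leading_digit(x):
    """First significant digit of x, via the form A × 10^N with 1 <= |A| < 10."""
    d = Decimal(str(x))
    if not d.is_finite() or d == 0:
        raise ValueError("leading digit is undefined for zero or non-finite values")
    n = d.adjusted()      # the integer exponent N
    a = d.scaleb(-n)      # the mantissa A, so that x = A × 10^N
    return int(abs(a))    # integral part of A, sign ignored

for value in (567.34, 0.0367, 6, -62.97, 311.75):
    print(value, "->", leading_digit(value))
```

The loop reproduces the examples in the text: 567.34 gives 5, 0.0367 gives 3, 6 gives 6, -62.97 gives 6, and 311.75 gives 3.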
EMPIRICAL EVIDENCE FROM REAL-LIFE DATA ON DIGIT DISTRIBUTION
Perhaps it is tempting to intuit that for numbers in typical real-life data sets, all nine digits {1, 2, 3, 4, 5, 6, 7, 8, 9} should be equally likely to occur and thus uniformly distributed. Let us examine three typical data sets from a variety of real-life situations where digital results run counter to that misguided intuition and where, surprisingly, low digits such as 1, 2, and 3 are strongly favored over high digits such as 7, 8, and 9. The three data sets to be digitally examined are: (I) stock market prices and volume of stock traded, (II) the 10 by 10 multiplication table, and (III) house numbers in typical address data.
Examination of first digits of closing prices and daily volume of stocks traded on the New York Stock Exchange on December 23, 2011 reveals a definite pattern in which digital proportions are almost monotonically and consistently decreasing. The first 31 companies on top of the alphabetically-sorted list were arbitrarily chosen. Figure 1.3 shows the extracted data.
Low digits lead much more often than high digits, for both stock prices and volume. Figure 1.4 shows the exact LD distributions for this limited set of 31 companies. It should be noted that almost all other such subsets down the long list on the NYSE website yield quite similar results, that there was nothing unusual about the trading day of the 23rd of December 2011, and that very similar digital results are obtained on other trading days.
Let us examine LD of the 10 by 10 multiplication table that we all were forced to memorize at school against our will, as shown in Fig. 1.5(A).
Surprisingly, out of 100 numbers, 21 start with the lowest digit 1 (shown in large and bold font), and only five start with the highest digit 9 (shown within circles), namely a ratio of 4:1 roughly. This result is surprising yet approximately compatible with the digital results seen in the example with stock prices and volume data. In this digital analysis the numbers 1, 10, and 100 are grouped together under the same category since all of them are being led by digit 1. Digital proportions here are {21%, 17%, 13%, 14%, 8%, 9%, 6%, 7%, 5%}.
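The tally for the multiplication table is easy to reproduce. The sketch below builds all 100 products and counts their first digits; the helper name `first_digit` is chosen only for illustration.

```python
from collections import Counter

def first_digit(n):
    """First digit of a positive integer, e.g. 100 -> 1."""
    return int(str(n)[0])

# All 100 entries of the 10 by 10 multiplication table.
products = [i * j for i in range(1, 11) for j in range(1, 11)]
tally = Counter(first_digit(p) for p in products)

# With exactly 100 entries, each count is also a percentage.
counts = [tally[d] for d in range(1, 10)]
print(counts)  # [21, 17, 13, 14, 8, 9, 6, 7, 5]
```

The counts match the proportions listed in the text, including the slight rise from digit 3 to digit 4 that keeps the decrease from being strictly monotonic.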
Figure 1.3 Price and Volume of Stocks Traded on the NYSE
Interestingly, if the digital as...
