# Benford’s Law – How mathematics can detect fraud!


## 100 thoughts on “Benford’s Law – How mathematics can detect fraud!”


Not sure I followed all that. I had a different answer in mind. I would have guessed that the effect comes from the fact that you always start at zero. If you pick any 2 random numbers larger than zero, the smaller one is more likely to start with a one than another number. If the largest random number is some combination of nines, like nine hundred and ninety nine, then all the numbers equal or below it start with all numbers with equal frequency. But if the highest random number is anything less than double that, then the number of one starters goes up to more than half, and if the number is anything more than nearly 0.9 of that all nine number, then it is eliminating all the alternatives. On average, the largest number will be five or less, so half the time, the chance of the smaller number starting with a six or more is zero, and the chance of a one is (25.11r)% where r repeats the number for the magnitude. So a large number of five hundred would have a hundred hundreds, ten tens, and a one, or (25.11r)% of all numbers that the small number could be that start with one.

A fifth of the time the large number will start with a 2, which means a hundred and eleven 1 starters out of a maximum of 300 including 0. Every large number between the first 2 and eleven percent below it has better than average odds of being a one start. So no matter what the large random number is, the small one almost always has a better or equal shot at starting with one, and no other number will ever actually have a higher chance. The only way you can beat this is by, as you said, narrowing the data pool, by including a third random number to be the minimum instead of zero, and by ruling it so the largest number cannot exceed the smallest number's magnitude. Something like human height in feet is like this: the minimum, or maybe lower quartile, is not zero, and the maximum/upper quartile does not leave the order of magnitude, so the chances of starting with a one are incredibly small.

But if you were to guess the value of any random thing when you don't know what it is, the chances are that the largest it could possibly be is not all nines, in which case the actual size will be more likely to be a one starter than at least a nine, and whatever the max is, it will never raise the probability of another starter above that of a one.

Put simply, if there's an upper limit, it excludes non ones last, and real world things always have an upper limit.

Ningis? That's fiddling small change, surely?

You scared the crap out of me at the start of the video Prof. G

Anyone want to take a stab at, say, the distributions of, say, the reported county by county vote counts in MI, WI and PA in the 2016 presidential election? How about the other states as well. This should be interesting 😉

We can actually determine whether there is a god by checking whether nature's probability follows the bell-shaped curve. If everything in nature is bell-curve shaped, there is no god and things just happen according to nature's frequency. However, if there is a natural thing that does not follow the bell-shaped curve, then god has a hand in it and there is a god.

If Benford's law holds universally, then when the example of physical constants is given and the number that turns up is ~40%, I imagine that means we are yet to find many more constants, so that the proportion of constants starting with 1 will be closer to 30%.

Holy crap, there's a guy in china who is either 30cm tall or 3m tall?

Does Benford's Law remain true for different number bases? If you took data that conformed to Benford's law in Base 10, and converted it to Base 7, or Base 9, would it still conform to the law?
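On the base question above: the usual generalisation of Benford's law to base b gives the leading digit d a probability of log_b(1 + 1/d), for d from 1 to b - 1. The following is a minimal Python sketch rather than anything from the video. It uses the first 1000 powers of 2 as an assumed stand-in data set, and the helper names are purely illustrative.

```python
import math
from collections import Counter

def leading_digit(n: int, base: int) -> int:
    """Return the most significant digit of a positive integer n in the given base."""
    while n >= base:
        n //= base
    return n

def benford_expected(d: int, base: int) -> float:
    """Benford probability of leading digit d in the given base: log_base(1 + 1/d)."""
    return math.log(1 + 1 / d, base)

def observed_frequencies(values, base):
    """Map each possible leading digit to its observed share in `values`."""
    counts = Counter(leading_digit(v, base) for v in values)
    total = len(values)
    return {d: counts.get(d, 0) / total for d in range(1, base)}

# Assumed sample (not from the video): the first 1000 powers of 2.
sample = [2 ** k for k in range(1000)]

for base in (10, 7, 9):
    freqs = observed_frequencies(sample, base)
    print(f"base {base}")
    for d in range(1, base):
        print(f"  {d}: observed {freqs[d]:.3f}, expected {benford_expected(d, base):.3f}")
```

One caveat on the choice of sample: powers of 2 would not be a fair test in base 2, 4 or 8, where every power of 2 has a very regular representation.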

"that one guy in china"

Do this for county by county per state in the US presidential election 😉 Very revealing.

Using the FT runs a serious risk of selection bias. The price of sterling in dollars starts with 1 and has for decades. There may be whole columns or even sections containing only such numbers. Now consider the corresponding page from the FT in the period 1915 to 1975: there would be zero such numbers.

thanks for the fraud advice :p

Step 1: Add this to the video's link at the top of your screen: &t=1

Step 2: Keep pressing F5

It seems like this concept could be exploited for data compression.

I've done so many practice problems in my AP Stats class that have to do with Benford's Law..

Curious. Is it possible that numbers used by people tend to begin with low numbers for psychological reasons? E.g. in setting quantities or prices or purchase amounts. So they aren't random.

Pi and Euler itself… is a fraud 😀

Why all the shouting? Couldn't you be closer to the mic? And what happened to the brown paper?

3:24

TRIIIGGGGGGGGGGGGEEEERRRRRRREEEEEEEEEEEDDDD

24 seconds before I realised this was from 2011. Up until that point I was wondering what James was getting so damn animated about.

you forgot to circle Bob Dylan, he's number 1

Isn't this obvious? Of course 1 turns up more. We start at one and count forward. The lower numbers will always appear first and we cut things short before we get to a higher number. We always use smaller numbers more and then increase from there slowly. It's why small change and small notes get passed around more than higher notes ie. $5 vs $100 note usage.

I still didn't get the log stuff though. I'm so used to linear number counting that logs confuse me 🙁

Have you guys ever heard the "the law of near enough"? This is a great example for it. #Vsauce lul

Who wrote these subtitles ?

Well explained. Thanks!

But what about Portugal's plea for bail-out?

holshit i feel like Ive watched magic show. Amazing because I trust the reality of this channel but I have no idea how logs actually create this thing. Feeling dumb lol

This was explained very well. Thank you

wtf math

Awesome video ^_^ This might be a stupid question, but I wonder if part of the reason for the high percentage of 1-s in any number has anything to do with the psychological aspect of the whole? Because when we use data, a lot of times we compare something to something whole. Similarly like how the trigonometrical circle works 🙂 It has a radius of 1. Now of course you have infinite possible values for different sin cos tg ctg asin atg etc… but most of them are going to contain 1, because you use 1 circle with radius of 1 and you compare all the numbers with this whole system, if that makes sense…

I see Portugal, I upvote

That probability of hitting anything with a dart assumes you don't miss entirely

This is amazing

my paypal has exactly 99c in it. you've failed me this time, math.

85% of made up statistics are multiples of 5

75% of made up statistics are multiples of 25

60% of made up statistics are multiples of 10

50% of made up statistics are "50%"

n1

Wow the Chinese are really tall, wait..

I looked at the views in recommendations by youtube on the right side of the window. There were 20 videos and 6 of them began with a number 1 ! That is exactly 30%

Now I am unhappy and unsatisfied that it actually works XD It takes some fun out of life doesn't it!

James makes this show worth watching.

Impossible! 100% of numbers start with 0!

I just graduated college(Comp Sci) and I didnt know wth a log actually was until you explained it. The fuck…

I wonder if something similar appears in other bases…

which numbers follow benfords law and which are "too random" or are sampled from a distribution that is "too specified" ?

Dear diary, today I learned how to commit fraud.

"… you can measure it in km or cm and it will start with 30% 1s"

I can do better: I bet you, if I measure something in km and in cm (and I measure correctly), both measurements will always start with the same number: How surprising!

You could develop a rounding method based on Benford's law.

Because 10^0.5 is approximately 3.16, if the least significant digit is 3 or less, round down; 4 or higher, round up. 4 is "logarithmically closer" to 10 than to 1. People used to multiply with slide rules. It is really easy to see the logarithmic distribution of digits by comparing the log and regular scales.

That's almost exactly a perfect logarithmic scale. lol holy crap.

I know it's too late, but what formula did you use to calculate the interest? I used

Y=1(1+(.1/365))^(365X)

and got these numbers

1, 1.10, 1.22, 1.34, 1.49, 1.64, etc…

from 0 to 5
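The formula in the comment above is daily compounding at 10% a year. Below is a small sketch that reproduces it next to plain annual compounding for comparison. Whether the video used annual compounding is an assumption here, and the function names are illustrative only.

```python
import math

def daily_compounding(years: float, rate: float = 0.10, principal: float = 1.0) -> float:
    """Y = P * (1 + r/365)^(365 * X), the formula quoted in the comment."""
    return principal * (1 + rate / 365) ** (365 * years)

def annual_compounding(years: int, rate: float = 0.10, principal: float = 1.0) -> float:
    """Y = P * (1 + r)^X, assumed here as the simpler alternative."""
    return principal * (1 + rate) ** years

for x in range(6):
    print(x, f"daily {daily_compounding(x):.4f}", f"annual {annual_compounding(x):.4f}")

# Under steady exponential growth, the share of time the balance spends with a
# leading digit of 1 (between 1 and 2, out of the whole climb from 1 to 10)
# is log10(2), whatever the interest rate.
print(f"share of time starting with 1: {math.log10(2):.3f}")
```

The daily figures come out near 1.1052, 1.2214, 1.3498, 1.4917 and 1.6486, which match the commenter's list when cut to two decimal places.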

2:34 – "It doesn't matter what unit you use. You want to measure […] in kilometers […] centimeters […] it will still be the same." Anyone should deeply expect that the most significant digit for anything given in different METRIC units be the same for the same things …
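The comments on units above are right that a purely metric change, which is a factor of a power of ten, can never alter a leading digit. The more interesting case is a conversion factor that is not a power of ten. Here is a sketch under stated assumptions: powers of 2 stand in for a Benford-like data set, and the factor 3 stands in for an arbitrary non-metric conversion. Neither comes from the video.

```python
def leading_digit(n: int) -> int:
    """First decimal digit of a positive integer."""
    return int(str(n)[0])

def share_of_ones(values) -> float:
    """Fraction of values whose leading digit is 1."""
    values = list(values)
    return sum(1 for v in values if leading_digit(v) == 1) / len(values)

# Assumed Benford-like sample: the first 1000 powers of 2.
sample = [2 ** k for k in range(1000)]

metric = [v * 100_000 for v in sample]  # e.g. km -> cm, a power of ten
other = [v * 3 for v in sample]         # hypothetical non-metric conversion factor

# A power-of-ten factor leaves every individual leading digit untouched.
same_digits = all(leading_digit(a) == leading_digit(b) for a, b in zip(sample, metric))

# A different factor changes individual leading digits...
changed = sum(1 for a, b in zip(sample, other) if leading_digit(a) != leading_digit(b))

print("metric change keeps every leading digit:", same_digits)
print("values whose leading digit changed under factor 3:", changed)
# ...but the overall share of 1s stays close to log10(2), about 0.301.
print(f"share of 1s before: {share_of_ones(sample):.3f}, after: {share_of_ones(other):.3f}")
```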

+Marcus Anderson

i am reproducing here – a comment (slightly edited) made earlier by one Marcus Anderson

"+Wowmaxy yes computers expose the slight fallacy in all this. More correctly, in computers, numbers starting with 0 occur 50% of the time, whereas in publishing – no numbers >= 1 start with zero – ever ! That is the missing part of this story. When you do digit analysis of a page of numbers – you must pad all the small numbers with zeros – to make them all the same length. Now you will see the true distribution, and find that only 10% of all n-digit numbers (for a fixed value of n) actually start with 1, as expected. The party trick here is that, in publishing, leading zeros are omitted for brevity, albeit perhaps incorrectly. Engineers recognise this issue and standardise engineering notation with a 3 digit mantissa and an exponent in powers of 3. (powers of 3 ???) Thus numbers should be published in a standardised engineering notation style (eg $0.03K = $30 and 0.16Mm = 160Km (etc))."

i think Marcus Anderson may have a valid point – as to why the first digit – and only the first digit – of CERTAIN (could-be-better-defined) data sets – show a probability distribution skewed toward the logarithmic – rather than evenly distributed – among the digits 1 thru 9.

perhaps Mr Grimes might consider getting back to us on the merits of Mr Anderson's pleadings.

i think it might shed some more light on this still-rather-murky Benford Business.

what say -James ?

PINEAPPLES !!!

I FOUND YOUR CHANNEL!!

Actually it's not that surprising if you consider that most ~~numbers~~ measurements are floating point numbers essentially. So when we are talking about their leading digits we are not really talking about their magnitude but their significance. So we shift the numbers up and down and in binary the first digit will be always 1, so much so it can be ignored, and a 1 assumed to be there, except in the case of 0 and other special forms.

Not really surprising, given these numbers go up to a value and therefore are very biased in their production. You have to show a big lot of something on shop display (the abundance theory of shop display: you show a lot and people buy a lot) but having a big number is costly for supply reasons. Which is more likely they choose, 12 or 30? You will choose 12, which is less than 30, but you can't choose less than 10 because then it doesn't look like abundance. Only hard for mathematicians to understand.

Nature uses multiplication more often than it uses addition.

your bad

you just gained a VERY long overdue and VERY well-deserved subscription dude! Been watchin' your vids for years and plan to keep on doing so. It isn't every one who can explain such lofty concepts via such simple means. You're a rare kind of pimp…overall…unique…and i do mean pimp in the stud sense, and i do mean stud in the cool guy as opposed to the horse sense, as opposed to pimp in the sense that would equate to my recently having accused you of possessing and controlling and renting out and likely occasionally slapping a slew of hookers who like all people deserve happiness and basic human respect and dignity and sky and love and friendship and the chance to dance and so on should they desire to.

oh, so it's because every base has 0(all numbers have a leading 0, and base 0 only has one value, being 0), fewer have base 1, fewer base 2, so considering all numbers in all bases, larger numbers get rarer than small numbers.

…I suddenly realized that I NEED to go out to buy some Lottery tickets!… LOL!!!….

ohh snap, mind blown!

3 is everywhere but most times not in the beginning.

Its raining outside…

Of the first one hundred numbers in the fibonacci sequence exactly 30 start with a one.
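The Fibonacci claim above is easy to check. A minimal sketch, assuming the sequence starts 1, 1, 2, 3, … (starting from 0 instead would shift the window by one term); the function names are illustrative.

```python
from collections import Counter

def fibonacci(count: int):
    """Yield the first `count` Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    a, b = 1, 1
    for _ in range(count):
        yield a
        a, b = b, a + b

def first_digit_counts(values):
    """Count how often each leading decimal digit occurs."""
    return Counter(int(str(v)[0]) for v in values)

counts = first_digit_counts(fibonacci(100))
for d in range(1, 10):
    print(d, counts[d])
```

Benford's law would predict about 30.1 of the 100 terms starting with 1, so a count at or near 30 is what the law suggests.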

What are the odds that SRV would be on the cover of the Financial Times?

Excellent lecture and demo.

BL seems to be in the same category as the inverse of scalar fields QM-TIME style, and results in a conglomeration of primes over-lapping by pseudo-random quantities in a single connected quality of point(s)?

I'm not a Mathematician, just seriously interested in the context that includes Benford's Law, because it looks like the mathematical/temporal origin of chemical bonding characteristics?

Biological complexity, under these conditions, has no final limitation, because it's all about the universal Phi;- wave-package integration of time rates as relative logarithmic scales of the same nature as in the presentation.

___The cause-effect of QM-Time modulation Superspin-Superposition-singularity is the continuous, metastable consistency of here-now temporally in the spectrum of Eternity-now. BL is an identification of some Actuality structural relationships.

That is a true astronomer: 41.3% is roughly equal to 30.0%.

8:30 I think he forgot pink Gap 😛

I knew this instinctively, since a lot of numbers have an inverse log prevalence, and in a log scale graph, the area between 1.0 and 2.0 is bigger than the area between 2.0 and 3.0 and so on.

God I love your videos and enthusiasm James. Thanks for all you put into it.

You can hang some blankets off screen to soak up the echo/reverb in the audio 🙂

while watching this, I saw 19 upcoming videos, of which 6 had durations starting with 1. That equals about 32%. It's quite amazing.

So explain why keys of computer keyboard don't break down or vanish as the Benford's law stays?

looooool i was like why are you talking so fast? seems like i've been using youtube at 1.25 speed for the last day at least XD you are the first one to exceed realistic proportions at that speed haha

Could you please get MORE excited about math.

I've been studying lottery numbers to see if certain digits appear more often than others. One thing I've noticed is that the digit 4 is the least occurring digit in winning lottery numbers. I've kept track of the numbers for only three months, but my theory is still proving to be very strong.

I stopped watching once I learned the following section is for serious mathematicians only.

Gotem.

I remember learning about this in my accounting classes, but not my math classes. It's kind of interesting where you come across certain things.

Didn't realize I was going to learn how to get away with money laundering when I started clicking on math videos today.

Why, at 9:04, does the gap between 1 and 2 (3 blocks) have the same length as the gap between 2 and 4? Shouldn't it be half the length (1.5 blocks from 1 to 2)? And the gap between 4 and 10 is 4 blocks long, but shouldn't it be 6 blocks long instead? I really don't get it. I would be very glad if someone could explain it to me.
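On the spacing question above: on a logarithmic scale the length of a gap is the difference of the logarithms of its end points, so equal ratios (1 to 2, and 2 to 4) take up equal space. A rough sketch, assuming the strip from 1 to 10 is drawn 10 blocks long. That figure is inferred from the comment's 3 + 3 + 4 blocks and is not confirmed by the video.

```python
import math

TOTAL_BLOCKS = 10  # assumed length of the strip from 1 to 10

def gap_in_blocks(lo: float, hi: float, total_blocks: int = TOTAL_BLOCKS) -> float:
    """Length of the gap from lo to hi on a log10 scale spanning 1 to 10."""
    return (math.log10(hi) - math.log10(lo)) * total_blocks

for lo, hi in [(1, 2), (2, 4), (4, 10)]:
    print(f"{lo} to {hi}: {gap_in_blocks(lo, hi):.2f} blocks")
```

The gaps come to about 3.01, 3.01 and 3.98 blocks, which matches the 3, 3 and 4 blocks the commenter counted.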

IS there an easy proof?

From Wikipedia . . . "There are illustrative examples and explanations that cover many of the cases where Benford's law applies, though there are many other cases where Benford's law applies that resist a simple explanation."

Does "resist a simple explanation" mean "don't have an explanation"?

You better stop abusing that paper pal.

In binary it's every number except 0. :O

Coincidence?

Ratios near unity are going to start with ‘1’ . Particularly if you filter out any numbers that start with ‘0’!

Logarithms are similar… if you put everything into floating point format.

Also left-padded numbers

00001. Throw it out

00002. Throw it out

… …

09998. Throw it out

10000. The only real number!

This didn't age well

Sounds like a cousin of Zipf's Law. Both aspects of the same underlying law?

I would venture the guess that (2:45) indeed the results for "meters", "kilometers", and "centimeters" would be very similar when counting starting numbers.

Finally, a use for the common logarithm.

I built this into production control programs 20 years ago to catch people who were under-reporting gold wastage and pocketing the shortfall.

So what you're saying is that if you're going to be committing fraud you should remember Benford's law.

My eyes were hurting watching you NOT blink… How much coffee do you drink???

3:20 "that one guy in China" those dam Chinese and their dammed 3 inches.

You look so happy and excited at the fade-in at 3:26! 😀

But when I know that forensics looks out for that distribution then I can counteract it easily.

Wow, school failed to teach me how log really works, and you explained it to me in 10 minutes and I understand o.o

views: 411k

dislikes: 98

likes: 7.5k

comments: 803

not quite (although, to be fair)

subs: 1_99k

I looked at the number of views for all the recommended videos to the right of my screen. (Your results may vary). 12 out of the 39 videos had a number of views that started with the number "1".

Nice

greetings Singingbannana..i have a question for you..in speaking of the Triple Alpha Process, i think that astrophysicists have been fudging their numbers in order to make the process appear to account for the carbon in the universe, in order to prop up their naturalist theory..what are your thoughts on this?..thank you in advance!…a big fan!

U need moor subscribers!

I watched the part at the end, does that mean I'm a serious mathematician now?