Application Center - Maplesoft

Contact Maplesoft Request Quote

Submit your work

Application Center Applications First Digit Simulation Preview

App Preview:

First Digit Simulation

You can switch back to the summary page by clicking here.

Learn about Maple

Download Application

FIRST DIGIT SIMULATION

Author: Roland Engdahl
University of Kalmar
Sweden
mr.engdahl@telia.com

Background

'First Digit' is the first digit, not equal zero, in a number. i e the first significant digit. e g in 0.0003457 the first digit is 3.

HistoryNewcomb noticed in 1881 that in logarithm tables the first pages, containing numbers starting with 1, were more dirty than later pages.Benford rediscovered the properties in 1938. He formulated a law for the occurence of first digits:Benfords Law: The probability that the first digit is k is log(1+1/k) ( log(a) = There are two aspects of Benfords law:1. Application to numerical data2. Numerical calculation

Application to numerical data

On the WEB we can get many applications to numerical data e g physical constants,
stock market, rivers area, population, Am. league, addresses, squareroots, death rate, newspapers.

A more interesting application of Benfords principle is fraud detection in business accounts.

Numerical calculation

Before 1600 it was hard to do the product of several factors, even worse to divide.

John Napier (1550-1617) invented logarithms. In tables you could find the logarithm for a number.

To multiply numbers you had to add the logarithms and to divide you had to subtract the logarithms.

At last you had to go the other way in the tables, finding the 'antilogarithm'.

This method was common in schools and applications until the seventies.

An important facility was the slide rule. The scale is logarithmic, so the distance between 1 and 2 is the same as the distance between 2 and 4 etc

To multiply is to add distances on the slide rule, to divide is to subtract distances on the slide rule.

After some multiplications or divisions of numbers taken at random the probability to end betwwen 1 and 2 is about 30 percent (log(2)).

If you are the owner of a slide rule compare the distance between 1 and 2 with the distance between 1 and 10.

Simulation with random numbers

Quotation from Donald E. Knuth, The Art of Computer Programming vol 2 p 173

The most prudent policy for a person to follow is to run each Monte Carlo program at least twice
using quite different sources of random numbers, before taking the answers of the program seriously;
this not only will give an indication of the stability of the results,

it also will guard against the danger of trusting in a generator with hidden deficiencies.

(Every random number generator will fail in at least one application.)

Donald E. Knuth. The Art of Computer Programming, vol 2. Ch 3 Random Numbers Ch 4.2.4
Theory about leading digits p 238-

You can select different random number generators.

OBSERVE: The result of calling rand() depends of the version of Maple:

Early version of Maple is here Maple versions up to and including 9.5 Eversion

Later version of Maple is here Maple from version 10 Lversion

Simulation with Built-in generators

In RandomTools[Linear Congruence] the generator is of the linear congruence type

x[k+1] = m*x[k] mod p

p = 999999999989 is the greatest prime with 12 digits, m = 427419669081 multiplier

For short we call this generator for single

In Eversion when calling rand() you get version RandomTools[Linear Congruence] i e single generator

In Lversion when calling rand() you get version RandomTools[MersenneTwister]{GenerateInteger]

This generator is not of LinearCongruence type

RandomTools[MersenneTwister][NewGenerator] is applicable in Lversion and gives the same result as rand()

Simulation with triple-generator

The generator triple consists of three generators of linear congruence type in parallel for 'smoothing out'

In Eversion

Simulation in Built-in generator: single and rand are the same procedure: NewGenerator not applicable

Simulation in triple-generator: same result as Lversion (LinearCongruence)

Simulatiom:Another approach... not applicable

In Lversion

Simulation in Built-in generator: single and rand are quite different procedures:

Simulation in triple-generator: same result as Eversion (LinearCongruence)

Simulatiom:Another approach: applicable RandomTools[MersenneTwister][NewGenerator] very fast

We can compare the results from the simulations with different generators. See the results at Comments and results.

If you take only one factor (nrf = 1) in the product the relative frequency of first digits will be (almost) equal for all digits 1..9.

This is an indication of that the random number generator passes the frequency test..

Let the random number generator produce a great many numbers. The computer calculates the product of these random numbers.

At last we look at the first digit (first nonzero digit) in the product. We study the distribution of these first digits.

Choose the number of factors in nrf

Choose the number of trials in trial

Simulation with Built-in generators

The single generator

parameters for random number generator single
single is of the same as the built-in generator in LinearCongruence

>	m:=427419669081: p:=999999999989: x:=32564:

>	single:=proc() local s: global x,m,p: x:=x*m mod p: s:=x/p: evalf(frac(s)); end proc;

$proc () local s; global x, m, p; `:=`(x, `mod`(`*`(x, `*`(m)), p)); `:=`(s, `/`(`*`(x), `*`(p))); evalf(frac(s)) end proc$

(5.1.1)

procedure prodrand gives the first digit with nrf factors in the product and the generator singel

>	prodrand:=proc( ) local i,pr; pr:=1; for i from 1 to nrf do pr:=prsingle(); end do; while pr < 1 do pr:=pr10 end do; trunc(pr) end proc;

(5.1.2)

Choose nrf : number of factors

>	nrf:=2:

Choose trial number of trials

>	trial := 10000:

>	for k from 1 to 9 do u[k]:=0 end:

>	for k from 1 to trial do s:=prodrand(); u[s]:=u[s]+1 end do:

>	print('i', relfreq); Digits:=5: for i from 1 to 9 do i,evalf(u[i]/trial) end do;










	(5.1.3)

The rand generator

pp module for built-in generator rand

>	pp:=10^12:

procedure prodrand gives the first digit with nrf factors in the product and
the built-in random number generator rand()

>	prodrand:=proc( ) local i,pr; pr:=1; for i from 1 to nrf do pr:=prrand()/pp; end do; while pr < 1 do pr:=pr10 end do; trunc(pr) end proc;

(5.2.1)

Start new experiment from here

Choose nrf : number of factors

>	nrf:=2:

Choose trial : number of trials

>	trial := 10000:

>	for k from 1 to 9 do u[k]:=0 end:

>	for k from 1 to trial do s:=prodrand(); u[s]:=u[s]+1 end do:

Record of distribution of first digit of the product

>	print('i',relfreq);Digits:=5: for i from 1 to 9 do i, evalf(u[i]/trial) end do;










	(5.2.2)

NewGenerator

>	restart: with(RandomTools[MersenneTwister]);

(5.3.1)

>	M:=NewGenerator(range=10^12);

(5.3.2)

>	prodrand:=proc( ) local i,pr; pr:=1; for i from 1 to nrf do pr:=prFloat(M(),-12); end do; while pr < 1 do pr:=pr10 end do; trunc(pr) end proc;

(5.3.3)

Start new experiment from here

Choose nrf : number of factors

>	nrf:=2:

Choose trial : number of trials

>	trial := 10000:

>	for k from 1 to 9 do u[k]:=0 end:

>	for k from 1 to trial do s:=prodrand(); u[s]:=u[s]+1 end do:

Record of distribution of first digit of the product

>	print('i',relfreq);Digits:=5: for i from 1 to 9 do i, evalf(u[i]/trial) end do;










	(5.3.4)

Simulation with triple generator

     Parameters for random number generator triple.
     Produced with numtheory : safeprime (p) and primroot (m)      triple is of the same type as the built-in generator singel in LinearCongruence
     Triple has three generators in parallel for smooting out for better performance in serial test

>	m[1]:=413654159661: p[1]:=876543220739: x[1]:=32564: m[2]:=319894150324: p[2]:=676543219907: x[2]:=536766: m[3]:=274222336502: p[3]:=555985626419: x[3]:=2321:

>	triple:=proc() local s,i: global x,m,p; s:=0: for i from 1 to 3 do x[i]:=x[i]*m[i] mod p[i]: s:=s+x[i]/p[i]: end do: evalf(frac(s)); end proc;

$proc () local s, i; global x, m, p; `:=`(s, 0); for i to 3 do `:=`(x[i], `mod`(`*`(x[i], `*`(m[i])), p[i])); `:=`(s, `+`(s, `/`(`*`(x[i]), `*`(p[i])))) end do; evalf(frac(s)) end proc$

(6.1)

procedure trust gives the first digit with generator triple and nrf factors

>	trust:=proc( ) local i,pr; pr:=1; for i from 1 to nrf do pr:=prtriple(); end do; while pr < 1 do pr:=pr10 end do; trunc(pr) end proc;

(6.2)

Choose nrf : number of factors

>	nrf:=2:

Choose trial number of trials

>	trial := 10000:

>	for k from 1 to 9 do u[k]:=0 end:

>	for k from 1 to trial do s:=trust(); u[s]:=u[s]+1 end do:

>	print('i', relfreq); Digits:=5: for i from 1 to 9 do i,evalf(u[i]/trial) end do;










	(6.3)

Comments and results from simulations

Comments on Random Number Generators

In what extent can we trust a pseudo-random generator?

In Birger Jansson, Random Number Generators (doctor's dissertation 1966)
and at Donald Knuth we find different kinds of test:

Frequency tests, Serial correlation, Poker tests, Gap tests, Run tests,
Coupon collector's test, The - test, Visual tets.

Summary of results
number of trials : 10 000 000

From the results we can expect that the error is at most 1 promille

The generators can be considered suitable for the purpose!

FD is First Digit

BIN is built-in generator rand() in Lversion triple is triple-generator

relative frequency

nrf FD BIN triple

2 FD =1 0.24142 0.24125

FD=2 0.18309 0.18313

3 FD=1 0.30057 0.30062

FD=2 0.18839 0.18817

4 FD=1 0.30731 0.30752

FD=2 0.17814 0.17839

5 FD=1 0.30276 0.30287

FD=2 0.17525 0.17518

6 FD=1 0.30081 0.30063

FD=2 0.17557 0.17556

Theory for nrf = 2

nrf2FD1 is probability for FD = 1

nrf2FD2 is probability for FD = 2

We leave to Maple do the work

>	restart:

nrf=2

1-2

>	a1:=int(2/x-1,x=1..2);

(7.1)

10-20

>	b1:=int(10-10/x,x=1..2)+int(20/x-10/x,x=2..10);

(7.2)

>	(nrf2FD1):=evalf((a1+b1)/81);

(7.3)

>	a2:=int(3/x-2/x,x=1..2)+int(3/x-1,x=2..3);

(7.4)

>	b2:=int(10-20/x,x=2..3)+int(30/x-20/x,x=3..10);

(7.5)

>	(nrf2FD2):=evalf((a2+b2)/81);

(7.6)

The correspondence with simulated and theoretical values are very good.

There are good reasons to trust the simulated values even for nrf > 2

It seems that P(FD=1) has a maximum for nrf = 4 and that P(FD=1) for nrf>2 lies near log(2)

Simulation: Principle Slide Rule

We study the numbers k * r where r is a given irrational number and k passes the natural numbers from 1 to trial c to convert from = exp(x) to

>	restart: r:=evalf(-sqrt(2)/2): s:=0: c:=evalf(ln(10)):

>	trial:=10000:

In sumlog we add the irrational number r to s. s is transformed to (0,1) . Let log(u) = r.
The effect of add r is the same as multiply with u. On the slide rule we add distances when doing multiplication.

>	sumlog:=proc() global s,r,c; s:=s+r; while s<0 do s:=s+1 end do; trunc(exp(s*c)) end proc;

(8.1)

>	for i from 1 to 9 do u[i]:=0 end do:

>	for k from 1 to trial do t:=sumlog(): u[t]:=u[t]+1 end do:

>	Digits:=5: print(digit,relfreq); for k from 1 to 9 do k, evalf(u[k]/trial) end do;










	(8.2)

Benfords law

table of log(i+1)-log(i) for i = 1..9

>	Digits:=5:print('i','log10(1+1/i)'); for i from 1 to 9 do i,evalf(log10(1+1/i)) end do;










	(9.1)

Results and comments

slide rule principle Benfords law

FD = 1 0.30103 0.30103

FD = 2 0.17609 0.17609

The agreement between simulation and Benfords law is not surprising.

We study the numbers k*r where r is an irrational number and k passes the the natural numbers

It is a known fact the numbers frac(k*r) lie densely and the distribution is uniform on the interval (0,1)

Let log (u) = r. In the following we assume that is reduced to the interval (0,1)

Let F(x) be the probability that < x. F(x) = P ()

This implies

log() < log(x) and this is equivalent with frac(k*r) < log(x)

So we have F(x) = log(x)

First digit in frac(k*r)) ≤ j implies frac(k*r) < j+1. This occurs with the probability log(j+1)

P(First Digit = h) = log(h+1)-log(h) = log(1+1/h)

Slide rule :

To multiply with u repeatedly is the same as add distance r repeatedly. If necessary go 'back to' interval (1,10).

We note that the distribution is dense and uniform

How often do we end in the interval (1,2) on the slide rule?.

The relative frequency must be the ratio on the slide rule between distance (1-2) and distance (1-10) i e log(2).

So we may realize that the relative frequency for first digit is equal to 1 is log(2)

Simulation: Another approach

When multiply with a random number the products are accumulated. In this program we only study FD=1

>	restart: with(RandomTools[MersenneTwister]);

(11.1)

>	M:=NewGenerator(range=10^10);

(11.2)

     In maple we multiply pr with a random number in the interval (0.01 - 0.1) (In Float(M(),-11)
     On the slide rule we start with 1 and multiply repeatedly with random numbers (uniformly distributed)      and read the first digit for every new random number.
     For every new serie we start with pr=1

>	maple:=proc() global pr; pr:=prFloat(M(),-11); while pr<1 do pr:=pr10 end do; trunc(pr) end proc;

(11.3)

Start new experiment from here

Choose number of trials for each serie

>	trial:=10000: serie :=3:

>	for i from 1 to serie do u[i]:=0: pr:=1:for k from 1 to trial do t:=maple(): if t=1 then u[i]:=u[i]+1 end if end do: end do: Digits:=5: print(relfreq); for k from 1 to serie do evalf(u[k]/trial) end do; print(average); su:=0: for k from 1 to serie do su :=su + u[k]: end do: evalf(su/trial/serie);






	(11.4)

COMMENTS From the results we can see the variations in the series of simulations. It seems that the probability for FD=1 is near log(2) as in the other simulations. Theory may be rather complex. Do your own experiments with several values for trial and serie This simulation resembles SIMULATION Principle slide Rule There we added a constant term. Here we perform the simulation by multiplying with a new factor at haphazard every time on the slide rule: How often will the result end up between 1 and 2 on the slide rule? If you wish to write comments to me by e-mail,
you must specify the subject to 'First Digit' (e-mail with other subjects are deleted)

Legal Notice: The copyright for this application is owned by the author(s). Neither Maplesoft nor the author are responsible for any errors contained within and are not liable for any damages resulting from the use of this material. This application is intended for non-commercial, non-profit use only. Contact the author for permission if you wish to use this application in for-profit activities.

Maple

Maple Add-Ons

Student Success Platform

Improving Retention Rates

MapleSim

MapleSim Add-Ons

Systems Engineering

Consulting Services

Maple T.A. and Möbius

Education

Industries

Automotive and Aerospace

Robotics

Machine Design & Industrial Automation

Other

Application Areas

Product Pricing

Purchasing

Institutional Student Licensing

Maplesoft Elite Maintenance (EMP)

Support

Product Training

Online Product Help

Webinars & Events

Publications

Content Hubs

Examples & Applications

Community

About Maplesoft

Media Center

User Community

Contact

App Preview:

First Digit Simulation

Maple

Powerful math software that is easy to use

Maple Add-Ons

Student Success Platform

Improving Retention Rates

MapleSim

Advanced System Level Modeling

MapleSim Add-Ons

Systems Engineering

Consulting Services

Maple T.A. and Möbius

Education

Industries

Automotive and Aerospace

Robotics

Machine Design & Industrial Automation

Other

Application Areas

Product Pricing

Purchasing

Institutional Student Licensing

Maplesoft Elite Maintenance (EMP)

Support

Product Training

Online Product Help

Webinars & Events

Publications

Content Hubs

Examples & Applications

Community

About Maplesoft

Media Center

User Community

Contact

App Preview:

First Digit Simulation