Dr Keith Reid
Cailleach Computing Ltd
September 2021
Documentation for Haskell Poisson modeller
"Pwaskell"
POISSON DISTRIBUTION
Poisson distributions model how many independent random incidents occur in each
of a series of epochs, or episodes, of time (or space), when incidents happen at
a constant average rate. For example, in any 8 hour shift a doctor may receive 20
phone calls on average. It may be 19, or 21, but counts close to 20 are most
likely. 17 is less likely, and 0 and 40 are much less likely.
Haskell as an example of a functional language
Haskell is a functional language. Whereas Python and other object-oriented
languages take "things" and "add bits on", and so are like Lego, Haskell is like
a "marble run" or a domino run. The thought goes into a continuous process, and
the relatively simple elements are not changed much. In object-oriented
furniture building we do something like
table = top_bit
table = table + legs
table = turn_upside_down(table)
table = table + placemats + food
But functional programming is more like
table = put_on_top(invert(add_legs(top_bit)), (placemats, food))
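The same idea can be written as real Haskell. This is only a toy sketch: the Table type and the function names (topBit, addLegs, invert, putOnTop) are made up for illustration and are not part of the project code. Nothing is modified in place; each function returns a new value that is handed to the next one.

```haskell
-- A toy model of the furniture example. All names here are hypothetical.
data Table = Table { parts :: [String], upsideDown :: Bool }
  deriving (Show, Eq)

-- Start with the top lying upside down, ready for the legs.
topBit :: Table
topBit = Table ["top"] True

addLegs :: Table -> Table
addLegs t = t { parts = parts t ++ ["legs"] }

invert :: Table -> Table
invert t = t { upsideDown = not (upsideDown t) }

putOnTop :: [String] -> Table -> Table
putOnTop items t = t { parts = parts t ++ items }

-- Read right to left: topBit goes through addLegs, then invert,
-- then putOnTop, like a marble rolling through a run.
table :: Table
table = (putOnTop ["placemats", "food"] . invert . addLegs) topBit
```

Loading this in ghci and evaluating table shows the finished value; topBit itself is unchanged.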
MATHS:
The formula for the Poisson distribution is hard to write in vim.
^ means exponentiation, "to the power of"
! means factorial, e.g. 3! = 3*2*1 = 6
e means "Euler's number", approximately 2.71828,
which is a fundamental constant like 1, 0, or Pi
x is the number of events that you are estimating the probability of
L should be _lambda_, the expected or "average" number of events in a
measured period
p = probability of there being x events in a measured period
p = (L^x * e^-L) / x!
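As a rough sketch, the formula translates almost directly into Haskell. The name poissonPmf is an assumption for illustration, not something taken from the project source. Haskell's exp function gives e to a power, so e does not need to be typed in by hand.

```haskell
-- Poisson probability of exactly x events when L events are expected:
--   p = (L^x * e^-L) / x!
-- `poissonPmf` is a hypothetical name, not the project's own function.
poissonPmf :: Double -> Int -> Double
poissonPmf l x
  | x < 0     = 0  -- a negative number of events cannot happen
  | otherwise = (l ^ x) * exp (negate l) / fromIntegral (factorial x)
  where
    -- Integer (not Int) so that the factorial does not overflow.
    factorial :: Int -> Integer
    factorial n = product [1 .. fromIntegral n]
```

This direct form is fine for small x; for very large x both L^x and x! grow beyond what a Double can hold, so a different approach would be needed there.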
EXAMPLE
Dugarte et al, cited in Wikipedia, say there are 2.5 goals on average in a World
Cup footy match, and that the number of goals follows a Poisson distribution.
The probability of 2 goals is:
p = (2.5^2 * e^-2.5) / 2!
p = (6.25 * e^-2.5) / 2      (2! is 2*1 = 2)
p ~ 0.257
The function that gives the probability of one exact value is called a
"probability mass function"; you can also have a "cumulative" function which
asks about a range of values, e.g. "will there be fewer than 2 goals", which adds
the probabilities of 0 and 1 goals. "Will there be at least 2 goals" is then 1
minus that sum.
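A cumulative function can be sketched by adding up the individual probabilities. The names below (poissonTerms, poissonCdf, poissonAtLeast) are assumptions for illustration. Rather than recomputing the formula for each value, this version uses the fact that each term is the previous term multiplied by L / k.

```haskell
-- Successive Poisson probabilities p(0), p(1), p(2), ... as a lazy
-- infinite list, using p(0) = e^-L and p(k) = p(k-1) * L / k.
-- All names in this block are hypothetical.
poissonTerms :: Double -> [Double]
poissonTerms l = scanl (\p k -> p * l / k) (exp (negate l)) [1 ..]

-- Probability of x or fewer events.
poissonCdf :: Double -> Int -> Double
poissonCdf l x = sum (take (x + 1) (poissonTerms l))

-- Probability of x or more events: 1 minus "x - 1 or fewer".
poissonAtLeast :: Double -> Int -> Double
poissonAtLeast l x = 1 - poissonCdf l (x - 1)
```

For the football example, poissonCdf 2.5 1 is the chance of fewer than 2 goals (about 0.287) and poissonAtLeast 2.5 2 is the chance of at least 2 goals (about 0.713).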
CODE IMPLEMENTATION
Part A: Simulation
Configure these:
L   a float (meaning the computer can give it a decimal place): the expected
    number of incidents in a measured period
a   an integer factor which is used to have a high number of possible
    incidents - 20 will do, see below - this is not part of Poisson per se
m   an integer: the number of trials for your model
e   Euler's number, which we probably have to define as 2.71828...
    (Haskell's exp 1 also gives it)
x   the number of incidents that you are estimating the probability of
Set up an array which will hold the results of the trials; it will have m entries.
Set up a repeated loop of m trials.
For each trial:
  Create an array of random "element" numbers in the range [0, 1.00)
  Elements are a programming thing, not a Poisson thing
  If an element is under 0.05 (that is, 1/a with a = 20) it represents an incident.
  If L is 8, have 20 * 8 = 160 elements in a block representing
  the potentiality of a measured block. There will be about 8
  that incide (is that a word?) per trial - call that the count
  Count the incidents
  Stick that into your results array
  Go round
End up with an array like (6,7,8,8,8,7,6,8,9,9,7,8,6,8,7,8,8,8,7,9,4,8...
Count how many of each count there are
That's your distribution
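The steps above can be sketched in Haskell roughly as follows. Everything here is an assumption for illustration: the names (lcgFloats, chunksOf, simulate, tally) are made up, and the random numbers come from a simple linear congruential generator standing in for whatever random source the project really uses. An element counts as an incident when it falls below 1/a (0.05 for a = 20), which is what gives about L incidents in a block of a * L elements.

```haskell
import Data.List (group, sort)

-- Lazy infinite list of pseudo-random Doubles in [0, 1) from a seed.
-- A plain linear congruential generator (Numerical Recipes constants);
-- a stand-in for illustration, not the project's own generator.
lcgFloats :: Integer -> [Double]
lcgFloats seed = map toUnit (tail (iterate step (seed `mod` modulus)))
  where
    modulus  = 2 ^ (32 :: Int)
    step s   = (1664525 * s + 1013904223) `mod` modulus
    toUnit s = fromIntegral s / fromIntegral modulus

-- Split a list into consecutive blocks of n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Run m trials. Each trial takes a block of a * L elements and counts
-- those below 1/a, so each trial expects about L incidents.
simulate :: Double -> Int -> Int -> Integer -> [Int]
simulate l a m seed = take m (map countIncidents blocks)
  where
    blockSize      = round (fromIntegral a * l)
    threshold      = 1 / fromIntegral a
    blocks         = chunksOf blockSize (lcgFloats seed)
    countIncidents = length . filter (< threshold)

-- Count how many trials produced each count: [(count, how many trials)].
tally :: [Int] -> [(Int, Int)]
tally counts = [ (c, length g) | g@(c : _) <- group (sort counts) ]
```

In ghci, tally (simulate 8 20 1000 2) would give the simulated distribution for L = 8 over 1000 trials with seed 2. Strictly, each block follows a binomial distribution, which approaches the Poisson distribution as a grows.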
I can't do graphs in Haskell yet. But I can make a data set that can be drawn
by Python. Or maybe a cheesy ASCII graph.
Part B: Idealised graph preparation
Use the formula to plot a graph for comparison.
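A minimal sketch of the idealised side, including the "cheesy ASCII graph" option. The names idealPmf and asciiGraph are hypothetical, and the bar scaling is an arbitrary choice. The first function produces (x, probability) pairs that could equally be written out as a data set for Python to draw.

```haskell
import Text.Printf (printf)

-- Ideal Poisson probabilities for x = 0 .. maxX, from the formula
--   p = (L^x * e^-L) / x!
-- Both function names in this block are hypothetical.
idealPmf :: Double -> Int -> [(Int, Double)]
idealPmf l maxX = [ (x, pmf x) | x <- [0 .. maxX] ]
  where
    pmf :: Int -> Double
    pmf x = l ^ x * exp (negate l) / product [1 .. fromIntegral x]

-- One text row per x: the value, its probability, and a bar of '#'
-- scaled so that `width` characters would mean a probability of 1.
asciiGraph :: Int -> [(Int, Double)] -> String
asciiGraph width rows = unlines (map row rows)
  where
    row :: (Int, Double) -> String
    row (x, p) = printf "%3d %.3f %s" x p (bar p)
    bar :: Double -> String
    bar p = replicate (round (p * fromIntegral width)) '#'
```

In ghci, putStr (asciiGraph 100 (idealPmf 2.5 8)) prints one bar per goal count for the football example.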
Part C: Checking
Quantify the agreement between the two, probably using chi-squared.
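One possible shape for the check is Pearson's chi-squared statistic, which compares observed counts from the simulation against expected counts from the formula (the ideal probability for each x multiplied by the number of trials m). The name chiSquared and the choice to skip empty bins are assumptions for this sketch.

```haskell
-- Pearson chi-squared statistic: the sum over all bins of
--   (observed - expected)^2 / expected
-- Bins with zero expected count are skipped to avoid dividing by zero.
-- `chiSquared` is a hypothetical name.
chiSquared :: [Double] -> [Double] -> Double
chiSquared observed expected =
  sum [ (o - e) ^ (2 :: Int) / e
      | (o, e) <- zip observed expected
      , e > 0
      ]
```

A value of 0 means a perfect match and larger values mean worse agreement. Turning the statistic into a verdict needs the number of degrees of freedom and a critical value, which this sketch leaves out.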
Part D: Stress Testing
Mess with the assumptions of Poisson and see how that affects the result of Part C
Documentation for Users:
Current state of code:
do
:l rand_ex.hs
then
take 3 (randomlist 2)
That gives 3 values from a lazily constructed infinite list of floats from seed 2.
The seed gives repeatability between tests.
Currently ghci won't parse definitions like a = 2, so I have to do that in the
interpreter.