Friday, August 24, 2007

Testing the math solution

I couldn't post this as a comment in response to the comment posted on my previous post because of the very limited formatting allowed in comments. Frank posted in response that the expected value should be (1-X)/X, which provides some results we both agreed seemed counter-intuitive -- though in the way that these kinds of probability things often are. (In fact, this problem is kind of related to the birthday paradox, now that I think about it.) So I whipped up a little C program to test it by brute force.

Here's the program:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <conio.h>
#include <time.h>

/* Macro to get a random integer within a specified range */
#define getrandom( min, max ) ((rand() % (int)(((max)+1) - (min))) + (min))

int main() {
int probability, count, result;
long total;

/* Seed the random number generator with current time. */
srand( (unsigned)time( NULL ) );

for (probability = 50; probability >= 1; probability--) {
total = 0L;
for (count = 1; count < 1000; count++) {
result = 0;
while (getrandom(1,100) > probability) result++;
total += result;
}
printf("%2d%% calculated=%6.2f simulated=%6.2f\n", probability, (100 - (float) probability) / (float) probability, (float) total / (float) count);
}
}
And here are the results:
50%  calculated=  1.00  simulated=  0.97
49% calculated= 1.04 simulated= 1.07
48% calculated= 1.08 simulated= 1.09
47% calculated= 1.13 simulated= 1.09
46% calculated= 1.17 simulated= 1.19
45% calculated= 1.22 simulated= 1.22
44% calculated= 1.27 simulated= 1.22
43% calculated= 1.33 simulated= 1.37
42% calculated= 1.38 simulated= 1.41
41% calculated= 1.44 simulated= 1.54
40% calculated= 1.50 simulated= 1.52
39% calculated= 1.56 simulated= 1.67
38% calculated= 1.63 simulated= 1.64
37% calculated= 1.70 simulated= 1.66
36% calculated= 1.78 simulated= 1.71
35% calculated= 1.86 simulated= 1.97
34% calculated= 1.94 simulated= 1.89
33% calculated= 2.03 simulated= 1.95
32% calculated= 2.13 simulated= 2.20
31% calculated= 2.23 simulated= 2.30
30% calculated= 2.33 simulated= 2.33
29% calculated= 2.45 simulated= 2.41
28% calculated= 2.57 simulated= 2.55
27% calculated= 2.70 simulated= 2.78
26% calculated= 2.85 simulated= 2.81
25% calculated= 3.00 simulated= 2.91
24% calculated= 3.17 simulated= 3.17
23% calculated= 3.35 simulated= 3.38
22% calculated= 3.55 simulated= 3.73
21% calculated= 3.76 simulated= 3.77
20% calculated= 4.00 simulated= 4.07
19% calculated= 4.26 simulated= 4.19
18% calculated= 4.56 simulated= 4.70
17% calculated= 4.88 simulated= 4.93
16% calculated= 5.25 simulated= 5.38
15% calculated= 5.67 simulated= 5.52
14% calculated= 6.14 simulated= 6.11
13% calculated= 6.69 simulated= 6.66
12% calculated= 7.33 simulated= 7.29
11% calculated= 8.09 simulated= 8.41
10% calculated= 9.00 simulated= 9.20
9% calculated= 10.11 simulated= 9.97
8% calculated= 11.50 simulated= 11.09
7% calculated= 13.29 simulated= 12.98
6% calculated= 15.67 simulated= 15.47
5% calculated= 19.00 simulated= 18.58
4% calculated= 24.00 simulated= 24.61
3% calculated= 32.33 simulated= 33.43
2% calculated= 49.00 simulated= 47.77
1% calculated= 99.00 simulated=101.59
Close enough that I'd say the difference is likely due to not doing enough trials (I did 1000 for each probability) and inaccuracies in the floating point math and random number generation I used.

No comments: