I want to simulate the birthday paradox in java. For some reason my output(probability) keeps getting very close to 1 e.g. simulation(10)->0,9268. In start you can see the probabilities my simulations should be close to. I have been looking for a mistake in my code for quite a while now so i hope any of you might be able to help me. I've looked up other codes of the birthday paradox but none of them seem able to help me with my strange output. p.s. you can ignore the //TODO, ill fix that once i get the code up and running. Thanks is advance!
static final int DAYS_IN_YEAR = 365;
static final int NUMBER_OF_SIMULATIONS = 500;
LCG random;
HashSet<Integer> birthdaySet = new HashSet<Integer>();
BirthdayProblem2(){
birthdaySet.clear();
random = new LCG(); //random numbers between 0 and 1
}
int generateBirthday(){ //generates random birthday
return (int) Math.round(Math.random()*DAYS_IN_YEAR); //TODO use LGC
}
double iteration(int numberOfStudents){ //one iteration from the simulation
int sameBirthdays = 0;
for (int i = 0; i < numberOfStudents; i++){
int bd = generateBirthday();
if (birthdaySet.contains(bd)){
sameBirthdays++;
} else {
birthdaySet.add(bd);
}
}
return (double) sameBirthdays/numberOfStudents; //probability of two students having the same birthday
}
void simulation(int numberOfStudents){
double probabilitySum = 0;
for (int i = 0; i < NUMBER_OF_SIMULATIONS; i++){
probabilitySum += iteration(numberOfStudents);
}
System.out.printf("For n=%d -> P=%.4f \n", numberOfStudents, probabilitySum/NUMBER_OF_SIMULATIONS);
}
private void start() {
simulation(10); //should be about 0.1
simulation(20); //should be about 0.4
simulation(23); //should be about 0.5
simulation(35); //should be about 0.8
}
public static void main(String[] argv) {
new BirthdayProblem2().start();
}
}
You are not clearing the birthdaySet
between iterations. Change it from a field to a local variable will prevent errors like that, because why would you need it as a field in the first place?
On a different note, generateBirthday
should use Random.nextInt(DAYS_IN_YEAR)
. An instance of Random
is a prime candidate as field. What is that LCG random;
line anyway?
UPDATE
Method iteration()
should return boolean, for whether or not there are students with same birthday. Number of true values returned divided by number of calls equals probability.
Since there is a relative small number of days in a year, a boolean array will be faster than a Set
, so here is updated code:
private static final int DAYS_IN_YEAR = 365;
private static final int NUMBER_OF_SIMULATIONS = 500;
private Random rnd = new Random();
private int generateBirthday() {
return this.rnd.nextInt(DAYS_IN_YEAR);
}
private boolean iteration(int numberOfStudents) { //one iteration from the simulation
boolean[] present = new boolean[DAYS_IN_YEAR];
for (int i = 0; i < numberOfStudents; i++) {
int birthday = generateBirthday();
if (present[birthday])
return true;
present[birthday] = true;
}
return false;
}
void simulation(int numberOfStudents){
int count = 0;
for (int i = 0; i < NUMBER_OF_SIMULATIONS; i++)
if (iteration(numberOfStudents))
count++;
System.out.printf("For n=%d -> P=%.4f%n", numberOfStudents, (double)count/NUMBER_OF_SIMULATIONS);
}
Sample Output
For n=10 -> P=0.1340
For n=20 -> P=0.4120
For n=23 -> P=0.5080
For n=35 -> P=0.8160
For n=10 -> P=0.1200
For n=20 -> P=0.4120
For n=23 -> P=0.5260
For n=35 -> P=0.8140
For n=10 -> P=0.1260
For n=20 -> P=0.3920
For n=23 -> P=0.5080
For n=35 -> P=0.7920
Increasing NUMBER_OF_SIMULATIONS
to 5_000_000
:
For n=10 -> P=0.1167
For n=20 -> P=0.4113
For n=23 -> P=0.5075
For n=35 -> P=0.8145
For n=10 -> P=0.1168
For n=20 -> P=0.4113
For n=23 -> P=0.5072
For n=35 -> P=0.8142