First, as a preface to this post, I should say it is a not-entirely-irrelevant-fact that I have two children, one of whom is a boy born on a Sunday.
There is a discussion on the letters page of the current (3rd July) New Scientist, debating an issue from an article in the 29th May edition (Mathemagical by Alex Bellos). In summary, Bellos reports Gary Foshee saying:
"I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?"
We are then told that the answer is 13/27 (close to a half), but that if he'd not specified the Tuesday - if he'd just said:
"I have two children. One is a boy. What is the probability I have two boys?"
the answer would have been 1/3.
The article concludes:
"It seems remarkable that the probability of having two boys changes from 1/3 to 13/27 when the birth day of one boy is stated – yet it does, and it's quite a generous difference at that. In fact, if you repeat the question but specify a trait rarer than 1/7 (the chance of being born on a Tuesday), the closer the probability will approach 1/2."
Now there are lots of counter-intuitive things that emerge from probabilities (see for example the Monty Hall problem), and I believe you can set this one up as a perfectly valid counter-intuitive problem, as I shall discuss later. But in this case Bellos and the New Scientist have got it wrong.
The difficulty is that we are presented with Foshee volunteering the fact that he has a boy born on a Tuesday. The Bellos argument relies on everyone who has a boy born on a Tuesday telling you that fact. I too have a boy who was born on a Tuesday but I didn't tell you about that one, I told you about the one born on a Sunday. Consequently my case would not be included in the population used in Bellos's argument.
Imagine, if you will, a hall containing lots of parents. To be precise, there is one parent - say the father - from each of 1764 families with two children. (I have chosen 1764 deliberately, of course, so that all my numbers in what follows will be integers.) These 1764 families also happen to be exactly representative of the random distribution of gender and days of the week for their birthdays. (So exactly half the children are boys, and of those 1 in 7 is born on a Tuesday etc etc.)
From this hall, one father is going to come out and tell you the gender of one of his children and the day he/she was born.
Out comes Foshee, and says:
"I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?"
We know what Bellos's answer would be: 13/27. But I'm suggesting that he's forgotten to take account of the fact that when I come out I only tell you about the son of mine that was born on a Sunday, even though I, too, have a son who was born on Tuesday.
Let's develop the thought experiment a bit more.
First, announce to everyone in the hall that all those who don't have any sons should go home. That gets rid of all the families with two girls, which is a quarter of all families, because there are four equally probable family structures: girl-girl, girl-boy, boy-girl, boy-boy. (Bellos and I agree on that.)
That takes away 1764/4 = 441 leaving 1323 families in the hall, all of whom have at least one son. In fact, the people in there now consist of 441 with two boys and 882 with one boy and one girl.
If any random parent comes out now, the probability that they have two sons is 1/3, again in agreement with Bellos.
Next, tell all the families who don't have any boys who were born on Tuesday to go home. The calculation of what that does to the numbers in the hall is a bit complicated, but I agree with the sums done by Bellos and the key point is that we lose a lot more of the ones who have only one boy than the ones who have two boys, because the ones with two boys have almost twice the chance that one of them was born on Tuesday. More specifically, of the 882 with only one boy, 6 out of every 7 depart, leaving 882/7 = 126. The calculation for the 441 with two boys is summarised below, but it ends up that 324 depart, leaving 117 (of whom 9 have both born on a Tuesday and 108 have one born on a Tuesday and the other born on a different day).
We are therefore left with 243 families (126 with one boy and 117 with two), all of whom have at least one son born on a Tuesday. Of these, almost half (117/243 = 13/27) have two sons. This gets us to Bellos's answer. By restricting the people in the hall to those with a boy born on a Tuesday, the probability that any one of them has two sons is almost a half. When we didn't specify the Tuesday, when we just required that they have a son, the probability that any one of them had two sons was a third. Adding in the day of the week has had the surprising effect of increasing the probability that any one of them has two boys from a third to almost a half.
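For anyone who would rather check the head-counts than take them on trust, here is a minimal sketch in Python that enumerates the hall directly. It assumes, as the thought experiment does, that every combination of gender and day of the week is equally likely for each child, so each of the 14 x 14 = 196 ordered combinations stands for 1764/196 = 9 families. The names in it (hall, with_son, with_tuesday_son and so on) are my own labels, not anything from the article.

```python
from fractions import Fraction
from itertools import product

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
FAMILIES_PER_COMBINATION = 9  # 1764 families / (14 * 14) combinations

# Every (gender, day) a single child can have, then every ordered pair of children.
children = list(product("BG", DAYS))       # 14 possibilities per child
hall = list(product(children, repeat=2))   # 196 equally likely two-child families


def count(families):
    """Number of real families in the hall represented by these combinations."""
    return len(families) * FAMILIES_PER_COMBINATION


def boys(family):
    """How many of the two children are boys."""
    return sum(1 for gender, _ in family if gender == "B")


# First announcement: families with no sons go home.
with_son = [f for f in hall if boys(f) >= 1]

# Second announcement: families with no son born on a Tuesday go home.
with_tuesday_son = [f for f in hall if ("B", "Tue") in f]

two_sons_given_son = Fraction(
    sum(1 for f in with_son if boys(f) == 2), len(with_son)
)
two_sons_given_tuesday_son = Fraction(
    sum(1 for f in with_tuesday_son if boys(f) == 2), len(with_tuesday_son)
)

print(count(hall), count(with_son), count(with_tuesday_son))
print(two_sons_given_son, two_sons_given_tuesday_son)
```

The point of doing it by brute-force counting rather than by formula is that it mirrors the hall: nobody is weighted, people are simply sent home or allowed to stay.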
But this is not the case that Bellos has presented. He has not restricted the population in advance to those with boys born on Tuesday. He draws his conclusion from one person who happens to volunteer the fact.
Go back to the hall with all the families that have at least one boy. When anyone comes out of that hall, the probability that they have two sons was, as we saw, 1/3. Suppose that they all come out, one by one, and make a statement like Foshee's or mine. The same argument based on Tuesday can be made for any day of the week. So, for each one, on Bellos's reasoning, the probability that they have two sons is almost 1/2. And yet when they have all come out of the hall, you will find that only 1/3 of them had two sons. What's gone wrong?
It is a sort of double counting. Bellos's reasoning requires everyone who has a son born on Tuesday to tell you that. But since they are only telling you about one son, if they are doing that for Tuesday they can't all do it for Sunday as well. The point is you have to specify what day of the week you are using - which is what happened when we sent home everyone who didn't have a son born on Tuesday.
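The "everyone comes out and tells you about one son" version can be sketched in the same way. The important assumption here, which is mine and is not spelled out in Bellos's article, is that a father with two sons picks which one to tell you about at random, each with probability 1/2. Under that assumption the function below (two_sons_given_statement is just a name I have made up) works out the exact chance of two sons given that the father named a particular day.

```python
from fractions import Fraction
from itertools import product

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
children = list(product("BG", DAYS))       # 14 possibilities per child
hall = list(product(children, repeat=2))   # 196 equally likely two-child families


def two_sons_given_statement(named_day):
    """P(two sons | father says 'one is a boy born on <named_day>')."""
    said_it = Fraction(0)
    said_it_two_sons = Fraction(0)
    for family in hall:
        son_days = [day for gender, day in family if gender == "B"]
        if not son_days:
            continue  # no sons: this family has already gone home
        # ASSUMPTION: the father names one of his sons chosen uniformly at
        # random, so this is the chance that his statement mentions named_day.
        weight = Fraction(son_days.count(named_day), len(son_days))
        said_it += weight
        if len(son_days) == 2:
            said_it_two_sons += weight
    return said_it_two_sons / said_it


for day in DAYS:
    print(day, two_sons_given_statement(day))
```

With a different rule for how fathers choose what to say, the numbers would come out differently, which is really the whole argument: the answer depends on how the statement came to be made, not just on the words in it.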
Coming back to Bellos's conclusion:
"It seems remarkable that the probability of having two boys changes from 1/3 to 13/27 when the birth day of one boy is stated ..."
I don't believe that simply stating the birth day does this. After all, everyone has some birth day, so if it did work in this way you wouldn't actually need to state the birth day. What makes the change is specifying the birth day.
Specifying the birth day excludes some families, making a real change. Just stating a birth day makes no change. Compare these two:
Foshee says: "I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?"
You can answer: "1/3"
Foshee says: "I have two children. One is a boy. What is the probability I have two boys?"
You say "Have you a boy that was born on Tuesday?"
Foshee says: "Yes"
You can answer: "About 1/2"
The difference between these two still seems pretty surprising, but not quite as bizarre as Bellos was claiming.
Acknowledgements: I would like to acknowledge the helpful conversations that I’ve had about this with my son, who was born on a Tuesday. He’s doing Further Maths at A-level and reckons this is typical of what he had to analyse in his stats and prob module.
Appendix: Calculation details
441 have two sons.
The probability that both are born on a Tuesday is 1/7 x 1/7 = 1/49. So there are 441/49 = 9 for which both are born on a Tuesday.
The probability that the first is born on Tuesday and the second some other day is 1/7 x 6/7 = 6/49. So there are 441 x 6/49 = 54 in this category.
The probability that the second is born on Tuesday and the first some other day is 6/7 x 1/7 = 6/49. So there are 441 x 6/49 = 54 in this category.
The probability that both are born on some day other than Tuesday is 6/7 x 6/7 = 36/49. So there are 441 x 36/49 = 324 with neither born on a Tuesday.
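As a quick arithmetic check on the four categories above, here is a short Python sketch using exact fractions. The variable names are only labels I have chosen for the four cases, and it assumes each son is independently born on a Tuesday with probability 1/7.

```python
from fractions import Fraction

two_son_families = 441
p_tuesday = Fraction(1, 7)   # chance a given son was born on a Tuesday
p_other = 1 - p_tuesday      # chance he was born on some other day

both_tuesday = two_son_families * p_tuesday * p_tuesday
first_only = two_son_families * p_tuesday * p_other
second_only = two_son_families * p_other * p_tuesday
neither = two_son_families * p_other * p_other

# The four categories should account for every two-son family.
total = both_tuesday + first_only + second_only + neither
at_least_one_tuesday = two_son_families - neither

print(both_tuesday, first_only, second_only, neither)
print(total, at_least_one_tuesday)
```

The 117 families with at least one Tuesday son are the 9 + 54 + 54 who stay in the hall; the other 324 are the ones who go home.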