Wednesday, 7 October 2015

(My) Ignorance Behind "Marilyn Gets My Goat"

At the end of Marilyn Gets My Goat, I wrote:

As mentioned above, this argument is wrong. Precisely which bit is wrong is a little vexed. I originally thought the biggest issue nestles in Q10 and Q16. I still think it does, but many below argue that the issue is in Q15.  That might come down to a question of interpretation, but eventually I will put it into words, hopefully without sparking huge controversy this time.

I was initially quite happy with the argument that I had presented (especially in combination with a later clarification in Marilyn's Six Games).  Even now, looking at it with the knowledge that it is wrong somehow, I still find it faintly convincing.

I suspect that this has to do with ignorance.  Clearly I was overwhelmed with ignorance when I started off this merry chase, but it's not that sort of ignorance that I mean.  I mean more the type of ignorance that I mentioned in The Whole Reverse Monty Debacle:

… I had in my mind a scenario in which the volunteer (now transforming into a contestant) knew nothing, other than the nature of the revealed ball (now transforming into a goat).  During the transformation process, I forgot all about that ignorance and began trying to apply my thinking (in the presence of ignorance) to the Monty Hall Problem (in which there is less ignorance).

The problem, I believe, is associated with the doors.  The doors represent ignorance, because the contestant doesn't know what is behind the doors.  But clearly they don't consistently represent ignorance, because Monty and Holly apparently do know what is behind the doors.

To try to get at the heart of the problem, I want to go over the step-wise scenario again, after removing the doors and everything behind them.  Imagine instead that we are left only with Holly Mant and Marilyn, who is given a three-sided die (yes, they exist) and a coin.  On the table between them are three trays each with six tokens.


These trays represent the "expected value" of each of the doors that we've removed but will still think about as concepts.  We've assumed a random distribution of goats (M and A) and car (C), so there is a 1/3 chance of M behind each door, a 1/3 chance of A and a 1/3 chance of C.  This aligns with Marilyn's answer to Q1.

The arrangement of the tokens indicates the possible distributions of the car and goats, as given by Marilyn in response to Q2, and simultaneously shows that the likelihood of each distribution is 1/6, as per the answer to Q3.
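For anyone who would rather count than squint at trays, here is a minimal sketch of the same setup.  The door order and the names `DOORS`, `OCCUPANTS` and `p_behind` are my own assumptions for illustration, not anything from the original scenario.

```python
from fractions import Fraction
from itertools import permutations

DOORS = ("red", "white", "green")   # door order is an assumption of this sketch
OCCUPANTS = ("M", "A", "C")         # Mary, Ava and the car

# Every way of placing the three occupants behind the three doors.
distributions = [dict(zip(DOORS, p)) for p in permutations(OCCUPANTS)]

# A random distribution means each arrangement is equally likely.
p_each = Fraction(1, len(distributions))

def p_behind(door, occupant):
    """Likelihood that a given occupant is behind a given door."""
    return sum(p_each for d in distributions if d[door] == occupant)

print(len(distributions), p_each, p_behind("red", "C"))  # prints 6 1/6 1/3
```

Six distributions at 1/6 each, and every occupant has a 1/3 chance of being behind any particular door, which is what the trays show.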

Marilyn is now encouraged to roll her three-sided die (helpfully labelled NOT RED, NOT WHITE and NOT GREEN - as per the response to Q4).  Following the scenario in Marilyn Gets My Goat, this would mean she rolls NOT WHITE and selects Red and Green.  The likelihood of selecting these two doors is 1/3, presuming that this is a fair die, and aligns with the answer to Q5.

So far so good.  Let's reorganise the trays:


Q6 is a gimme, since it's effectively the answer to "given X, what is the likelihood of X?"  And we can see the answers to Q7 and Q8 laid out in the selected trays - each distribution has a likelihood of 1/6.

Now it gets trickier.  Marilyn is asked what the likelihood is that the red door will be selected (which Holly won't do if there's a car behind it).  That can be represented like this:


The result is that there is a prior likelihood of 1/2 that the red door will be selected (1/12+1/6+1/6+1/12 = 6/12 = 1/2) - as per Marilyn's answer to Q9.
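The sum above can be checked with a short sketch.  It assumes Marilyn has selected Red and Green, that Holly never reveals the car, and that she tosses a fair coin when the car is behind neither selected door; `p_open_red` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import permutations

DOORS = ("red", "white", "green")

def p_open_red(dist):
    """Chance that Holly opens red for one distribution, never revealing the car."""
    if dist["red"] == "C":
        return Fraction(0)       # car behind red: Holly must open green
    if dist["green"] == "C":
        return Fraction(1)       # car behind green: Holly must open red
    return Fraction(1, 2)        # car behind white: coin toss

distributions = [dict(zip(DOORS, p)) for p in permutations("MAC")]

# One term per distribution: prior likelihood times chance of red being opened.
terms = [Fraction(1, 6) * p_open_red(d) for d in distributions]
prior_red_opened = sum(terms)

print(prior_red_opened)  # prints 1/2
```

The non-zero terms are the 1/12, 1/6, 1/6 and 1/12 in the sum; the two distributions with the car behind red contribute nothing.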

Marilyn is then asked Q10: "What is the likelihood that the Red door would have to be opened, in accordance with the rules, on the basis that the car was behind the Green door?"

I now see that this question has a problem, which is reassuring on one level, because I did identify it as possibly harbouring an issue, and annoying on another, because I was trying to eliminate confusion and an ambiguous question like this will merely sow confusion.  There are two answers to the question, depending on your interpretation.

One interpretation leaves the question precisely as it is.  There are two scenarios in which the red door must be opened because there is a car behind the green door – AC and MC.  This means that the likelihood of the red door being opened because of the location of the car is 1/3 (1/6+1/6 = 1/3).  There is also a 1/3 chance that the green door would be opened because of the location of the car, and a 1/3 chance that whichever door was opened was opened on the basis of a random selection.

This is not the interpretation that I (or indeed anyone, so far as I can tell) used.  We assumed that the question really meant this: “Given that the Red door was opened, what is the likelihood that the Red door had to be opened, in accordance with the rules, on the basis that the car was behind the Green door?”

To work this out, we have to divide the likelihood that the car was behind the green door AND the red door was opened by the likelihood that the red door was opened, so (1/6+1/6) / (1/6+1/6+1/12+1/12) = (1/3) / (1/2) = 2/3.  Which was the answer given to Q10.
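The two readings of Q10 can be put side by side in a small sketch.  As before, it assumes Marilyn holds Red and Green and that Holly never reveals the car; the variable names are mine.

```python
from fractions import Fraction
from itertools import permutations

DOORS = ("red", "white", "green")

def p_open_red(dist):
    """Chance that Holly opens red for one distribution, never revealing the car."""
    if dist["red"] == "C":
        return Fraction(0)
    if dist["green"] == "C":
        return Fraction(1)
    return Fraction(1, 2)  # car behind white: coin toss

distributions = [dict(zip(DOORS, p)) for p in permutations("MAC")]
prior = Fraction(1, 6)

# Reading 1 (the question as written): red is forced open by a car behind green.
forced_red = sum(prior for d in distributions if d["green"] == "C")

# Reading 2: the same thing, but given that red was in fact opened.
red_opened = sum(prior * p_open_red(d) for d in distributions)
car_green_and_red_opened = sum(
    prior * p_open_red(d) for d in distributions if d["green"] == "C"
)
given_red_opened = car_green_and_red_opened / red_opened

print(forced_red, given_red_opened)  # prints 1/3 2/3
```

Same wording, two different numbers, depending on whether the opening of the red door is taken as given.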

Note that this is also precisely the time when my ignorance (as described in The Whole Reverse Monty Debacle) began to really screw things up.  I had in my mind, at least at the very beginning, the idea that the contestant didn’t actually know the basis on which Holly would open the door.  Of course I should have known, since I was drawing parallels to the Monty Hall Problem, but I still had the vestiges of a random selection paradigm in my head.

So, just for interest’s sake, let’s look at a slightly different question: “If the door is opened by Holly Mant on the basis of a coin toss, what is the likelihood that the Red door will be opened?”

We have to go back to Q9 though, or rather Q9B, because the distribution of expected value is different:


The result, again, is that there is a prior likelihood of 1/2 that the red door will be selected, but with a different calculation (1/12+1/12+1/12+1/12+1/12+1/12 = 6/12 = 1/2).

In this scenario, with a random selection based on the toss of a fair coin, there's little point in asking "What is the likelihood that the Red door would have to be opened, in accordance with the rules, on the basis that the car was behind the Green door?" – because the location of the car is not a factor in Holly’s decision.  Note: I am aware that this is no longer parallel to the Monty Hall Problem.

We can meaningfully ask what appears to be the final question: "Given that the Red door was opened, what is the likelihood that the car is behind the Green door?"  It's not quite the final question though, because we haven't yet specified what was behind the red door, we've only said that it was opened and we've not eliminated the possibility that the car would be revealed.  Given that, we can say that the answer to what becomes Q10B is given by the likelihood of the car being behind the green door AND the red door being opened, divided by the likelihood of the red door being opened: (1/12+1/12) / (1/2) = 1/3.  Alternatively, you can look at the image and see that in 1/3 of the distributions, the car is behind the green door.  This is the same as the answer to Q10, as asked.
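Here is a minimal sketch of the coin-toss version, assuming Holly opens red on heads and green on tails regardless of what is behind either door (so she may well reveal the car).  The names `coin` and `q10b` are just labels for this illustration.

```python
from fractions import Fraction
from itertools import permutations

DOORS = ("red", "white", "green")
distributions = [dict(zip(DOORS, p)) for p in permutations("MAC")]

prior = Fraction(1, 6)
coin = Fraction(1, 2)  # Holly opens red on heads, whatever is behind it

# Q9B: every distribution contributes the same 1/12.
red_opened = sum(prior * coin for _ in distributions)

# Q10B: car behind green AND red opened, divided by red opened.
car_green_and_red_opened = sum(
    prior * coin for d in distributions if d["green"] == "C"
)
q10b = car_green_and_red_opened / red_opened

print(red_opened, q10b)  # prints 1/2 1/3
```

Because the coin ignores the car, opening the red door tells us nothing about where the car is, and the 1/3 is just the original chance of the car being behind green.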

It's also the answer to a question about a prior likelihood: "What is the likelihood that the Red door will be opened AND the car is behind the Green door?"  Remember that we are now assuming that Holly opens the door on the toss of a coin.  There will still be the six possible outcomes illustrated, and in two of them the car is behind the green door.

The difference between these two scenarios is knowledge or ignorance on the part of Holly (or, more strictly, the ability of Holly to act on any knowledge she might have).  If she knows where the car is (and can act on that basis), she skews the outcome.  If she doesn't know, and opens a door randomly, she doesn't.  Note also that Q10, as asked, is a question about a prior likelihood, in other words a likelihood calculated in the presence of ignorance.  Therefore, it should be no surprise that the results align.

Let's move on.  The next step in Marilyn Gets My Goat was to reveal what was behind the red door - and it was Mary the Goat.

Q11 and Q12 were both gimmes, although Q12 might have been a bit confusing to someone who isn't great at English.

Once Mary has been revealed to be behind the red door, the trays could look like this:


This, which graphically represents the answer to Q13, is sort of what I had in mind when I was making my argument, especially in Marilyn's Six Games.  This was of course wrong.  What I had originally thought of could have more accurately been represented like this:


But the scenario I was actually discussing was pretty much this:


Q14 was merely harking back to Q6, with the same correct response.

The general consensus was that the answer to Q15 was where I truly messed up.  I'd agree if I had asked a slightly different question to the one I did: "How likely are each of these distributions now?"  The thing is, they actually are equally likely, but Mathematician and ChalkboardCowboy were leaping ahead to provide an answer to a completely different question (namely: "Given that Mary has been revealed to be behind the Red door, is it equally likely that the car is behind the green door as not?" to which the answer is no - I now agree with that answer, although I didn't at the time).

The subtlety that we were all failing to address/make clear is that yes, the two distributions are equally likely, but every single time the first distribution comes up Holly will open the red door to reveal Mary while she will reveal Mary only half of the time in the second distribution.

---

For the sake of clarity, I'll interpret the image above.  Once the red door is opened, revealing Mary (and revealing that this is a "Red Mary game"), there are two possible distributions - either the car is behind the green door, or Ava (the other goat) is.  These are the two games that Marilyn had to consider, and each of these games is actually equally likely (thus the first 1/2 used in the equations to the left and right).

However, in every instance of the first Red Mary game, Holly is obliged to reveal Mary (thus the 1/1 against Mary and the 0/1 against the car).  Therefore, the likelihood of a game in which the car is behind the green door AND Mary is revealed to be behind the red door (a Green Car Red Mary game) is 1/2*1/1 = 1/2.

In an instance of the second of these Red Mary games, Holly may open either the red door or the green door, and will toss a coin to choose which (thus the 1/2 each against Mary and Ava).  The likelihood of a game in which Ava is behind the green door AND Mary is revealed to be behind the red door (a Green Ava Red Mary game) is 1/2*1/2 = 1/4. 

In total, there is an "expected value" of 3/4 for the red door, when we are limited to Red Mary games - which is to say that if you run a sufficiently large number of Red Mary games, the red door will be opened in 3 out of every 4 instances.

The likelihood of the car being behind the green door, if the red door has been opened to reveal Mary is the expected value of Green Car Red Mary games divided by the combined expected value of both Red Mary games, so (1/2) / (3/4) = 2/3.  This is the standard answer, and the one I should have arrived at, but of course I didn't.
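The "sufficiently large number of Red Mary games" can be approximated with a rough simulation.  This is only a sketch: the function name `simulate`, the number of games and the seed are arbitrary choices of mine, and the results are estimates rather than exact values.

```python
import random

def simulate(games=100_000, seed=2015):
    """Estimate the red-door share and the car-behind-green share in Red Mary games."""
    rng = random.Random(seed)
    red_mary = red_opened = car_green = 0
    for _ in range(games):
        occupants = ["M", "A", "C"]
        rng.shuffle(occupants)
        red, white, green = occupants
        if red != "M":
            continue                      # keep only Red Mary games
        red_mary += 1
        if green == "C":
            opened = "red"                # Holly is obliged to reveal Mary
        else:
            opened = rng.choice(["red", "green"])  # car behind white: coin toss
        if opened == "red":
            red_opened += 1
            car_green += green == "C"
    return red_opened / red_mary, car_green / red_opened

red_share, car_share = simulate()
print(round(red_share, 2), round(car_share, 2))
```

The first number should land near 3/4 and the second near 2/3, which is the standard answer.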

---

Now to the other question where I thought I had screwed up, Q16.  Oh my!

I did screw it up.

There's nothing better when trying to confuse yourself (and potentially others) than asking a question that contains a false assumption.  I asked "What about the likelihood that the producer would instruct Holly to open the Red door being 2/3?"

This relates back to Q9, which I've already addressed.  The prior likelihood that the red door would be opened was actually calculated to be 1/2, not 2/3.  The 2/3 result was arrived at in response to Q10, which asked a more specific question.

Then I answered a totally different question, effectively "Given that the Red door has been opened revealing Mary, what is the likelihood that Mary is behind the Red door and that the Red door has been opened?"  I gave the right answer to this question, 1/1.

Finally, I raced on to provide the answer that applied to a scenario that not even I knew I had in my head.

Let's look at that one more time before we leave this all behind us forever:


This represents a situation in which either Holly has no knowledge of what lies behind the closed doors or is unable to use that knowledge, or Marilyn has no idea as to what is going on (see The Whole Reverse Monty Debacle) and makes the assumption of a random selection.  In other words, an ignorance-rich scenario.

In such a scenario, the likelihood of a car being behind the green door, given that the red door has been opened to reveal Mary, is 1/2 ((1/4) / (1/2) = 1/2).
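To see the two scenarios next to each other, here is a small sketch limited to Red Mary games.  The flag `holly_acts_on_knowledge` and the function names are hypothetical; the only thing that changes between the two calls is whether Holly's choice of door depends on where the car is.

```python
from fractions import Fraction
from itertools import permutations

DOORS = ("red", "white", "green")

def p_open_red(dist, holly_acts_on_knowledge):
    """Chance that Holly opens red for one distribution."""
    if not holly_acts_on_knowledge:
        return Fraction(1, 2)    # ignorance: always a coin toss
    if dist["red"] == "C":
        return Fraction(0)
    if dist["green"] == "C":
        return Fraction(1)
    return Fraction(1, 2)        # car behind white: coin toss

def p_car_green_given_red_mary(holly_acts_on_knowledge):
    """Likelihood of the car behind green, given red was opened to reveal Mary."""
    all_dists = [dict(zip(DOORS, p)) for p in permutations("MAC")]
    red_mary = [d for d in all_dists if d["red"] == "M"]
    # The two Red Mary games are equally likely (1/2 each); weight each by
    # the chance that Holly actually opens the red door in that game.
    weights = {
        d["green"]: Fraction(1, 2) * p_open_red(d, holly_acts_on_knowledge)
        for d in red_mary
    }
    return weights["C"] / sum(weights.values())

print(p_car_green_given_red_mary(True), p_car_green_given_red_mary(False))  # prints 2/3 1/2
```

With knowledge the weights are 1/2 and 1/4, giving 2/3; with ignorance they are 1/4 and 1/4, giving 1/2.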

If I had figured all this out first time around, I would have saved myself and some other, surprisingly tolerant people a lot of nausea.
---
If I am still wrong, I am not sure that I want to know ...
