## Monday, 9 February 2015

### Two Balls, One Urn, revisited

When I wrote the last post on the "Two Balls, One Urn" scenario, Lokee had a look, pointed out some typos and said "It's too confusing!"  I tried to explain a little more simply, but it might still be a little unclear.

Fortunately, as the "creator" I am not obliged to maintain precisely the same scenario to get at the point, so let me suggest a slightly different scenario:
I have an enormous barrel in a pitch black room and I know that in the barrel there are two million balls, only two of which are entirely white, the rest being of various colours, shades and patterns.  I take my urn into the pitch black room and, completely at random, I take out two balls from the barrel and place them in the urn.  Because it is so dark in the pitch black room, I cannot see either of the balls when I do this.
Then, while still in the pitch black room, I reach into the urn and, completely at random, I draw out one ball.  Because it is so dark in the pitch black room, I cannot see either of the balls when I do this.
Finally I walk out of the pitch black room and I look at the ball I drew out of the urn.  It is white.
What is the probability that the second ball - the one still in the urn - is white?

1. 1 / 3,999,999

1. Oops. Should be a 7 on the end.

2. An interesting effort, Travis. I actually think that your original 1 / 3,999,999 figure makes more intuitive sense than 1 / 3,999,997. The figure 1 / 3,999,998 could also make some sense, although it would require some stretching. How did you arrive at this figure that ends in 7?

Sadly enough though, all three figures are wrong, as is the one that would make most sense to some others - a much smaller figure of 1 / 3,999,998,000,000 (the probability of picking out two white balls from the barrel) - and the somewhat larger figure of 1 / 1,999,999 (using the logic that of the combined 1,999,999 balls remaining in both the barrel and the urn, only one is white - this is actually the probability of picking a second ball from the barrel after having removed the first one).

3. Here's how I arrived at my answer. Imagine that the two white balls are numbered 1 and 2 to distinguish them. There are 1,999,999 unique pairs that include ball 1, including the pairing with ball 2. There are then 1,999,998 unique pairs with ball 2, excluding the pairing with ball 1 - that's already in the set of possible outcomes, so it isn't unique. This means there are 3,999,997 possible pairings with one white ball. If you pulled a white ball out of the urn then you have one of those pairings, but only one of those pairings includes a 2nd white ball, thus 1/3,999,997. I'm not convinced this is wrong. Care to explain why it is?

4. That should have said "there are 3,999,997 possible pairings with AT LEAST one white ball".

5. And that explains the problem. I only calculated the probability given knowledge that there is at least one white ball in the urn, not the probability given the knowledge that a white was drawn first from the urn. So my new answer is 1/1,999,994.

6. Ah, ok, I understand your logic now. I think part of the problem is that if you are going to number the two white balls, and consider unique pairs, you might need to number all the balls and consider ALL pairings - not just the pairings with a white ball.

The total number of unique pairings is 2,000,000! - this is not an exclamation mark, it's a factorial sign meaning the number 2,000,000 x 1,999,999 x 1,999,998 x 1,999,997 .... all the way down to 1. This is an absolutely huge number! (<- that is an exclamation mark)

To give you a vague idea how massively huge this number is, if you use your computer's calculator (assuming that you are not using a supercomputer) to calculate the value, it will cause a stack overflow - and it is likely to start doing this just under 3248! (the value of which is 1.97x10^9997). I don't think we want to consider numbers that are this huge!

If we wanted to consider potential pairings, we only need to consider pairings in which at least one ball was white - which makes it 1 / 1,999,999. But this isn't the correct answer. As pointed out before, this is the probability of selecting the second white ball out of the barrel, having already removed the first one.

2. You've made me think hard to figure out where I did something wrong but the only thing I came up with was my mistake of calculating 3,999,998 / 2 = 1,999,994 in my previous comment. It was late and I was trying to do everything in my head, so I'll allow myself that one.

Regardless, your statement about the factorial is incorrect. n choose 2 is (n(n-1))/2, not n!, and n choose k is n!/(k!(n-k)!). But that doesn't even matter. Given a vat of 2 million balls containing 2 white balls, there are 3,999,997 possible pairings with at least one white ball. Given that a white ball was drawn from the urn, the pairing in the urn must be one of these 3,999,997 pairings. In other words, after we had drawn two balls and knew nothing about them, there were 3,999,997 different pairings that could have led to us seeing a white ball on our first draw from the urn. Given that this happened, the probability space is constrained to those 3,999,997 pairings. Now, take all of those possible pairings and consider all the possible sequences in which they can be drawn from the urn. The pairing with the 2 white balls has two sequences in which a white ball is drawn first (1 then 2, or 2 then 1). Every other pairing has only one sequence in which the white is drawn first. So there are 3,999,998 sequences where a white ball is drawn first and two of these occur when the pairing is for both whites, so given that we drew the white ball first the odds of the pairing being the one with both whites is 2/3,999,998 = 1/1,999,999. I would not be surprised if I did something wrong, especially given the two errors I already made, but I'm just not seeing it. Please explain the fault in what I just laid out.
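The sequence-counting argument above can be checked by brute force on a much smaller barrel. What follows is only a sketch under my own assumptions: the function name `count_white_first` is made up for illustration, balls are represented as integers with the lowest-numbered ones standing in for the white balls, and every ordered pair (ball seen, ball left in the urn) is treated as equally likely.

```python
from fractions import Fraction
from itertools import permutations


def count_white_first(n_balls, n_white=2):
    """Count the (seen, hidden) sequences in which the ball seen is white.

    Balls 0 .. n_white-1 play the white balls; the rest are coloured.
    Returns (sequences with white seen first, those where the hidden
    ball is white too, and the ratio of the two as an exact fraction).
    """
    white = set(range(n_white))
    # Every ordered pair of distinct balls is one equally likely sequence.
    white_first = [(seen, hidden)
                   for seen, hidden in permutations(range(n_balls), 2)
                   if seen in white]
    both_white = [pair for pair in white_first if pair[1] in white]
    return (len(white_first), len(both_white),
            Fraction(len(both_white), len(white_first)))


# A barrel of 6 balls, 2 of them white: small enough to enumerate fully.
print(count_white_first(6))
```

For a barrel of n balls the counts come out as 2(n-1) sequences with a white ball seen first, 2 of which have both balls white. With n = 2,000,000 that is the 2/3,999,998 figure from the comment.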

1. Yes, my factorial statement was incorrect. It was bugging me as I drove home, but even then I just thought I had got it wrong with what we should apply the factorial to (there are n-1 potential partners for any single ball). You are right that if we consider pairings then we need to be thinking about combinations or permutations, depending on whether or not the order matters.

Having had some more time to think about it (ie my original factorial debacle and your comments), I think you are right that there are 3,999,997 potential pairings with at least one white ball, of which one pairing is with another white ball. But 1/3,999,997 isn't the right answer.

The 1,999,999 figure is the most alluring - it feels right - however, this is the probability, having drawn out a white ball from the barrel, of drawing out a second white ball. When the important thing happens though, this event has already passed by ... we're really talking about the balls in terms of the urn, not the barrel (hence the title of the article). I'll leave it at that point for the moment, because I think you are almost there and I don't want to steal your thunder.

2. I'm sticking with my answer.

3. Ok, that's a fully acceptable position after having thought carefully about it and presented some arguments that appear quite sound. I want to wait a bit longer before I give the answer (my answer, at least) to see if anyone gets it. When I provide an answer we can have a lively argument as to who is right :)

4. Here's something extra to include in the mix...

A = Drew two white balls
B = The first ball I see is white after I draw two balls

Bayes theorem says that P(A|B) = P(B|A) x P(A) / P(B)

P(A): The number of possible pairs is the combination 2e6 choose 2 = 2e6! / (2! x (2e6 - 2)!) = 2e6 x (2e6 - 1) / 2 = 1.999999e12. Within that, one pair contains two whites, so P(A) = 1/1.999999e12.

P(B): The number of possible two-ball sequences is the number of permutations of 2e6 of size 2 = 2e6!/(2e6 - 2)! = 2e6 x (2e6 - 1) = 3.999998e12. Within this, 3,999,998 start with a white ball, so P(B) = 3,999,998/3.999998e12 = 1/1e6.

P(B|A): If I have two white balls, then I am certain to get one on the first draw, so P(B|A) = 1.

Plug it all in,
P(A|B) = 1 x (1/1.999999e12) / (1/1e6) = 1e6 / 1.999999e12 = 1/1999999
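For anyone who wants to check the arithmetic without rounding, here is a small sketch of the same Bayes calculation using exact fractions. The variable names are mine and simply mirror P(A), P(B) and P(B|A) as defined above.

```python
from fractions import Fraction
from math import comb

N = 2_000_000  # balls in the barrel, two of which are white

# P(A): exactly one of the "N choose 2" unordered pairs is white-white.
p_a = Fraction(1, comb(N, 2))

# P(B): of the N x (N - 1) ordered two-ball sequences,
# 2 x (N - 1) start with a white ball.
p_b = Fraction(2 * (N - 1), N * (N - 1))

# P(B|A): with two white balls in the urn, the first one seen is white.
p_b_given_a = Fraction(1)

# Bayes' theorem: P(A|B) = P(B|A) x P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)  # 1/1999999
```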

5. I just noticed that I implied that 1/1,999,999 was wrong! That would mean I was arguing against Bayes' Theorem - a very brave thing to do, don't you think? I should have said that the pairing argument isn't quite the right way to go.

I also wanted to see if anyone came up with the answer 1/2 (either it is white or it is not white). There is an argument to support this notion, but I think that you are right with Bayes' - so long as we are provided the necessary information.

I still think that the pairing idea is potentially confusing. Part of what I wanted to get at was that the likelihood of picking two white balls from the barrel was remote (close to 1 x 10^-12) but once you have one white ball out, the likelihood of having selected two is significantly greater. The other part is that if I had no information about the selection process, I could only presume that the likelihood was 1/2 (because the likelihood of selecting a white ball is 100% if there are two and 50% if there is only one). This leads to the bizarre situation that I actually started my thinking with:

If I have a barrel with one million balls in, only one of which is black, and my friend (who knows nothing of the distribution of balls in the barrel) selects balls at random until 999,999 are removed and they have all been white - what is the probability that the last one is black? Of course, I know that the probability is 100%. For my friend, however, the probability is far lower - what are the chances of selecting 999,999 white balls in a row from a barrel of one million balls of which one is black?

Well, that's 1 in one million. However, my friend knows nothing about the distribution of balls in the barrel, and the colour of balls that are not white is not limited to black (unless Henry Ford was producing them ...) Perhaps my friend knows that there are a million different colours and patterns of balls, and that they are equally likely. Then my friend is going to have to assess the likelihood as 1 in one million million (formerly known as a billion, now usually as a trillion or, possibly, a thousand billion).

This actually represents the likelihood that 1) I selected at random a black ball from the range of balls available and 2) my friend selected at random the 999,999 white balls before reaching the last, non-white ball. What I and my friend are doing is looking at the same problem from different sides of an information firewall.

In terms of many of the articles here, the question is: given that we exist here on this planet, what is the likelihood that we should exist here on this planet? My argument would be that, given that we are in possession of the key fact (our existence is the last remaining ball in the barrel), the likelihood is 100% - and we don't have any ignorant friends who need to work out whether we exist or not! Note that this 100% cannot be diminished in any way irrespective of how many different scenarios there might be in which we might not have existed. (And yes, I know that this is a sledgehammer approach to a rather minor issue - sadly though, many people struggle with the probability of large numbers.)

6. I don't know whether this is a coy attempt at avoiding a mea culpa, but I'll give you the benefit of the doubt. If that isn't what this is then you certainly had me confused.

I think that the pairing explanation is both illuminating and valid. Note that P(A) in the Bayesian calculation is found through "n choose 2", which is the same as the pairing I outlined, and P(B) is found through permutations, which is the same as the sequencing that I used to fix my original mistake. Not to mention the fact that I was able to do it all in my head (with the acknowledged errors). I'm almost certain that I couldn't have run the Bayes calculation in my head and so I didn't even try.

Regarding the bizarre situation, it sounds a lot like an attempt to put Descartes' "I think, therefore I am" into probabilistic terms but I'm not sure whether it's an accurate translation, nor whether it's a useful way to explain something. Have I completely missed the point again?

7. Would you say then that, given all the facts, the chance of any event happening that in fact occurred was 100%? I've thought about this before as well. How would you square this up with the Copenhagen interpretation of quantum physics (or any interpretation that allows for true randomness or unpredictability)? Would you say that, given the fact of a wave function collapse, there was a 100% chance of that occurring?

-B

8. Travis R, I certainly made a mistake in my wording and about the factorials (and I also had the numbers out by one while trying to salvage the factorial idea), so there are plenty of mea culpas to go around. I think we humans have difficulties dealing with very large numbers and I certainly don't consider myself immune to that. It's not really about cogito ergo sum, but about the Anthropic Argument and the Fine-Tuning Argument. The first suggests that we are here no matter how unlikely that might have been and if we weren't here there would be no us to be amazed at how unlikely it would be for us to be here if it weren't for the fact that we are. The second suggests that the range of conditions in which life might exist and evolution might eventually lead to us humans is so narrow that it is nigh on impossible for humans to have arisen as no more than a coincidence (to which a theist bolts on an appeal to ignorance and adds his or her god of choice and a non-theist might suggest a multiverse, an eternal but cyclical universe or an underlying rule set that makes the conditions of our universe necessary rather than contingent).

I think we can arrive at the 1,999,999 figure in our heads without Bayes and without pairs. We have one white ball, the other ball has to be one of the remaining 1,999,999 balls, one of which is white, therefore there's a likelihood of 1 out of 1,999,999 that the other one in the urn is white. It's really that simple - the other figures you arrived at were purely due to you thinking of pairs, which brings in considerations of combinations and permutations. When we go through the formal process using the Bayes' Theorem and we arrive at the same figure that was in our heads, then we know we got it right (however we managed to arrive at it).

B.

Post facto, the likelihood of any event having happened, given that it happened, is 100%. Ante facto (before the fact), we might have no idea what the likelihood of the event happening is. That said, these are epistemological questions, questions about what we might know and the uncertainty of our knowledge. It could be that the "real" likelihood of any event happening (a likelihood of which we have absolutely no idea) is 100% - because the universe might be entirely deterministic and everything that has happened from the first fluctuation that led to the Big Bang might have been unavoidable; it's just a long, hugely complicated chain of cause and effect. This is not, however, the Copenhagen interpretation, but then the Copenhagen interpretation isn't the only interpretation; there are other interpretations that have everything being entirely deterministic at a "layer of reality" that we don't have access to (due to "hidden variables"). Personally, I am more sympathetic to a hyper-deterministic interpretation, in which if we could only see and understand everything and could avoid the problems associated with all the processing required (both in terms of time and complexity), then we would be able to track the entire development of the universe from before the Big Bang and into the future, perhaps to infinity, perhaps until the end of this cycle and perhaps to a final death of the universe.

So, in brief, yes, I do personally think that it is more likely that if a wave function collapses, then there was a cause to that collapse even if we don't have access to that cause and the likelihood of that collapse is probably 100% (although we are on the other side of the metaphorical information firewall - so to us, the likelihood is somewhat lower).

9. Yes, from what you were saying it sounded like you might be a hardcore determinist. I tend to sympathize with that view. And I wasn't trying to imply that the Copenhagen interpretation of quantum physics was the only one; however it seems to be what most people accept, so I was just wondering your views on that. I am currently reading a book by the cosmologist Lee Smolin which strongly criticizes the deterministic view of the universe. It's called the Fabric of the Cosmos, and he basically argues that the laws of physics are not set and that the idea of determinism flows from Plato to Newton and errs in describing a perfect, mathematical, static universe. He prefers a universe that is much more dynamic (he thinks we should look at it more biologically than in terms of physics). Anyway, interesting read if you ever have the time.

I do have to ask you and Travis how you think the 1 / 1,999,999 figure is correct regarding the thought experiment. It seems to me that this is only right if you assume that the revealed white ball was the first ball drawn. If you drew the revealed white ball on your second draw, the chances that you drew a white ball on your first draw were quite a bit better, at 1 / 1,000,000. That is how I reached the 1 / 1,499,999.5 result. Thoughts?

-B

10. B,

In my scenario, it doesn't matter which ball was selected first from the barrel, because they are both put into the urn and effectively randomised, since one ball is selected from the urn completely at random. We don't know which was taken out of the barrel first, the one in my hand or the one still in the urn. It would appear that it does not in fact matter which was selected first.

If it did, then we would be in the bizarre situation of having a different likelihood arising if I selected a ball from the urn and put it in my pocket so that I could not see it and then looked at the ball still in the urn - even though effectively all I was doing was swapping a hidden ball in an urn with a hidden ball in my pocket.

I wonder whether Smolin criticises determinism at the same level that I am considering - there are certainly problems with the perfection of "macro-determinism", for example with weather prediction, but this is because we simply don't have all the data and because a huge number of tiny localised events all play their part with the influence of those events interacting with each other and either attenuating or magnifying over time (not all butterflies cause tornadoes). Perfect macro-determinism does have its faults. That said, Smolin does have some rather edgy theories, so he could be saying that determinism at the micro scale is also problematic even if he might not hold with the Copenhagen interpretation (I am not sure whether he does or not).

11. Consider this:

You draw a ball from the barrel and without looking at it, mark it with a 1. You then place this ball into the urn. You draw a second ball from the barrel and without looking at it, mark it with a 2. You then put this ball in the urn. If you draw a white 1, then you know that the chances that you drew a white 2 were only 1 / 1,999,999, as on your second draw there was only one white ball still in the barrel. However, if you draw a white 2, then you know that the chances of there being a white 1 was 2 / 2,000,000 (1 / 1,000,000), as there were 2 white balls in the barrel when you made your first draw. However, we are not privy to this information, and so we must take the average of these two probabilities.

"If it did, then we would be in the bizarre situation of having a different likelihood arising if I selected a ball from the urn and put it in my pocket so that I could not see it and then looked at the ball still in the urn - even though effectively all I was doing was swapping a hidden ball in an urn with a hidden ball in my pocket."

I'm not sure that I follow this, unless you meant to say that if you selected a ball from the barrel and put it in your pocket, and then selected a ball from the barrel and put it in the urn, looking into the urn would change the probabilities, which it would for the same reason as I outline above. If I'm off, would you mind explaining this to me? Thanks

I think that Smolin's criticism of determinism is different than yours. From what I can tell, he argues that even with all of the information and processing power needed, the future of the universe cannot be known. He thinks that the universe is a self-organizing system with feedback loops, constantly shifting and building upon itself in a way that cannot be calculated (at least in terms of following set mathematically-describable laws). He thinks these physical laws are simply shorthand approximations that are useful yet bound to become inaccurate at a certain degree of precision. Note that this is at the deepest level (ontologically), not the epistemic level. And yes, he has a book or two specifically dedicated to quantum physics, but I have not yet read those, so I don't know his views on the matter.

-B

12. In retrospect I'm seeing how this turned into an interesting psychology experiment. You're right that there is actually a very simple way to arrive at 1/1,999,999, but I was convinced by the hints toward a non-intuitive answer that I needed to think carefully and reject the apparent easy answers. I would not even be surprised if this subconsciously influenced my math error where I reduced 2/3,999,998 to 1/1,999,994 instead of 1/1,999,999. That's all beside the point but it's interesting to look back and see how my intuitive sense that the solution was non-intuitive led to avoidance and rejection of the simple solution. I had to work through several calculations before I was convinced that it actually was correct. So I'll give you the benefit of the doubt. Re-reading the comments I can see how this happened and how I was also at fault for reading too much into the language. Thanks for the exercise.

Regarding the anthropic / fine-tuning question, I still don't totally follow the analogy, but I thought you might be interested in a post I interacted on not long ago which is reminiscent of what you were trying to do here.
https://anaivethinker.wordpress.com/2014/10/25/house-of-probability-a-puzzle/

13. Travis,

Thank you for getting involved :) I'll take a look at that other puzzle as well.

B,

We don't actually care which of the white balls we have. What we do know is that when we have a white ball in our hand, there is only one white ball left out of the potential 1,999,999 balls that were in the barrel. The 1 in 1,000,000 chance that we would select a white ball (any white ball) is gone and dealt with when we look at a white ball in our hand. The likelihood of having selected a white ball is now 100%.

How about we forget about white and coloured balls and think of 2,000,000 uniquely numbered balls. We follow the same procedure and find that we have ball #2,000,000 in our hand. What is the likelihood that the other ball in the urn is a specific number ball, say #1? It has to be 1 in 1,999,999, because we know that it can't be #2,000,000 - so it has to be one of the balls numbered 1 through 1,999,999. The same applies if we pick out any numbered ball and think of the likelihood of the other ball having any other specific number. And the same applies if we are thinking about an unnumbered ball that is white and about the single remaining white ball from a selection of 1,999,999 balls.
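The numbered-ball version is easy to enumerate for a small barrel, and the enumeration can also keep track of whether the ball in hand left the barrel first or second. This is a rough sketch under my own assumptions: `other_is_target` is a name I made up, the barrel is shrunk to a handful of balls, and every ordered draw from the barrel is taken to be equally likely, with the pick from the urn independent of the order.

```python
from fractions import Fraction
from itertools import permutations


def other_is_target(n_balls, in_hand, target):
    """P(other urn ball is `target`), given ball `in_hand` is in the hand.

    The result is split by draw order: key 0 means the ball in hand was
    the first one out of the barrel, key 1 means it was the second.
    """
    results = {}
    for hand_position in (0, 1):
        total = matches = 0
        # Each ordered draw of two distinct balls is equally likely.
        for draw in permutations(range(1, n_balls + 1), 2):
            if draw[hand_position] != in_hand:
                continue  # not consistent with the ball we are holding
            total += 1
            if draw[1 - hand_position] == target:
                matches += 1
        results[hand_position] = Fraction(matches, total)
    return results


# Six numbered balls, holding #6, asking whether the other one is #1.
print(other_is_target(6, in_hand=6, target=1))
```

In this toy barrel both entries come out as 1 in (n - 1), which corresponds to 1 in 1,999,999 for the full barrel whichever ball came out first.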

I'm not that comfortable with "the universe (as) a self-organizing system with feedback loops". It sounds far too much like a designed artefact for my liking. Perhaps it's just a fancy way to talk about emergence and equilibria within complex systems, both which I am quite comfortable with. I'll keep an eye out for his books at the hopelessly inadequate bookshops in my area.

14. I looked at that other puzzle and my comment is awaiting moderation. In the meantime, here it is:

This is quite an interesting argument, but … I think that you have problems when trying to apply it to the Earth. The major problem you have could be highlighted by considering the two selection scenarios – random and favoured. You are suggesting that there is no indication whatsoever as to what sort of selection was made, but if we were to compare that to the question of whether the Earth is random or favoured, then your scenario should include a wealth of hints as to the method of selection – just as there are hints on this planet that indicate that we are either here as part of a natural (undirected) process or as part of a divine plan. We just need to think very carefully about the hints to see if there is something conclusive one way or the other.

We should also be looking at those hints before trying to assess the likelihood of our existence. I note that we could make predictions based on the divine plan theory, but I suspect that these predictions would often fail and this won’t upset the predictor since he or she will simply put that down to not really knowing the divine plan.

And that, I suppose, is another problem. The divine plan, if there is one, is so sprawling and unpredictable as to be indistinguishable from undirected developments. Perhaps there is actually some cosmically obscure divine plan, but we will never know it in this lifetime, and the existence of such a divine plan is totally useless for the purpose for which it is normally employed – proving the existence of a god. The best we can really say is that we don’t know whether there is a divine plan or not (and consequently we don’t know whether there is some sort of god or not), even if I personally suspect not.

15. Neopolitan,

Thanks for the response. I am still convinced by my reasoning, so I will try to address your points. You say that as soon as we see a white ball revealed in our hand, the 1 / 1,000,000 chance is gone (as it collapses to a 100% probability). This is where I beg to differ. It could just as well be that the 1 / 1,999,999 chance is gone, while the 1 / 1,000,000 chance survived. That was my whole point with why the draw order actually matters, and since we do not know that order, why we must take the average. Ask yourself this: if you are given the information that on your second draw, you drew a white ball, what are the chances that you drew a white ball on your first draw? I think that you should find that the answer is 1 / 1,000,000.*

The result I just reached, 1/1,999,999.5, applies with equal force to your proposed hypothetical. The chances of drawing chosen number 1 on the first draw are 1 / 2M. The chances of drawing that same chosen number on draw 2 are 1 / 1,999,999. So knowing only that you picked chosen number 2, the chance of you having picked chosen number 1 is the average of the chances that it was picked either on the first draw or on the second draw. This comes to 1 / 1,999,999.5.

As for Smolin, no he is not trying to make an artifact of the universe. In fact, he thinks that if the universe is self organizing, there is no need for an organizer. He thinks that those who propose a Platonic, Newtonian universe have more need for a creator, fine tuner, law giver, etc.

-B

16. An interesting psychological fact is that we tend to find our own explanations far more convincing than those of others. So one of us might be hanging onto an explanation because of a feeling of ownership :) I do have the Bayes Theorem behind me, so I am quite confident. The challenge, therefore, is to highlight the error in your thinking in a way that doesn't trigger your ownership issues. A very positive thing is that you have identified flaws yourself and have been able to make a move based on those flaws. Not everyone is so open to change.

I think that there is a misunderstanding about the scenario in which the barrel contains 2 million numbered balls. I did say that they are uniquely numbered, so once you have removed a ball, there is no other ball with the same number. What we know is that once a ball is removed, there are (2M-1) balls left in the barrel, so the likelihood of the second ball being selected (a priori), no matter what number it has, is 1 in (2M-1).

Let me come at it from a slightly different angle: say that we start by turning the lights on in the pitch black room and you search through the barrel of multicoloured balls for one of the two white balls. You deliberately take out the white ball that you eventually find and give it to me. I put it in my pocket. Then you leave to turn out the lights and I stir up the balls in the barrel before removing one ball, completely at random, and putting it in the urn.

What is the likelihood that the ball in the urn is white?

Or say that we do it this way. I go into the pitch black room with the lights out and I select one ball, completely at random, and I place it in the urn. Then you turn on the lights and enter the room. You go to the barrel and keep taking balls out of the barrel, one by one, until you find a white ball. You give me the white ball, which I put into my pocket, and you return all the other balls to the barrel.

What is the likelihood that the ball in the urn is white?

In both instances, the likelihood that the ball in the urn is white is 1 in 1,999,999 - because we know that the ball in the urn is one of the 1,999,999 balls that are NOT in my pocket, only one of which is white.

17. "I think that there is a misunderstanding about the scenario in which the barrel contains 2 million numbered balls. I did say that they are uniquely numbered, so once you have removed a ball, there is no other ball with the same number. What we know is that once a ball is removed, there are (2M-1) balls left in the barrel, so the likelihood of the second ball being selected (a priori), no matter what number it has, is 1 in (2M-1)."

Yes but in your example, it was not stated that the ball with the number 2M was drawn first. If this was known, then I would definitely agree with you that the chances of drawing any specific number on the second draw would be 1/1,999,999. This simply follows from the fact that there are 2M - 1 balls left. But wouldn't you admit that on the first draw, the chance of getting a ball with the other number on it was 1/2M?

"Let me come at it from a slightly different angle: say that we start by turning the lights on in the pitch black room and you search through the barrel of multicoloured for one of the two white balls. You deliberately take out the white ball that you eventually find and give it to me. I put it in my pocket. Then you leave to turn out the lights and I stir up the balls in the barrel before removing one ball, completely at random, and putting it in the urn.

What is the likelihood that the ball in the urn is white?"

Yes, the chance here again is of course 1 / 1,999,999. But this is also different than the proposed hypothetical because again, here we have knowledge that a white ball was drawn *on the first draw.* This is significant information.

"Or say that we do it this way. I go into the pitch black room with the lights out and I select one ball, completely at random, and I place it in the urn. Then you turn on the lights and enter the room. You go to the barrel and keep taking balls out of the barrel, one by one, until you find a white ball. You give me the white ball, which I put into my pocket, and you return all the other balls to the barrel.

What is the likelihood that the ball in the urn is white?"

Here, the chance is 2 / 2M (1 / 1M). I'm glad you brought up these hypotheticals, because they did get to the heart of the issue. When you drew from the barrel, there were 2 white balls in it out of 2 million balls. My finding a white ball in the barrel and giving it to you doesn't add any new information and doesn't change the original odds you had when you drew. So while we do know that the ball in the urn must be only one out of the 1,999,999 balls not in your pocket, the chances of you having drawn it were quite a bit higher than this number.

Consider a barrel of 3 balls. 2 are white. You first randomly draw a ball and put it in the urn. I then remove a white ball from the barrel. Would you really say that the chances of the ball in the urn being white are 1/2? I would say the chances are surely higher, at 2/3. I think this is actually a rather testable scenario, if it comes to that ;).
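
The three-ball case above is small enough to check by exhaustive enumeration rather than by sampling. The sketch below assumes that the helper looks into the barrel and removes a white ball on purpose, which is always possible here because at most one white ball can be in the urn. The function name `urn_white_after_known_removal` is made up for illustration.

```python
from fractions import Fraction

def urn_white_after_known_removal(total, white=2):
    """P(urn ball is white) when one ball is drawn blind, then a helper
    deliberately removes a white ball from the barrel.

    Assumption: the helper can always find a white ball (white >= 2), so
    no first draw is ever excluded and the removal carries no information.
    """
    balls = ["W"] * white + ["B"] * (total - white)
    favourable = 0
    for first in range(total):          # each first draw is equally likely
        barrel = balls[:first] + balls[first + 1:]
        assert "W" in barrel            # the helper never fails
        if balls[first] == "W":
            favourable += 1
    return Fraction(favourable, total)

print(urn_white_after_known_removal(3))    # 2/3
print(urn_white_after_known_removal(10))   # 1/5
```

Under that assumption no first draw is ever thrown away, so the answer stays at white/total.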

-B

18. PART ONE:

First I must say that this is a fascinating discussion - even if that might just be the uber-nerd in me speaking! Rarely do people take the time to both think things through and defend their position rationally.

I agree wholeheartedly that the likelihood of selecting the first numbered ball, irrespective of what number it actually was, given that I picked it completely at random, was 1 in 2M. However, once I've selected that ball, the "epistemological probability density function" collapses and the likelihood that I HAVE selected that ball is 100%. And therefore the likelihood of selecting a particular ball from the remaining balls is 1 in (2M-1), on which we now seem to have agreement. It's possible that your agreement wavers in your head; this is perfectly normal (and I have the same issue as well). It's probably due to our different thinking processes fighting it out, per Daniel Kahneman's "Thinking, Fast and Slow". Our quick thinking has us easily believing whatever we thought of first, while our slow thinking eventually arrives at the right answer but at a cost of cognitive load.

"Here, the chance is 2 / 2M (1 / 1M). I'm glad you brought up these hypotheticals, because they did get to the heart of the issue. When you drew from the barrel, there were 2 white balls in it out of 2 million balls. My finding a white ball in the barrel and giving it to you doesn't add any new information and doesn't change the original odds you had when you drew. So while we do know that the ball in the urn must be only one out of the 1,999,999 balls not in your pocket, the chances of you having drawn it were quite a bit higher than this number.

"Consider a barrel of 3 balls. 2 are white. You first randomly draw a ball and put it in the urn. I then remove a white ball from the barrel. Would you really say that the chances of the ball in the urn being white are 1/2? I would say the chances are surely higher, at 2/3. I think this is actually a rather testable scenario, if it comes to that ;)"

The smaller barrel is a good way of thinking: remove the cognitive load of large numbers and look only at the principles. What you arrive at though is ... something that looks like a variation of the Monty Hall problem. And people often get the Monty Hall problem wrong! I spent weeks arguing it with someone on an internet chat channel. Each time we finished, he seemed to be coming over to the right answer, then he'd go away, his quick thinking would reassert itself and the next time we met, he would be arguing for the wrong answer again. I don't think he was ever truly convinced; I think he just gave up. From the sound of it, you would get the right answer for the Monty Hall problem, but in this case, the situation is a little different.

Let's put it in terms that are equivalent to the Monty Hall problem, and hopefully you can see that the scenario looks similar but isn't:

There are three doors, you choose two doors. Monty Hall opens one of the doors that you selected, revealing a goat, and then asks you whether you want to switch.

And the Monty Hall problem:

There are three doors, you choose one door. Monty Hall opens one of the doors that you didn't select, revealing a goat, and then asks you whether you want to switch.

In the latter, the likelihood of getting the car with a switch is 2/3. In the former, the likelihood of getting a car with a switch is 1/2 (i.e. 1 in (N-1), where N is the number of items selected from).

19. PART TWO:

Working this through:

With the Monty Hall problem the likelihood that you selected the right door is 1/3 and the likelihood that you selected the wrong door is 2/3. After Monty reveals a goat, you are twice as likely to get a car with a switch.
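
The 2/3 figure for the standard game can be confirmed by enumerating every case exactly. This is only a sketch under the usual assumptions: the car is equally likely to be behind any of the three doors, and the host knowingly opens a goat door that the contestant did not pick. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def monty_hall_switch_win():
    """Exact P(win by switching) in the standard Monty Hall game.

    Assumptions: the car is uniform over 3 doors, and the host knowingly
    opens a goat door that the contestant did not pick.
    """
    wins = Fraction(0)
    for car, pick in product(range(3), repeat=2):   # 9 equally likely cases
        goats = [d for d in range(3) if d != pick and d != car]
        for opened in goats:            # host chooses uniformly if he has a choice
            switch_to = next(d for d in range(3) if d not in (pick, opened))
            if switch_to == car:
                wins += Fraction(1, 9) / len(goats)
    return wins

print(monty_hall_switch_win())   # 2/3
```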

With the modified version, the likelihood that you selected the right door is 2/3 and the likelihood that you selected the wrong door is 1/3. The effect of Monty revealing a goat is to reduce the number of doors available to you - but always by removing a door hiding a goat (a goat which you already knew that you had - so you aren't really getting any new information). There remains a 1/3 likelihood that the door you still have is hiding a goat - exactly the SAME likelihood associated with the door you didn't select, so there is a 50-50 chance of benefiting from a switch - meaning that the likelihood of either door hiding a goat is 1/2.

Now, I am certain that this is right, but I am not absolutely and totally convinced that the reason that I am certain of this is because I've properly worked out the figures, or for the same reason that most people think that the answer with the Monty Hall problem is 50-50, rather than 66.7-33.3.

I'm going to have to leave it for the moment and ponder things to see if there is a good way to show that my reasoning is correct.

20. I've pondered it - a couple of hours of severe cognitive dissonance! I've posed the problem formally, as the Reverse Monty Hall problem in a later article. Before I did so, however, I wrote the solution - because it kept slipping away from me.

21. Hey Neo,

I'm glad that you are enjoying the discussion. I am too. And I am still not so sure that you are correct!

Anyway, I see that you have analogized the three-ball scenario with a reverse Monty Hall scenario. I had not even thought of that until you brought it up. That being said, I do not think that we need to go "reverse Monty" to make use of the analogy. Consider: There are 3 doors each with a ball behind it (i.e., 3 balls in the barrel), 2 have a white ball behind them and one has a black ball. You pick a door (a ball from the barrel then placed into the urn), and I then open a door with a white ball behind it (I remove a white ball from the barrel). The question is, if you want the black ball, should you switch? I do realize that our original question is one of wanting 2 white balls, so simply reverse the ratios; I don't think a full reverse Monty scenario is needed.

The answer is that you should switch. 2/3 times you will have picked a white ball on the initial draw (You had only a 1/3 chance of getting black on your first pick), so 2/3 times you will benefit by switching. Reversing this, if you want the white ball, 2/3 times you will have picked it on your first draw, so you should not switch. It follows from this that the 2/3 chance of white, not the 1/2, is correct.

Extrapolating out to the 2M ball scenario, as stated here:

"Or say that we do it this way. I go into the pitch black room with the lights out and I select one ball, completely at random, and I place it in the urn. Then you turn on the lights and enter the room. You go to the barrel and keep taking balls out of the barrel, one by one, until you find a white ball. You give me the white ball, which I put into my pocket, and you return all the other balls to the barrel. What is the likelihood that the ball in the urn is white?"

Applying the Monty principle, and assuming that you want a black ball, if I reveal to you a white ball, you will be better off switching balls, as the odds of you picking a white ball on your first draw were 1 / 1M, but the odds of you picking a white ball after the reveal are only 1 / 1,999,999. It is only a slight increase to your odds of getting a black ball: 99.99995% (with reveal) v. 99.9999% (initial draw), but it still increases your odds. The flip side of this, then, is that your odds of picking a white ball on your first draw were higher than they would be on your second pick, once I have revealed the other white ball.

As an aside, having read over your reverse Monty scenario, I agree with the majority that you should stay with your initial selection, as 2/3 of the time you will be correct. You should not switch, as you will only win 1/3 of the time by switching.

-B

22. Would you mind terribly if I put all my eggs in one basket (what, no, another container!?!? ... why can't I just put all my balls in one barrel?) and showed that in the reverse Monty Hall problem the answer is 1/2 (= 1 / (N-1), where N=3)? If I can show that, would that proof then filter back into this vexed question, allowing us to agree that the likelihood of the ball in the urn being white is 1/1,999,999 (1 / (N-1), where N=2 million)?

23. To be quite honest with you, I am having trouble drawing a perfect analogy between the reverse Monty and the barrel situation. I found it easier to think of the typical Monty Hall scenario. I will put it this way: If you can show me how the reverse Monty problem is a 1:1 analogy of the barrel situation (or at least a relevant part of it), and if you can show that the answer is 1/2 in the reverse Monty situation, then I will have to agree with you!

-B

24. That's fair enough. I'll put paid to the Reverse Monty and then try to put something together to either show that these scenarios are analogous or to show that the probability of the ball in the urn being white is 1 / 1,999,999.

25. Ok,

I have managed to simulate the ball draw in Python. As it turns out, begrudgingly, you are right about the probability of the remaining ball being white. Given 2 white balls, the probability of the ball remaining in the urn being white, given that the revealed ball is white, is, as you said, 1/(n-1) where n = total balls. My two approaches are either too high or too low. Yours is always closest to the test. Here is a printout of the results. I ran it with 10 balls in the barrel, 2 of them being white. Under the Anticipated section is just what you and I predicted the results to be. Here are the results:

Given a white ball revealed, % of times other ball was white = 11.008%

Anticipated:
B: ORIGINAL RESPONSE
B: (White/Total + White-1/Total-1) / 2
B: (2/10 + 2-1/10-1) / 2
B: (2/10 + 1/9) / 2
B: (20.0% + 11.111%) / 2
B: 31.111% / 2
B: 15.556%

B: REVISED RESPONSE
B: (White-1/Total + White-1/Total-1) / 2
B: (2-1/10 + 2-1/10-1) / 2
B: (1/10 + 1/9) / 2
B: (10.0% + 11.111%) / 2
B: 21.111% / 2
B: 10.556%

Neo: White-1/Total-1
Neo: 2-1/10-1
Neo: 1/9
Neo: 11.111%

Stats:
Total balls = 10. White balls = 2 (2/10 or 20.0%).
Remaining ball white = 2187 (11.008%). Remaining ball black = 17680 (88.992%)
Total Iterations = 100000. White ball revealed = 19867 (19.867%)
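
B's script is not shown in the thread, so what follows is only a minimal sketch of how such a simulation might look, not a reconstruction of the original. It assumes two balls are drawn blind into the urn, one of the two is revealed at random, and only runs in which the revealed ball is white are counted. The function name `simulate` and the fixed seed are illustrative choices.

```python
import random

def simulate(total=10, white=2, iterations=100_000, seed=0):
    """Estimate P(other ball white | revealed ball white) by sampling."""
    rng = random.Random(seed)
    barrel = ["W"] * white + ["B"] * (total - white)
    revealed_white = both_white = 0
    for _ in range(iterations):
        # sample() returns the two balls in random order, so treating the
        # first as the revealed ball is a uniform blind reveal.
        revealed, remaining = rng.sample(barrel, 2)
        if revealed == "W":             # keep only runs where white is seen
            revealed_white += 1
            both_white += remaining == "W"
    return both_white / revealed_white, revealed_white

estimate, kept = simulate()
print(f"white revealed in {kept} runs; other ball white in {estimate:.3%} of them")
```

With 10 balls the estimate should land near 1/9, in line with the figures reported above, though the exact digits depend on the seed.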

So, the results don't lie. But I am still trying to wrap my head around it. Oh, and don't let this get to your head, I still believe that you are wrong about the Reverse Monty. Also, when you proposed:

"Or say that we do it this way. I go into the pitch black room with the lights out and I select one ball, completely at random, and I place it in the urn. Then you turn on the lights and enter the room. You go to the barrel and keep taking balls out of the barrel, one by one, until you find a white ball. You give me the white ball, which I put into my pocket, and you return all the other balls to the barrel. What is the likelihood that the ball in the urn is white?"

I am still convinced that the result of this is 2/n.

-B

26. Thanks B. I very much appreciate your effort in working it through and I will try my utmost to not let it go to my head.

Did you see my response to someone in which I worded the Reverse Monty Hall problem so that it would almost exactly match the Two Balls, One Urn scenario? - oddly enough though, I can't find it now.

Can you run the scenario with n=3, just for kicks?

Then all I need to do is convince you that this scenario is essentially the same as the Reverse Monty Hall Problem :)

After that, the world ...

As to the other variation, that was just an attempt to convince you of the 1/(n-1) answer. I'm pretty sure that 1/(n-1) is right, but at this stage it doesn't matter. Its purpose has been served by your python coding.

27. Sure thing:

Given a white ball revealed, % of times other ball was white = 50.371%

Anticipated:
B: ORIGINAL RESPONSE
B: (White/Total + White-1/Total-1) / 2
B: (2/3 + 2-1/3-1) / 2
B: (2/3 + 1/2) / 2
B: (66.667% + 50.0%) / 2
B: 116.667% / 2
B: 58.333%

B: REVISED RESPONSE
B: (White-1/Total + White-1/Total-1) / 2
B: (2-1/3 + 2-1/3-1) / 2
B: (1/3 + 1/2) / 2
B: (33.333% + 50.0%) / 2
B: 83.333% / 2
B: 41.667%

Neo: White-1/Total-1
Neo: 2-1/3-1
Neo: 1/2
Neo: 50.0%

Stats:
Total balls = 3. White balls = 2 (2/3 or 66.667%).
Remaining ball white = 33646 (50.371%). Remaining ball black = 33151 (49.629%)
Total Iterations = 100000. White ball revealed = 66797 (66.797%)
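
For small barrels the simulated percentages can be compared against an exact count. The sketch below assumes every ordered pair (revealed ball, remaining ball) is equally likely, which is what a blind draw of two balls followed by a blind reveal amounts to. The function name is made up for illustration.

```python
from fractions import Fraction
from itertools import permutations

def other_white_given_white_revealed(total, white=2):
    """Exact P(remaining ball white | revealed ball white)."""
    balls = ["W"] * white + ["B"] * (total - white)
    revealed_white = both_white = 0
    # Every ordered pair of distinct balls is an equally likely outcome.
    for revealed, remaining in permutations(range(total), 2):
        if balls[revealed] == "W":
            revealed_white += 1
            if balls[remaining] == "W":
                both_white += 1
    return Fraction(both_white, revealed_white)

for n in (3, 10):
    print(n, other_white_given_white_revealed(n))   # 3 1/2, then 10 1/9
```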

And no, I missed that post, but if you find it let me know.

- B

28. Thanks again B, I found it. I thought it was at reddit, but no, it was one of the multitude of people called "anonymous" who have shown an interest - another polite one, so I don't really mind.

It's here -> http://neophilosophical.blogspot.com/2015/02/the-reverse-monty-hall-problem.html?showComment=1424353136062#c8313291746556958190

Please let me know if that convinces you. I think it should, but perhaps I am missing some subtlety :) I promise, yet again, to not let it swell my head. According to quite a few people recently, my head is already dangerously swollen.

29. I looked it over. It seems to me that the Monty Ball (mind if I call it that?) scenario is different than the Monty Hall. I know you have been told this before. The key difference is that you are throwing out the cases where the car (black ball) is in your hand. While you are right that this scenario could not occur in your reverse Monty Hall, what would actually happen, to parallel the reverse Monty Hall, would be a switching of the black ball with the white ball that is in the urn.

I know that this is why you are saying that the key to the scenario is that 'the door is already open.' In other words, you are trying to model the reverse Monty Hall in such a way that each iteration is only one of the instances where the car is not behind door x (x being your hand). The problem with this is that, while the scenario you are modeling can arise in the game, so can its flip side (the car being in your hand, in which case the host will reveal the goat (white ball) in the urn). So you have artificially eliminated this scenario.

In effect, what you are defending is a reverse Monty Hall scenario where the car is behind door number y or z only. You are throwing out all cases where the car is behind door number x. As such, what you are left with is a 50% chance that the car will be behind door y, and a 50% chance that the car will be behind door z (as you have tossed all the scenarios where the car was behind door x). So this being the case, you are correct in saying that a switch will only benefit you 50% of the time.

Now again, I realize that this is why you keep speaking of the probability 'once the door has been opened.' To this extent, you are somewhat correct. If you knew at the beginning of the scenario that there was a door (door x) that the car could not be behind, and then treat the host's opening the door as revealing to you which door was the door the car could not be behind, then your 50% probability is right. In a very loose sense, you might argue that this is a priori knowledge (knowledge you have before the experiment): The contestant does indeed know that the host is going to reveal a door, and that the car cannot be behind this door. However, this knowledge is different than the knowledge that there is a door that the car cannot be behind. Therefore, when the host opens a door, you are not gaining the knowledge that "the door the host opened was the door that the car could not be behind" (but if you were, then switching would have a 50% success rate). Instead, you are gaining no new useful information. The information you are getting is that "the car was not behind the door the host opened." But you knew that this would happen before the experiment.

So, the reverse Monty Hall is the same as the Monty Ball only to the extent that in the reverse Monty, you are artificially throwing out 1/3 of the cases (cases where the car or black ball is behind door x [your hand]).
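
The difference between keeping every case and discarding some of them can be made concrete by enumerating the reverse game under two reveal rules. This is a sketch under stated assumptions, not a ruling on which rule the original puzzle intends: the contestant holds doors 0 and 1, the car is uniform over three doors, and `host_knows` is a hypothetical flag that switches between a host who deliberately opens a goat and a blind reveal where runs exposing the car are discarded. The blind rule mirrors drawing one of the two urn balls in the dark and keeping only the runs where it turns out white.

```python
from fractions import Fraction

def stay_wins(host_knows):
    """Exact P(the door you still hold hides the car | a goat was shown).

    Assumptions: you hold doors 0 and 1; the car is uniform over 3 doors.
    host_knows=True : host deliberately opens a goat among your two doors.
    host_knows=False: one of your two doors is opened blindly, and runs
                      where the car is exposed are discarded.
    """
    goat_shown = stay_win = Fraction(0)
    for car in range(3):
        if host_knows:
            options = [d for d in (0, 1) if d != car]
        else:
            options = [0, 1]
        for opened in options:
            p = Fraction(1, 3) / len(options)
            if opened == car:
                continue                # car exposed: run discarded
            kept = 1 - opened           # the door you still hold
            goat_shown += p
            if kept == car:
                stay_win += p
    return stay_win / goat_shown

print(stay_wins(host_knows=True))    # 2/3
print(stay_wins(host_knows=False))   # 1/2
```

The two rules give different answers, so the 2/3 versus 1/2 disagreement comes down to which reveal rule is in force.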

-B

30. Thanks B, this is quite a nice little summary and it might take time to work out where the problem lies - I am tempted to say up front that it's because you think that I am *artificially* throwing out cases.

I think I have a way of explaining it, but would appreciate your opinion on it before I publish it. Could you please email me at the email address that you can access by clicking on the dark red "neopolitan" just above this post and I will share a link to a pdf file with the draft article. If you'd prefer to remain entirely anonymous, you can always set up a disposable email address at one of the many free email providers. Thanks.

31. Sure, I sent you my address (to the e-mail listed on your contact info).

-B

3. I think that the probability is a bit different than has been so far stated. The chance that you draw a white ball on your first draw is 2 / 2,000,000. The chance that you then draw a white ball on your second draw is 1 / 1,999,999. Therefore, the chance of drawing two white balls is 2 / 3,999,998,000,000 or 1 / 1,999,999,000,000. But, we know that you indeed drew a white ball. However, we do not know if this was on your first or second draw.

Assuming that the revealed white ball was drawn on your first draw, the probability that the second white ball was drawn will be 1 / 1,999,999.
Assuming that the revealed white ball was drawn on your second draw, the probability of the first ball being drawn white is 2 / 2,000,000 or 1 / 1,000,000.

So, taking the average of these probabilities, the chance that the remaining ball is white is 1 / 1,499,999.5.

-B
