Thursday, 26 November 2015

Mea Culpa - Another Response to Mathematician

When responding to Mathematician in Triangular Circles (a little play on words, I know circles can't actually be triangular), I wrote this:

On parameterisation, I did some thinking about this along the lines of saying that if you have a 1/3 answer, then it seems (to me) that your selection method must simply have missed some of the chords.  In my way of thinking (standard caveat about the possibility of being wrong), if we are asked to select a chord "at random" then it follows that we would be selecting from a set of ALL chords, rather than from a specific subset, unless advised otherwise.  Thought of from this perspective, our first concern is making sure that we have ALL chords available to select from.  The question then is how to express this properly.  I'm probably going to mess this up in some obscure way, but if you can at least try to understand what I am saying (and criticise the best formulation of my argument, rather than the worst), it would be appreciated.

I suggest that an expression for ALL chords in a circle defined by x²+y²=1 (in units of R where R is the radius of the circle) goes something like this:

The infinite set S of all unique sets Si of points that fulfil the following criteria:

S:

-1 > c > 1 (defining the y axis intercept of the chord)

0 > θ > 2π (defining the gradient of the chord)

         Si:

-√((-cosθ)²+(c-sinθ)²) > r > √((cosθ)²+(c+sinθ)²)

(x,y) = (r.cosθ,r.sinθ+c)

Note: the combined effect of these two conditions is (or is intended) to include all and only points between intercepts of the line defined by (x,y) = (r.cosθ,r.sinθ+c) and the circle defined by x² + y² = 1, thus defining a chord.  In other words a unique set Si is intended to define a unique chord.

When corrected in terms of mathematical terminology, etc, is this a parameterisation and, if so, does it establish or define a structure (per u/Vietoris) for which there is a defined probability measure (per u/Vietoris) or probability distribution (per u/overconvergent)?  And, if so, what Bertrand Paradox related answer would be expected from this parameterisation and associated probability measure/distribution?

---

Well, I was certainly right.  I did mess it up.

First, I've doubled the number of chords by using the intervals c:[-1,1] and θ:[0,2π] (hopefully this terminology is clear; it's slightly more convenient than using the -1 > c > 1 and 0 > θ > 2π structure).

I should have used either c:[0,1] and θ:[0,2π] OR c:[-1,1] and θ:[0,π].  Mathematician, in his response, went with the latter, so I'll use that to explain the second, more egregious stuff up.

Note that I said that "our first concern is making sure that we have ALL chords available to select from".  The whole purpose of my sets was to achieve this and they don't.

For any value of θ≠0 (assuming that θ=0 is standard and aligns with the positive x-axis, the notional horizontal axis, and that c is the point at which the resultant chord intersects the y-axis, or notional vertical axis), there are chords that are missed in my schema.  (I did use the word "intercept" before, which is apparently right in some cases, but there may be some subtlety that I am missing - or perhaps my wording was just clumsy.)

We've agreed that the interval [-1,1] is (or can be) uniform, so imagine 11 equally spaced points on the y-axis in that range and say we look at θ=π/4:


This cannot produce the set of ALL chords.  Additionally, there is a "skewing" of chords towards those that are longer, so it should come as no surprise that (as Mathematician intuited) there would be substantially more chords of length greater than √3.R.  For this reason, I don't think the following comment was nearly as silly as Mathematician later thought it was:

Ok, actually I'm not sure that I am computing the correct probability here. Tell me if this is your idea :

First you pick a number between -1 and 1, uniformly on the interval [-1,1]. And then you pick an angle between 0 and pi, uniformly on the interval [0,pi]. The chord corresponding to the couple (r,θ) is the unique chord that has slope θ and that cuts the horizontal axis at r. Is that okay ?

So this defines a probability on the set of chords. And with this, the probability that a random chord is longer than sqrt(3) is given by the following formula :

P= 1/3 + ln(7+4*sqrt(3))/2pi = 0.7525...

I might be wrong here, but it seems reasonable.

To Mathematician, in answer to the embedded question "… Is that okay?"  Yes, I am reasonably happy with that, once I get over my confusion about the use of r (which I normally think of as the length of a vector from (0,0) to some other point).  If forced to pick something similar, I'd have gone with (c,θ) since we already have c defined - this would be the unique chord with slope θ that is offset from the x-axis by c when x=0.  But I get what you mean.
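The 0.7525 figure quoted above can be checked numerically. The following is only a rough sketch, assuming the (c,θ) reading: c is drawn uniformly from [-1,1] as the y-intercept, θ uniformly from [0,π] as the direction of the line, and the circle has radius 1. The function name and the sample count are illustrative choices, not anything taken from the reddit thread. It relies on the fact that a line through (0,c) at angle θ passes at a distance |c.cosθ| from (0,0).

```python
import math
import random


def estimate_long_chord_probability(samples=200_000, seed=1):
    """Monte Carlo estimate of P(chord > sqrt(3)) for the (c, theta) selection."""
    rng = random.Random(seed)
    threshold = math.sqrt(3)  # side of the inscribed equilateral triangle (R = 1)
    hits = 0
    for _ in range(samples):
        c = rng.uniform(-1.0, 1.0)         # y-intercept, uniform on [-1, 1]
        theta = rng.uniform(0.0, math.pi)  # direction of the line, uniform on [0, pi]
        d = abs(c * math.cos(theta))       # perpendicular distance from (0, 0) to the line
        length = 2.0 * math.sqrt(max(0.0, 1.0 - d * d))
        if length > threshold:
            hits += 1
    return hits / samples


# The closed form quoted in Mathematician's comment.
closed_form = 1 / 3 + math.log(7 + 4 * math.sqrt(3)) / (2 * math.pi)
estimate = estimate_long_chord_probability()
print(f"closed form: {closed_form:.4f}  estimate: {estimate:.4f}")
```

Mathematician's version cuts the horizontal axis rather than the vertical one, but with θ uniform on [0,π] the two choices should give the same probability by symmetry.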

I'll try to define a set of ALL chords again (this requires more than just a minor shuffle, I suspect).

An expression for ALL chords on a disc defined by x²+y²=1 (in units of R where R is the radius of the disc) goes something like this:

The infinite set S of all unique sets Si of all unique sets Sj of points that fulfil the following criteria:

S:

0 > θ > π (defining the gradient of the chord)

locus defined as (0,0)

Si:

-1/cosθ > c > 1/cosθ (defining the y-intercept of the chord)

Sj:

-√((-cosθ)²+(c-sinθ)²) > r > √((cosθ)²+(c+sinθ)²)

(x,y) = (r.cosθ,r.sinθ+c)

Note: the combined effect of these two conditions is (or is intended) to include all and only points on the intersection of the lines (x,y) = (r.cosθ,r.sinθ+c) and the disc defined by x²+y²=1, thus defining a chord.

Defining the locus as (0,0) removes some complications to the equations that would otherwise be required to achieve invariance in terms of translation (by which I mean movement of the circle to another location).  Setting the radius of the disc to R and making R the units of length in all considerations addresses the question of invariance in terms of scale.  Defining the set of y-intercepts such that c:[-1/cosθ,1/cosθ] goes only part of the way to addressing invariance in terms of rotation.

If we revisit the image above but extend out the range of c, we will get:


The gap has gone, but we've now got more chords at θ=π/2 than we had at θ=0, so we no longer have rotational invariance.  To get it back, we need to introduce a concept that probably has another proper term to it, but I call "granularity".

Say we select an arbitrarily large number (N+1) of evenly spaced samples over the interval from which we take c.  If c:[-1,1] because θ=0, then there would be N/2 samples above the locus and N/2 below the locus and one on the locus.  If we generalise this, for c:[-ci,ci], then there would be still N/2 samples above the locus and N/2 below the locus, but with a different separation - rather than the samples being 2/N apart, they would be 2/N/ci apart.  I refer to this figure, 2/N/ci, as the "granularity".

In order to maintain invariance in terms of rotation, we need to set the granularity of the sets to 2.cosθ/N with N->∞.  If there is a better way to word this, please let me know.
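One way to see what this is meant to achieve is sketched below. It assumes a particular reading of "granularity": at every gradient θ the same number (N+1) of evenly spaced intercepts is taken across c:[-1/cosθ,1/cosθ]. That reading, the function name and the choice of N=1000 are assumptions made for illustration. The distance of each chord from the locus is |c.cosθ|, and a chord is longer than √3 exactly when that distance is below 1/2.

```python
import math


def fraction_longer_than_side(theta, n=1000):
    """Fraction of chords longer than sqrt(3) at one gradient, using n+1 intercepts."""
    cos_t = math.cos(theta)
    c_max = 1.0 / abs(cos_t)  # half-width of the intercept interval at this gradient
    long_chords = 0
    for k in range(n + 1):
        c = -c_max + 2.0 * c_max * k / n  # k-th of the n+1 evenly spaced intercepts
        d = abs(c * cos_t)                # perpendicular distance from the locus
        if d < 0.5:                       # chord longer than sqrt(3) when d < 1/2
            long_chords += 1
    return long_chords / (n + 1)


# A few gradients in [0, pi); theta = pi/2 itself is avoided because cos(theta) = 0 there.
fractions = {
    theta: fraction_longer_than_side(theta)
    for theta in (0.0, math.pi / 6, math.pi / 4, math.pi / 3, 1.5)
}
for theta, fraction in fractions.items():
    print(f"theta = {theta:.4f}: fraction longer than sqrt(3) = {fraction:.4f}")
```

Under that assumption every gradient contributes the same number of chords and roughly half of them are longer than √3, whatever the value of θ.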


If there is an iron-clad rule that says that I cannot parameterise my chord selection with anything akin to this concept of granularity, then I guess I have to graciously concede defeat, albeit with the residue of the itchy feeling that maths shouldn't be like this.  But if it is possible, without necessarily being conventional, then I think my selection of chords makes sense, is invariant in terms of scale, translation and rotation and results in the 1/2 answer.  And while it does not seem quite as elegant as my first (incorrect) version, it is more general and I don’t know that an attempt to do something similar with the 1/3 and 1/4 methods can be done as elegantly.  Perhaps it can be done, perhaps there are even more elegant ways to do it, I'm in absolutely no way certain of this.

Wednesday, 25 November 2015

Triangular Circles - A Response to Mathematician

Mathematician wrote two long, much appreciated comments to Uniformly and/or Randomly Driving Towards One Half.  I'll reproduce them here, with only very slight formatting changes (I am not in favour of gaps between the final word in a question and the question mark).  My response follows below.

Mathematician Comment 1:

I'm glad to see that you are still trying to solve mathematical problems, but I am a little bit disappointed to see that you are still not using words as they should be used and that you still fall into common misconception about probabilities.

Your whole point seems to be that the answer 1/2 is THE answer to Bertrand's paradox, and that other methods of choosing a chord are skewed. I'm not sure that I could explain why this is not a valid point directly, so I'm really going to use your own words to make you see where your errors are.

First, the wording of the problem. Apparently, you seem to think that the sentence:

> If you pick, at random, a line which passes through the circle

is equivalent to the initial formulation:

> If you pick, at random, a chord of the circle

You even say: "I deliberately used different and simpler language, but I don't think that my wording introduces or omits anything of consequence." and this is a major flaw in your reasoning. It has many consequences. I would go even further, the Bertrand "paradox" (which is no paradox at all, but that's another question) is ultimately an example to understand that some sets do not come with a natural parametrization. If you change the wording of the problem by identifying the given set with another new set, then it is entirely possible that the new set has a natural parametrization.

What you are doing with your rephrasing of the problem is that you inadvertently choose a particular parametrization of the set of chords. You are actually identifying the set of chords with the set of lines which passes through the circle. Now, this parametrization is useful and natural, but it is in no way unique. There are many other way to identify the set of chords to some other set. For example, I can identify the set of chords and the set of pairs of point on the circle. I can also identify the set of chords and the set of points inside the circle. Both identifications are perfectly natural and very useful.

Imagine that I ask you the problem in the following way :

> If you pick, at random, two points on the circle, what is the probability that segment between these two points will be longer than the sides of the equilateral triangle?

I deliberately used different and simpler language, but I don't think that my wording introduces or omits anything of consequence. (Does this sentence remind you of something?) Is this wording of the problem less or more natural than your own wording? I argue that both are natural, but both are not equivalent to the original wording ...

It might seem obvious to YOU that your wording is more natural. It might seem more obvious to ME that my wording is more natural. But in the end, there is absolutely no reason to think that one is better than the other.

It seems that I need to cut my comment in two half because it is too long ... so I will get back after a short break

Mathematician Comment 2:

Now after this wording of the problem, you argue that the answer should be 1/2, referring to Jaynes treatment. I agree with you, but I don't think that you understand the words you are using:

> Jaynes appears to be suggesting one, appealing to rotational and scale invariance.

Let me ask you a question. I give you a precise circle. The set of chord of this precise circle is a well-defined set. Why does it mean for a measure on this set to be "scale invariant"?

The "scale and rotational invariance" is meant to be applied to a measure on the set of all lines in the plane. This is a completely different set from the set of chord on a specific circle. Now, what Jaynes meant is that if we choose a parametrization of chords as lines that passes through the circle, then we should put a measure on the set of chords that comes from a measure on the set of lines. Moreover, the measure chosen on the set of lines should not depend on the placement of the circle (translationally/rotationally invariant) and on the size of the circle (scale invariant). And fortunately, there is a unique measure on the set of all lines in the plane that satisfies these properties.

But I can choose a different parametrization of the set of chords, for example by the two endpoints on the circle. Then there is absolutely no reason to think that the measure on the set of chords should come from a measure on the set of lines. And the expression "scale invariant" is meaningless here, because the circle is fixed, and the circle is NOT scale invariant. The only thing that would make sense is "rotationally invariant". It turns out that there is a natural measure on the set of pairs of points on the circle, and it gives you a DIFFERENT measure on the set of chord.

> However, my argument here is that the set of chords selected by the 1/3 method is skewed.

Skewness is a relative notion. In this problem, there is no point of reference which would be the "unskewed" result.

The fact that you can get back the 1/2 answer by pulling one of the endpoint of the chord towards infinity is rather neat. But it is in no way an indication that 1/2 is a better result ...

> This is precisely the problem (in my humble opinion) with the 1/4 method, because Cartesian co-ordinates are used within a circle.

What??? Again your misunderstanding of basic mathematics is surfacing. There is absolutely no reason to use cartesian coordinate to define a uniform probability distribution on the disc. And moreover, you can use cartesian coordinate without "skewing" the result.

But more importantly your method of "randomly choosing a point in the circle":

> we would select, at random, an angle 0 > θ > 2π from the x axis and a distance from the locus of the circle, r, where 0 > r > R

does not produce the result that you probably think it produces. The probability measure that is induced by this process is not the uniform probability measure on the disc ...

If you were familiar with polar coordinates, you would know that the usual area is given by "r dr dθ", but what you did was taking the measure given by "dr dθ" (I'm simplifying the argument because it seems irrelevant to make a full course on measure theory here). Of course, your mistake is "lucky" because it produced the result that you wanted : 1/2. But the mistake is real, and hence the result is relatively meaningless.

In conclusion, your main mistake is to think that there is only one natural parametrization of the set of chords. But even if there were a "best" parametrization, it is irrelevant in the context. You should understand that the Bertrand paradox is not about chords at all ...

Perhaps I should give you another example of a similar problem:

* Pick, at random, a right triangle inscribed inside a circle of diameter 1. What is the probability that one angle of the triangle is less than pi/6?

How would you answer the question?

neopolitan's response:

First off, I have to repeat that I am not a professional mathematician and have no intention of spending another six years or more at university to become a Doctor of Mathematics.  The last time I checked, the vast majority of the world's population, maybe even the population of the universe, are not maths docs.  Therefore, I think it is not unreasonable that I don't always use the correct terminology agreed to within the mathematical cabal.

It is possible that you are intentionally sending a message along the line of "get a maths doctorate or shut the fuck up" but I don't think you are.  Unintentionally, however, this is precisely the message you seem to be sending with some of your comments (some have been far worse, perhaps with intent).  I hope you take this in the spirit in which it is intended - I am curious, other people are curious, and we curious people don't need to be frapped down by supercilious experts for the most minor of infractions.  If at all possible, it’s better to get to the meat of our misunderstandings.  And I was not being sarcastic when I wrote that your comments are much appreciated.

You make a comment about lines and chords, as if I don't know the difference.  Please see A Farewell to the Bertrand Paradox in which I think I make it pretty clear that I understand the difference (I repeatedly wrote "You now have two points on the circle, between which is a chord") - although in retrospect it seems like I don't understand the word "farewell".  My implication, although not clearly expressed, is that a line passing through a circle defines a chord and a chord (of length greater than zero) uniquely defines a line.  If this is fundamentally wrong (as opposed to just oddly phrased), please advise.

You then get into parameterisation (or perhaps parametrization, if there is a meaningful difference beyond our spelling preferences).  To the extent that I understand you, I think I agree.  The method you use to define your chords, to select your chords, makes a difference.  What we differ on is whether there is a "right" and "wrong" way to select chords to satisfy the Bertrand Paradox, as stated.  My position, which I accept may be fundamentally wrong, is that if your method doesn’t arrive at a uniform distribution of chords (by some reasonable measure), then your method isn't "at random".

The question that arises immediately is by what "reasonable measure" can I claim that the 1/2 method produces a uniform distribution of chords and the other two methods don't.  Below I may go some way to explain what I had in my head, but first … invariance.

I suspect that we would agree that the distribution of chords in a circle is invariant in terms of rotation, translation and scale.  To clarify, imagine a circle with a locus at (0,0), defined by radius of 1 and an orientation such that θ=0 aligns with the positive y-axis (let's set Point A to (0,1)=(1,0), if you get my little joke [and yes, I know it’s more conventional to take θ from the x-axis, but it's just a convention and my convention has the vertex of the triangle, and thus Point A, at the top]).

The distribution of chords in that circle (given the same parameterisation) is not affected if: we move the circle (to a locus at (random(x),random(y))); rotate the circle (to an orientation at random(θ) from the y-axis); or increase the size of the circle (to a radius of random(R)).  I think we would agree on that, but I may be missing something.

So long as I am not missing something crucial, therefore, a circle of the type I suggested (locus at (0,0) and (r=1,θ=0) coinciding with (x=0,y=1)) can stand in for circles of any size and location and Point A at (0,1)=(1,0) can represent all possible positions on the circumference, because using any other position on the circumference is equivalent to this circle being rotated.  If I am wrong about this, then the following probably falls apart.

Borrowing from myself (at reddit), my "reasonable measure" of a uniform distribution is such that if you drew a representative sample of the chords with an arbitrarily small width (I am aware that chords don't have widths), then the resultant density would be smooth throughout the circle.

Wikipedia has something close to what I am talking about, just before they get into the "classical solution". However, I limit my visualisation of it to one orientation of the circle, so
  • 1) put Point 1 at the apex of a [notional] equilateral triangle, draw a set of chords with an arbitrarily small angular separation (say 360 of them 1 degree apart)
  • 2) select a radius and extend it out to a diameter, draw the set of chords perpendicular to the diameter with an arbitrarily small separation, say 360 of them R/180 apart
  • 3) take an arbitrarily large number of points equally separated within the circle, say 360 of them, draw the set of chords for which the points are their midpoint


I am pretty sure that only 2) will smoothly fill the circle (for example if all chords are drawn with a width of R/360). I am also pretty sure that there will be some arcane argument as to why this either doesn't matter, is not (sufficiently) stringent or is completely wrong-headed - but as I said, it is what I had in mind.
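For readers who want to see the three methods side by side, here is a small simulation sketch. It swaps the evenly spaced samples described above for random ones, which is an assumption of convenience, and it works on a circle of radius 1. The function names are illustrative only. It reports the fraction of chords longer than √3 under each method; it does not attempt to measure the "smoothness" of the drawn chords.

```python
import math
import random

SIDE = math.sqrt(3)  # side of the inscribed equilateral triangle, in units of R


def method_1(rng):
    """Chord from a fixed point on the rim to a second point at a uniform angle."""
    alpha = rng.uniform(0.0, 2.0 * math.pi)  # angle between the two points, seen from the centre
    return 2.0 * math.sin(alpha / 2.0)


def method_2(rng):
    """Chord perpendicular to a diameter, at a uniform distance from the centre."""
    d = rng.uniform(0.0, 1.0)
    return 2.0 * math.sqrt(1.0 - d * d)


def method_3(rng):
    """Chord whose midpoint is spread evenly over the area of the disc."""
    while True:  # rejection sampling from the enclosing square
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        r_squared = x * x + y * y
        if r_squared <= 1.0:
            return 2.0 * math.sqrt(1.0 - r_squared)


def fraction_long(method, samples=200_000, seed=1):
    """Fraction of sampled chords that are longer than the triangle's side."""
    rng = random.Random(seed)
    return sum(method(rng) > SIDE for _ in range(samples)) / samples


results = {m.__name__: fraction_long(m) for m in (method_1, method_2, method_3)}
for name, fraction in results.items():
    print(f"{name}: {fraction:.3f}")
```

The three fractions should land near 1/3, 1/2 and 1/4 respectively, which is where the names of the methods come from.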

I don't want to spend a ridiculous amount of time on this, so for the purposes of showing this, I will use 16 rather than 360 chords and I am not going to fuss about making the images pretty:

1)


2)


3)


To my mind, the distribution created by method 2 is uniform and smooth in a way that the others simply aren't.  Someone did complain that my argument here is apparently based on aesthetics - the method 2 result looks nicer.  That's not really my point, my point is that the density of chords is smooth, no matter where you look in the circle.  In the other two, either you have a clumping of chords (in method 1 there are more chords surrounding a point directly below the top of the circle [it's actually worse than I've represented]) or (in method 3) chords crossing.  I do understand that as you keep rotating the circle to obtain a new set of chords in method 1, as your number of unique sets approaches infinity, the gaps will disappear, but the clumping will remain at the rim of the circle (some seem to want to call it a disc, hopefully I don't confuse anyone by calling it a circle).  Similarly, I understand that the gaps will disappear if we consider more and more chord midpoints in method 3, but this introduces a similar clumping effect at the rim of the circle - to a greater extent, by which I mean the density of chords at the rim will be even greater than produced by method 1.

You gave the following challenge: "Pick, at random, a right triangle inscribed inside a circle of diameter 1. What is the probability that one angle of the triangle is less than pi/6?"

I understand that this is the same question, because 2.cos(π/6) is √3, so to be consistent, my answer would have to be 1/2.  This is, however, on the understanding that I am picking from an existent set of all right triangles, not a set of triangles created by bisecting the circle and then picking a point at random on the circumference then drawing chords between that point and the two ends of the diameter previously established.

I think you've helped me here though.  If we think about the absolute maximum proportion of unique right triangles with one angle less than π/6 that we could draw in the circle, using any method, the answer comes to 1/2.  We could, for example, draw the set of right triangles using the 1/2 method for chord selection, then draw a diameter from one end of the chord, then complete the triangle.  This, I think you would agree, results in 1/2.  Knowing this, it seems odd to me that you can walk back from this figure to 1/3.

I note that, when unconstrained by a circle, you can draw your triangle by first drawing the hypotenuse (say at length 2R), then drawing a line from one end at an angle to the hypotenuse chosen at random such that 0 < θ < π/2 and then completing the triangle by drawing the final side as required.  The criterion will be satisfied in the ranges 0 < θ < π/6 and π/3 < θ < π/2, so 2/3 of the time.
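The 2/3 figure is easy to check with a few lines of code. This is a sketch that assumes θ is drawn uniformly from the range 0 to π/2; the function name and sample count are illustrative. The other acute angle of the right triangle is π/2 - θ, so the criterion is met whenever the smaller of the two is below π/6.

```python
import math
import random


def fraction_with_small_angle(samples=200_000, seed=1):
    """Fraction of right triangles with an acute angle below pi/6, theta uniform on (0, pi/2)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        theta = rng.uniform(0.0, math.pi / 2)  # angle between the hypotenuse and the first side
        other = math.pi / 2 - theta            # the remaining acute angle
        if min(theta, other) < math.pi / 6:
            hits += 1
    return hits / samples


fraction = fraction_with_small_angle()
print(f"fraction with an angle below pi/6: {fraction:.3f}")
```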

Also, as you do this, the right angle vertex describes a circle - meaning that there is a problem with the 1/3 answer.  I've drawn that below as well, hopefully it's sufficiently clear (note that I have tried to show pi/3 around the created circle, I am not trying to imply that pi/6 (the angle on the left in the triangle at this point) is pi/3).


Thinking about this did lead me to stumble over another way to create a set of chords: draw a diameter (just a line segment of length 2R will do), then repeatedly draw circles around one end of the diameter (or line segment) at arbitrarily small increments until you reach a circle of radius 2R.  With each circle, draw the tangent that intersects with the other end of the diameter (or line segment).  The points at which these tangents touch each of this series of circles describe a semicircle.

The result obtained by this method is 1/2, despite not appearing to be a reformulation of the classic 1/2 method.  Perhaps it is but I’ve simply not worked out how yet, but in a rough modelling (2000 data points) the distribution does not appear to be the same when viewed in histogram form.
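The construction can be sketched in code under one reading of it, which is an assumption: the chord in question is the tangent segment running from the far end of the diameter to the point where it touches each circle. Call the ends of the diameter A and B, the radius of the circle drawn around A ρ, and the point of tangency T (these labels are introduced here for illustration). The tangent is perpendicular to the radius AT, so the chord BT has length √(4R²-ρ²).

```python
import math


def tangent_chord_lengths(steps=2000, radius=1.0):
    """Lengths of the tangent segments BT for circles of growing radius rho around A."""
    lengths = []
    for k in range(1, steps + 1):
        rho = 2.0 * radius * k / steps  # radius of the k-th circle around A, up to 2R
        # Right angle at T: |AB|^2 = |AT|^2 + |BT|^2 with |AB| = 2R and |AT| = rho.
        lengths.append(math.sqrt(max(0.0, 4.0 * radius * radius - rho * rho)))
    return lengths


lengths = tangent_chord_lengths()
fraction = sum(length > math.sqrt(3) for length in lengths) / len(lengths)
print(f"fraction of chords longer than sqrt(3).R: {fraction:.4f}")
```

If that reading is right, the chord's distance from the centre of the original circle works out to ρ/2, which may be worth comparing against the classic 1/2 method before concluding that the two distributions differ.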

---

On parameterisation, I did some thinking about this along the lines of saying that if you have a 1/3 answer, then it seems (to me) that your selection method must simply have missed some of the chords.  In my way of thinking (standard caveat about the possibility of being wrong), if we are asked to select a chord "at random" then it follows that we would be selecting from a set of ALL chords, rather than from a specific subset, unless advised otherwise.  Thought of from this perspective, our first concern is making sure that we have ALL chords available to select from.  The question then is how to express this properly.  I'm probably going to mess this up in some obscure way, but if you can at least try to understand what I am saying (and criticise the best formulation of my argument, rather than the worst), it would be appreciated.

I suggest that an expression for ALL chords in a circle defined by x²+y²=1 (in units of R where R is the radius of the circle) goes something like this:

The infinite set S of all unique sets Si of points that fulfil the following criteria:

S:

-1 > c > 1 (defining the y axis intercept of the chord)

0 > θ > 2π (defining the gradient of the chord)

         Si:

-√((-cosθ)²+(c-sinθ)²) > r > √((cosθ)²+(c+sinθ)²)

(x,y) = (r.cosθ,r.sinθ+c)

Note: the combined effect of these two conditions is (or is intended) to include all and only points between intercepts of the line defined by (x,y) = (r.cosθ,r.sinθ+c) and the circle defined by x² + y² = 1, thus defining a chord.  In other words a unique set Si is intended to define a unique chord.

When corrected in terms of mathematical terminology, etc, is this a parameterisation and, if so, does it establish or define a structure (per u/Vietoris) for which there is a defined probability measure (per u/Vietoris) or probability distribution (per u/overconvergent)?  And, if so, what Bertrand Paradox related answer would be expected from this parameterisation and associated probability measure/distribution?

---


With luck, I have already addressed your other points either here, or in comments at reddit.  If I have missed something key, please let me know.

Monday, 23 November 2015

Uniformly and/or Randomly Driving Towards One Half

Despite having thought that I'd said all that I wanted to say about the Bertrand Paradox, and having provided what I thought was a definitive case for 1/2 in A Farewell to the Bertrand Paradox, a recent bout of curiosity dragged me back in.

(Someone has asked me what I wanted to prove in Three New Wrong Answers for Bertrand.  I didn't intend to prove anything at all, I was just indulging that curiosity.)

I posted a link to Three New Wrong Answers for Bertrand and asked a question (at r/math which then got picked up at r/badmathematics) and then the great piling-on commenced once more.  (Note that I do recognise that a couple of people did say that the question was not overly bad and I do recognise that my errors and stubborn idiocy associated with goats totally warranted the great piling-on that happened about three months ago.)

One of the issues that has been raised, a few times, is that of "random" versus "uniform".  Another is "natural" (which in mathematical terms seems to be interchangeable, at least in part, with "canonical").

I find this curious, since the phrasing of the Bertrand Paradox seems to never include the terms "uniform" or "natural".  But I realise that assumption of "uniformity" and "naturalness" may have some bearing.

My initial posing of the question was:

Say you have a circle in which there is an equilateral triangle, like this.

If you pick, at random, a line which passes through the circle, what is the probability that the section of your line that lies within the circle will be longer than the sides of the equilateral triangle?

The Wikipedia wording is:

The Bertrand paradox goes as follows: Consider an equilateral triangle inscribed in a circle. Suppose a chord of the circle is chosen at random. What is the probability that the chord is longer than a side of the triangle?

I deliberately used different and simpler language (in part to include non-mathematicians and in part to make it a little more difficult to google an answer within seconds), but I don't think that my wording introduces or omits anything of consequence.

If I am in error here, then please feel free to enlighten me.

One phrase which I deliberately didn't change was "at random" (although I did use the verb "to pick" while Wikipedia went with "to choose", and in the following I'll even spice things up occasionally by using "to select").  A question here, that many have asked, and I need to answer, is "what do I mean by 'at random'".

I could wave vaguely at Wikipedia and their claim that Bertrand phrased the problem in terms of "at random" and say that I mean what they mean.  I could note that Jaynes also referred to "at random" as supposedly used by Bertrand and goes on to state that Bertrand himself didn't suggest that any of the three answers were "correct", because "the problem has no definite solution because it is ill posed, the phrase 'at random' being undefined".

So if I did wave vaguely at Wikipedia and Bertrand, then I could just be saying that "at random" doesn't actually mean anything specifically.  But I would have thought that in the 125 years or so since Bertrand wrote Calcul des probabilités, we might have come up with an appropriate definition.  Jaynes appears to be suggesting one, appealing to rotational and scale invariance.  His treatment is a little frightening to someone reading it without a mathematics degree in their pocket, but I think it aligns with how I think of it.  Again, feel free to let me know if I have it wrong.

When I think of "at random", I do apparently think of "uniform".  For example, say that I had 12 balls of two colours:


If I were to repeatedly pick a ball "at random", I would expect to get a green ball about half the time.  If I were to pick one ball "at random", and one ball only, I would put the probability that the ball would be green at 1/2.  If I repeatedly picked a ball "at random" and got any other answer, say a third of the time I get a green ball, I would suspect that my process for selecting "at random" was flawed - being skewed against the selection of green balls.

I've put some extra detail on the green balls to try to explain what I think might be a problem in a selection process.  Say I get my colour-blind friend (Ginger) to select balls at random and write down how many are green, having advised her (accurately enough) that the blue balls are unmarked.  She'll perhaps be a little confused at first, but will quickly catch on that some balls have G on them and will arrive at the incorrect conclusion that 1/3 of the balls are green.  In this case, I think we'd agree that something about the process skewed the result against selection of green balls.  Or more accurately, against identification of green balls - sometimes Ginger had a green ball in her hand, but she rejected it, not because it wasn't green, but because it didn't have G on it (and being colour-blind, she was relying on this in her identification process).

Then say I get my totally blind friend (Magenta) to help out, advising her (again accurately) that the blue balls have no braille on them.  She'll come to the conclusion that 1/4 of the balls are green - and again this is because she rejected balls not because they weren't green, but because they didn't have G on them in braille.

I fully accept that these are bad processes, but that's partly my point.  The processes result in tossing out of positive hits (green balls) and arrive at a skewed result.  In A Farewell to the Bertrand Paradox I argue that positive hits are tossed out in the 1/3 and 1/4 methods.  (Note that from here on in, I'll be referring to the more traditional methods of selecting chords at random as "the 1/3 method", "the 1/2 method" and "the 1/4 method".  I'm also going to be assuming some familiarity with these methods.  Go back to The Circle, Triangle and Random Line with an Answer if you need to gain that familiarity.)

I further argue that the processes for the 1/3 method and the 1/4 method can be "corrected".

The 1/3 method involves selecting two points on the circumference and drawing a chord between them.  The probability can be calculated by imagining the equilateral triangle rotated until one of its vertices aligns with one of the points.  Then consider the likelihood that the other point lies in the region that leads to a chord that is longer than √3R (where R is the radius of the circle and thus √3R is the length of the sides of the equilateral triangle).



To give you a chord that is longer than √3R, the second point has to lie below the triangle, along a third of the circumference of the circle.  So, it appears that the answer is 1/3.
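For anyone who would rather check this numerically, here is a minimal Monte Carlo sketch of the 1/3 method. It assumes both points are chosen independently and uniformly by angle around the circumference; the function names, the seed and the number of trials are arbitrary choices of mine.

```python
import math
import random

def chord_length_two_points(rng, radius=1.0):
    """Chord between two points chosen uniformly on the circumference."""
    a = rng.uniform(0.0, 2.0 * math.pi)
    b = rng.uniform(0.0, 2.0 * math.pi)
    # A chord subtending a central angle of (a - b) has this length.
    return 2.0 * radius * abs(math.sin((a - b) / 2.0))

def fraction_longer_than_side(sampler, trials=200_000, seed=1, radius=1.0):
    """Share of sampled chords longer than the triangle side, sqrt(3) * R."""
    rng = random.Random(seed)
    side = math.sqrt(3.0) * radius
    hits = sum(1 for _ in range(trials) if sampler(rng, radius) > side)
    return hits / trials

estimate_third = fraction_longer_than_side(chord_length_two_points)
print(round(estimate_third, 3))
```

With a couple of hundred thousand trials the estimate should sit close to 1/3, in line with the argument above.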

However, my argument here is that the set of chords selected by the 1/3 method is skewed.  To show how this skewing can be removed, imagine a slightly different process: draw a line through the centre of both the circle and the triangle.  Then consider a point on that line that is our first point (Point 1).  We could put it on the circumference of the circle, which certainly seems reasonable, but we could put it somewhere else that might be even more reasonable - at infinity.

Consider the chords that can be drawn through the circle and be continued on to pass through Point 1 at infinity.  The probability of one of these chords, selected at random, being longer than √3R is 1/2.  The explanatory figure below is on its side for convenience.



The problem with using the point on the circumference is that it skews the set of chords towards those which lie close to the first point.  Here is a graphical representation of this effect:



In a histogram the effect looks like this:



When you use my method (with y=10R standing in for y=∞), the results look like this:



And, in a histogram, like this:


A close inspection indicates that there are significantly more very short chords with the 1/3 method than there are with the modification of that method that arrives at 1/2.
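The effect of moving Point 1 off the circumference can also be sketched numerically. The sketch below rests on an assumption about how "at random" is applied: the direction of the line through Point 1 is uniform over the range of angles that actually cut the circle. This may not match exactly how the figures above were generated. Under that assumption, the fraction of chords longer than √3R has a closed form, arcsin(R/2d) / arcsin(R/d) for Point 1 at distance d from the centre, which equals 1/3 at d = R and tends to 1/2 as d grows.

```python
import math
import random

def exact_fraction(distance, radius=1.0):
    """Closed form under the uniform-angle assumption, for distance >= radius."""
    return math.asin(radius / (2.0 * distance)) / math.asin(radius / distance)

def simulated_fraction(distance, trials=200_000, seed=1, radius=1.0):
    """Monte Carlo version of the same assumption."""
    rng = random.Random(seed)
    # Largest angle from the centre line at which a line still meets the circle.
    max_angle = math.asin(radius / distance)
    side = math.sqrt(3.0) * radius
    hits = 0
    for _ in range(trials):
        angle = rng.uniform(-max_angle, max_angle)
        # Perpendicular distance from the centre to the line through Point 1.
        offset = distance * abs(math.sin(angle))
        # The clamp guards against a tiny negative value from rounding.
        chord = 2.0 * math.sqrt(max(radius**2 - offset**2, 0.0))
        if chord > side:
            hits += 1
    return hits / trials

# Distances are in units of R; 1 is the ordinary 1/3 method.
for d in (1, 2, 5, 10, 1_000_000):
    print(d, round(exact_fraction(d), 4), round(simulated_fraction(d), 4))
```

The two columns should agree to within sampling noise, and both should climb from 1/3 towards 1/2 as Point 1 recedes.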

Of course I am aware that 10R is in no sense close to infinity, but my point here is that the answer rapidly approaches 1/2 as we move Point 1 away from the circumference of the circle - it's already about 0.49 by the time Point 1 is at 5R.  We can certainly use a significantly more distant Point 1, for example, we get results like this for Point 1 at 1000000R:




At this point, it's probably best to take a quick look at the graphical representations of the results for the 1/2 method to see how they compare:



They seem to be identical.  I have to hold my hand up here though and admit that I have done something to highlight the similarity.  The standard 1/2 method is to take a radius at random (which is representative of all radii) and then select a point on this radius at random.  This point is then used as the midpoint of a chord.  My graphical representations above are done the same way but with a diameter (crossing the entirety of the circle, rather than just half of it).  Here they are with a radius (at the same granularity):


It's only one half of the shape, but this is not catastrophic because each half is a mirror of the other and each radius can be continued into a diameter.  And in any event, this shape isn't key.  What is key is the histogram, which looks like this for the radius (at the same granularity):


If I modify the granularity on the histogram by taking twice as many samples in the radius, we get:


… which is back to being identical to the "corrected" 1/3 method histogram.
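The radius-versus-diameter comparison can be sketched as follows. This is an illustration under my own assumptions rather than the code behind the figures: it draws midpoints at random instead of at evenly spaced steps, it uses ten equal-width bins, and it takes twice as many samples across the diameter as along the radius so that the sampling density is the same.

```python
import math
import random

def chord_from_midpoint_distance(r, radius=1.0):
    """Length of the chord whose midpoint lies a distance |r| from the centre."""
    return 2.0 * math.sqrt(radius**2 - r**2)

def sample_lengths(trials, use_diameter, seed=1, radius=1.0):
    """Chord lengths with the midpoint uniform on a radius or on a diameter."""
    rng = random.Random(seed)
    low = -radius if use_diameter else 0.0
    return [chord_from_midpoint_distance(rng.uniform(low, radius), radius)
            for _ in range(trials)]

def histogram(lengths, bins=10, radius=1.0):
    """Proportion of chords in each equal-width length bin over [0, 2R]."""
    counts = [0] * bins
    for length in lengths:
        # Chords of exactly 2R fall into the last bin.
        index = min(int(length / (2.0 * radius) * bins), bins - 1)
        counts[index] += 1
    return [count / len(lengths) for count in counts]

radius_hist = histogram(sample_lengths(100_000, use_diameter=False))
diameter_hist = histogram(sample_lengths(200_000, use_diameter=True, seed=2))
for a, b in zip(radius_hist, diameter_hist):
    print(f"{a:.3f} {b:.3f}")
```

Because a chord's length depends only on how far its midpoint is from the centre, the two columns should match bin for bin, up to sampling noise.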

The 1/4 method involves picking a point at random in the circle which is then used as the midpoint of a chord.  All chords of a length less than 2R are uniquely defined by their midpoint and all chords of length 2R share the same midpoint (at the centre of the circle).

One of my interlocutors at r/math, u/DR6, made a comment about there being problems when "using squares for a problem that is about circles".

This is precisely the problem (in my humble opinion) with the 1/4 method, because Cartesian co-ordinates are used within a circle.  This is equivalent to chopping up the circle into arbitrarily small squares and using the centre of each of the squares as the set from which random points may be selected.

Given that we are talking about a circle, it is far more reasonable to use polar co-ordinates.  This would mean that, rather than selecting, at random, values of x and y such that x² + y² ≤ R², we would select, at random, an angle θ from the x axis, where 0 ≤ θ < 2π, and a distance r from the centre of the circle, where 0 ≤ r ≤ R.

Such a scheme automatically makes this method identical to the 1/2 method, remembering the 1/2 method involves picking a point at random on a representative radius.  The representative radius represents all possible values of θ.
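The difference between the two ways of picking a midpoint can be sketched as below. The Cartesian version uses rejection sampling inside the bounding square, which is my assumption about how such a selection would usually be coded; the polar version draws θ and r independently and uniformly, as described above. A chord is longer than √3R exactly when its midpoint is less than R/2 from the centre, so only the distance matters for the count.

```python
import math
import random

def midpoint_distance_cartesian(rng, radius=1.0):
    """Pick (x, y) uniformly in the disc by rejection; return its distance from the centre."""
    while True:
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            return math.hypot(x, y)

def midpoint_distance_polar(rng, radius=1.0):
    """Pick theta and r independently and uniformly; return r."""
    _theta = rng.uniform(0.0, 2.0 * math.pi)  # fixes the orientation, not the length
    return rng.uniform(0.0, radius)

def fraction_long(sampler, trials=200_000, seed=1, radius=1.0):
    """Share of chords longer than sqrt(3) * R, i.e. with midpoint within R / 2."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if sampler(rng, radius) < radius / 2.0)
    return hits / trials

cartesian_share = fraction_long(midpoint_distance_cartesian)
polar_share = fraction_long(midpoint_distance_polar)
print(round(cartesian_share, 3), round(polar_share, 3))
```

The Cartesian figure should land near 1/4 and the polar figure near 1/2, which is the sense in which the polar scheme reproduces the 1/2 method.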

The benefit of the polar coordinate scheme is inherent in the notion that the midpoint thus determined also brings with it an orientation - perpendicular to the radius (so perpendicular to θ) - including the midpoint with r=0.  This does not happen when selecting points at random using Cartesian coordinates - if (x,y)=(0,0) we have no idea what orientation the related chord should have, thus unlike all other midpoints, the centre of the circle defines an infinite number of possible chords of length 2R.  For this very reason, we have cause to think that this method is not "natural".

A similar problem exists with the 1/3 method, when the second point is collocated with Point 1.  There is an infinite number of possible chords of length 0.  Therefore, we have cause to think that this method is not "natural".

So far as I can tell, the 1/2 method and the "corrected" variants of the 1/3 and 1/4 methods do not suffer from this problem, and therefore could be put forward as possibly "natural" methods.  I'm not saying definitively that they are - but I think it is quite reasonable to say that the others are not.

---


In Three New Wrong Answers for Bertrand I arrived at 1/2 for a method involving selecting two points (using Cartesian coordinates) inside a boxed circle.  However, even after a million iterations, it's still not actually quite 1/2 - it's 0.52-ish.  This is close to 1/2, but it's not close enough to count as 1/2 as far as I am concerned - and in any event, the histogram looks wrong.  While I might be being hasty, I suspect that something is wrong with this method even if it arrives at an answer that is tantalisingly near to 1/2.