Mathematician wrote two long, much appreciated comments to Uniformly and/or Randomly Driving Towards
One Half. I'll reproduce them
here, with only very slight formatting changes (I am not in favour of gaps
between the final word in a question and the question mark). My response follows below.
Mathematician Comment 1:
I'm glad to see that you are still trying to solve mathematical
problems, but I am a little bit disappointed to see that you are still not
using words as they should be used and that you still fall into common
misconception about probabilities.
Your whole point seems to be that the answer 1/2 is THE answer to
Bertrand's paradox, and that other methods of choosing a chord are skewed. I'm
not sure that I could explain why this is not a valid point directly, so I'm
really going to use your own words to make you see where your errors are.
First, the wording of the problem. Apparently, you seem to think
that the sentence:
> If you pick, at random, a line which passes through the circle
is equivalent to the initial formulation:
> If you pick, at random, a chord of the circle
You even say: "I deliberately used different and simpler
language, but I don't think that my wording introduces or omits anything of
consequence." and this is a major flaw in your reasoning. It has many
consequences. I would go even further, the Bertrand "paradox" (which
is no paradox at all, but that's another question) is ultimately an example to
understand that some sets do not come with a natural parametrization. If you
change the wording of the problem by identifying the given set with another new
set, then is is entirely possible that the new set has a natural parametrization.
What you are doing with your rephrasing of the problem is that you
inadvertently choose a particular parametrization of the set of chords. You are
actually identifying the set of chords with the set of lines which passes
through the circle. Now, this parametrization is useful and natural, but it is
in no way unique. There are many other way to identify the set of chords to
some other set. For example, I can identify the set of chords and the set of
pairs of point on the circle. I can also identify the set of chords and the set
of points inside the circle. Both identifications are perfectly natural and
very useful.
Imagine that I ask you the problem in the following way :
> If you pick, at random, two points on the circle, what is the
probability that segment between these two points will be longer than the sides
of the equilateral triangle?
I deliberately used different and simpler language, but I don't
think that my wording introduces or omits anything of consequence. (Does this
sentence remind you of something?) Is this wording of the problem less or more
natural than your own wording? I argue that both are natural, but both are not
equivalent to the original wording ...
It might seem obvious to YOU that your wording is more natural. It
might seem more obvious to ME that my wording is more natural. But in the end,
there is absolutely no reason to think that one is better than the other.
It seems that I need to cut my comment in two half because it is
too long ... so I will get back after a short break
Mathematician Comment 2:
Now after this wording of the problem, you argue that the answer
should be 1/2, referring to Jaynes treatment. I agree with you, but I don't
think that you understand the words you are using:
> Jaynes appears to be suggesting one, appealing to rotational
and scale invariance.
Let me ask you a question. I give you a precise circle. The set of
chord of this precise circle is a well-defined set. Why does it mean for a
measure on this set to be "scale invariant"?
The "scale and rotational invariance" is meant to be
applied to a measure on the set of all lines in the plane. This is a completely
different set from the set of chord on a specific circle. Now, what Jaynes
meant is that if we choose a parametrization of chords as lines that passes
through the circle, then we should put a measure on the set of chords that
comes from a measure on the set of lines. Moreover, the measure chosen on the
set of lines should not depend on the placement of the circle
(translationally/rotationally invariant) and on the size of the circle (scale
invariant). And fortunately, there is a unique measure on the set of all lines
in the plane that satisfies these properties.
But I can choose a different parametrization of the set of chords,
for example by the two endpoints on the circle. Then there is absolutely no
reason to think that the measure on the set of chords should come from a
measure on the set of lines. And the expression "scale invariant" is
meaningless here, because the circle is fixed, and the circle is NOT scale
invariant. The only thing that would make sense is "rotationally
invariant". It turns out that there is a natural measure on the set of
pairs of points on the circle, and it gives you a DIFFERENT measure on the set
of chord.
> However, my argument here is that the set of chords selected
by the 1/3 method is skewed.
Skewness is a relative notion. In this problem, there is no point
of reference which would be the "unskewed" result.
The fact that you can get back the 1/2 answer by pulling one of the
endpoint of the chord towards infinity is rather neat. But it is in no way an
indication that 1/2 is a better result ...
> This is precisely the problem (in my humble opinion) with the
1/4 method, because Cartesian co-ordinates are used within a circle.
What??? Again your misunderstanding of basic mathematics is
surfacing. There is absolutely no reason to use cartesian coordinate to define
a uniform probability distribution on the disc. And moreover, you can use
cartesian coordinate without "skewing" the result.
But more importantly your method of "randomly choosing a point
in che circle":
> we would select, at random, an angle 0 > θ > 2π from the
x axis and a distance from the locus of the circle, r, where 0 > r > R
does not produce the result that you probably think it produces.
The probability measure that is induced by this process is not the uniform
probability measure on the disc ...
If you were familiar with polar coordinates, you would know that
the usual area is given by "r dr dθ", but what you did was taking the
measure given by "dr dθ" (I'm simplifying the argument because it seems
irrelevant to make a full course on measure theory here). Of course, your
mistake is "lucky" because it produced the result that you wanted :
1/2. But the mistake is real, and hence the result is relatively meaningless.
In conclusion, your main mistake is to think that there is only one
natural parametrization of the set of chords. But even if there were a
"best" parametrization, it is irrelevant in the context. You should
understand that the Bertrand paradox is not about chords at all ...
Perhaps I should give you another example of a similar problem:
* Pick, at random, a right triangle inscribed inside a circle of
diameter 1. What is the probability that one angle of the triangle is less than
pi/6?
How would you answer the question?
neopolitan's response:
First off, I have to repeat that I am not a professional
mathematician and have no intention of spending another six years or more at
university to become a Doctor of Mathematics.
The last time I checked, the vast majority of the world's population,
maybe even the population of the universe, are not maths docs. Therefore, I think it is not unreasonable
that I don't always use the correct terminology agreed to within the
mathematical cabal.
It is possible that you are intentionally sending a message
along the line of "get a maths doctorate or shut the fuck up" but I
don't think you are. Unintentionally,
however, this is precisely the message you seem to be sending with some of your
comments (some have been far worse, perhaps with intent). I hope you take this with in the spirit with
which it is intended - I am curious, other people are curious, and we curious
people don't need to be frapped down by supercilious experts for the most minor
of infractions. If at all possible, it’s
better to get to the meat of our misunderstandings. And I was not being sarcastic when I wrote
that your comments are much appreciated.
You make a comment about lines and chords, as if I don't
know the difference. Please see A Farewell to the Bertrand Paradox
in which I think I make it pretty clear that I understand the difference (I
repeatedly wrote "You now have two points on the circle, between which is
a chord") - although in retrospect it seems like I don't understand the
word "farewell". My
implication, although not clearly expressed, is that a line passing through a
circle defines a chord and a chord (of length greater than zero) uniquely
defines a line. If this is fundamentally
wrong (as opposed to just oddly phrased), please advise.
You then get into parameterisation (or perhaps
parametrization, if there is a meaningful difference beyond our spelling
preferences). To the extent that I
understand you, I think I agree. The
method you use to define your chords, to select your chords, makes a
difference. What we differ on is whether
there is a "right" and "wrong" way to select chords to
satisfy the Bertrand Paradox, as stated.
My position, which I accept may be fundamentally wrong, is that if your
method doesn’t arrive at a uniform distribution of chords (by some reasonable
measure), then your method isn't "at random".
The question that arises immediately is by what
"reasonable measure" can I claim that the 1/2 method produces a
uniform distribution of chords and the other two methods don't. Below I may go some way to explain what I had
in my head, but first … invariance.
I suspect that we would agree that the distribution of
chords in a circle is invariant in terms of rotation, translation and
scale. To clarify, imagine a circle with
a locus at (0,0), defined by radius of 1 and an orientation such that θ=0
aligns with the positive y-axis (let's set Point A to (0,1)=(1,0), if you get
my little joke [and yes, I know it’s more conventional to take θ from the
x-axis, but it's just a convention and my convention has the vertex of the
triangle, and thus Point A, at the top]).
The distribution of chords in that circle (given the same
parameterisation) is not affected if: we move the circle (to a locus at
(random(x),random(y)); rotate the circle (to an orientation at random(θ) from
the y-axis); or increase the size of the circle (to a radius of
random(R)). I think we would
agree on that, but I may be missing something.
So long as I am not missing something crucial, therefore, a
circle of the type I suggested (locus at (0,0) and (r=1,θ=0) coinciding with
(x=0,y=1)) can stand in for circles of any size and location and Point A at
(0,1)=(1,0) can represent all possible positions on the circumference, because
using any other position on the circumference is equivalent to this circle
being rotated. If I am wrong about this,
then the following probably falls apart.
Borrowing from myself (at reddit), my "reasonable
measure" of a uniform distribution is such that if you drew a
representative sample of the chords with an arbitrarily small width (I am
aware that chords don't have widths), then the resultant density would be
smooth throughout the circle.
Wikipedia has something close to
what I am talking about, just before they get into the "classical
solution". However, I limit my visualisation of it to one orientation of
the circle, so
- 1) put Point 1 at the apex of a
[notional] equilateral triangle, draw a set of chords with an arbitrarily small
angular separation (say 360 of them 1 degree apart)
- 2) select a radius and extend it
out to a diameter, draw the set of chords perpendicular to the diameter with an
arbitrarily small separation, say 360 of them R/180 apart
- 3) take an arbitrarily large
number of points equally separated within the circle, say 360 of them, draw the
set of chords for which the points are their midpoint
I am pretty sure that only 2) will smoothly fill the circle
(for example if all chords are drawn with a width of R/360). I am also pretty
sure that there will be some arcane argument as to why this either doesn't
matter, is not (sufficiently) stringent or is completely wrong-headed - but as
I said, it is what I had in mind.
I don't want to spend a ridiculous amount of time on this,
so for the purposes of showing this, I will use 16 rather than 360 chords and I
am not going to fuss about making the images pretty:
1)
2)
3)
To my mind, the distribution created by method 2 is uniform
and smooth in a way that the others simply aren't. Someone did complain that my
argument here is apparently based on aesthetics - the method 2 result looks
nicer. That's not really my point, my
point is that the density of chords is smooth, no matter where you look in the
circle. In the other two, either you
have a clumping of chords (in method 1 there are more chords surrounding a
point directly below the top of the circle [it's actually worse than I've represented]) or (in method 3) chords crossing. I do understand that as you keep rotating the
circle to obtain a new set of chords in method 1, as your number of unique sets
approaches infinity, the gaps will disappear, but the clumping will remain at
the rim of the circle (some seem to want to call it a disc, hopefully I don't
confuse anyone by calling it a circle).
Similarly, I understand that the gaps will disappear if we consider more
and more chord midpoints in method 3, but this introduces a similar clumping
effect at the rim of the circle - to a greater extent, by which I mean the density
of chords at the rim will be even greater than produced by method 1.
You gave the following challenge: "Pick, at random, a
right triangle inscribed inside a circle of diameter 1. What is the probability
that one angle of the triangle is less than pi/6?"
I understand that this is the same question, because 2.cos(π/6) is √3, so to be
consistent, my answer would have to be 1/2.
This is, however, on the understanding that I am picking from an
existent set of all right triangles, not a set of triangles created by
bisecting the circle and then picking a point at random on the circumference
then drawing chords between that point and the two ends of the diameter
previously established.
I think you've helped me here though. If we think about the absolute maximum
proportion of unique right triangles with one angle less than π/6 that we could draw in the circle,
using any method, the answer comes to 1/2.
We could, for example, draw the set of right triangles using the 1/2
method for chord selection, then draw a diameter from one end of the chord,
then complete the triangle. This, I
think you would agree, results in 1/2.
Knowing this, it seems odd to me that you can walk back from this figure
to 1/3.
I note that, when unconstrained by a circle, can you draw your triangle by first drawing the
hypotenuse (say at length 2R), then drawing a line from one end at an angle to
the hypotenuse chosen at random such that 0 < θ < π/2 and then completing the triangle
by drawing the final side as required.
The criterion will be satisfied in the ranges 0 < θ < π/6 and π/3 < θ < π/2, so 2/3 of the time.
Also, as you do
this, the right angle vertex describes a circle - meaning that there is a
problem with the 1/3 answer. I've drawn
that below as well, hopefully it's sufficiently clear (note that I have tried
to show pi/3 around the created circle, I am not trying to imply that pi/6 (the
angle on the left in the triangle at this point) is pi/3).
Thinking about this did lead me to stumble over another way
to create a set of chords: draw a diameter (just line segment of length 2R will
do), then repeatedly draw circles around one end of the diameter (or line
segment) at arbitrarily small increments until you reach of circle of radius
2R. With each circle, draw the tangent
that intersects with the other end of the diameter (or line
segment). The points at which these
tangents touch each of this series of circles describe a semicircle.
The result obtained by this method is 1/2, despite not
appearing to be a reformulation of the classic 1/2 method. Perhaps it is but I’ve simply not worked out
how yet, but in a rough modelling (2000 data points) the distribution does not
appear to be the same when viewed in histogram form.
---
On parameterisation, I did some thinking about this along
the lines of saying that if you have a 1/3 answer, then it seems (to me) that
your selection method must simply have missed some of the chords. In my way of thinking (standard caveat about
the possibility of being wrong), if we are asked to select a chord "at
random" then it follows that we would be selecting from a set of ALL
chords, rather than from a specific subset, unless advised otherwise. Thought from this perspective, our first
concern is making sure that we have ALL chords available to select from. The question then is how to express this
properly. I'm probably going to mess this
up in some obscure way, but if you can at least try to understand what I am
saying (and criticise the best formulation of my argument, rather than the
worst), it would be appreciated.
I suggest that an expression for ALL chords in a circle
defined by x2+y2=1
(in units of R where R is the radius of the circle) goes something like
this:
The infinite set S of all unique sets Si of points that
fulfil the following criteria:
S:
-1 > c > 1 (defining the y axis
intercept of the chord)
0 > θ > 2π (defining the gradient of the chord)
Si:
-√((-cosθ)2+(c-sinθ)2) >
r > √((cosθ)2+(c+sinθ)2)
(x,y) = (r.cosθ,r.sinθ+c)
Note: the combined effect of these two conditions is (or is intended) to
include all and only points between intercepts of the line defined by (x,y) = (r.cosθ,r.sinθ+c)
and the circle defined by x2 + y2 = 1, thus defining a
chord. In other words a unique set Si
is intended to define a unique chord.
When corrected in
terms of mathematical terminology, etc, is this a parameterisation and, if so,
does it establish or define a structure (per u/Vietoris) for which there is a
defined probability measure (per u/Vietoris) or probability distribution
(per u/overconvergent)? And, if so, what Bertrand Paradox related answer
would be expected from this parameterisation and associated probability measure/distribution?
---
With luck, I have already addressed your other points either
here, or in comments at reddit. If I
have missed something key, please let me know.