Tuesday, 1 December 2015

Rectangular Circles - Yet Another Response to Mathematician

This is yet another response to Mathematician.  Here's what he wrote, interspersed with my responses (minor editing for format, and I've excised the final part that I've already addressed back in the comments section which can be read in its entirety here):

> At N=100, the 1/2 method does not have gaps or clumping.

The whole point of my "subintervals in intervals" example was to show you that the problem of "gaps or clumping" is only a problem if you think it is. In the interval example, if you require that there are no "gaps or clumping" when N is finite, then the only possible answer is 0.

I think we agree that this is not reasonable at all. And that the most obvious way to select a subinterval will give gaps and clumping.

So how do you choose A PRIORI, in which contexts "gaps or clumping" are problematic, and in which contexts they are not?

It seems to me that you are blending the discussion about the disc and the discussion about the interval.

The "1/2 method" that I am talking about refers to the disc (and only the disc).  I am pretty certain that you are clear on this, but I want to be as totally certain as I can be.

I'm not as convinced as you are that the only possible proportion of subintervals greater than L/2 is zero when using a method that eliminates gaps and clumping.  What I am pretty sure of, however, is that if we looked at the distribution of subintervals and found that they were clustered around the ends of the interval and their lengths clustered around what could be described as "very very short" (much less than L/2), then we'd have reason to doubt how fair this distribution was.

Perhaps there's a good mathematical reason to not care about such clumping (and the implied "gap" between the ends of the interval in which the density of subintervals would dip), but don't you agree that using such a distribution would not meet the general understanding of "at random" - perhaps not even your own understanding of what "at random" would mean in this context?

> if distribution continued towards an as yet unknown value, or whether it still approaches zero

As far as I understand what you are trying to do, I'm pretty sure that it will approach 0.

I'm not sure that what you are doing proves anything at all, but that's another problem.

I wasn't trying to prove anything.  I was just pondering the puzzle that you presented.

> Perhaps it was not clear to you, but the "corrected" 1/3 method, ends up being the 1/2 method

No it was clear.

> And, no, I don't agree that it's the same thing

Let me repeat something for sake of clarity:
For any (c,θ) in [-1,1]x[0,pi], there exists a unique chord that is at distance c from the center, in direction θ.

The "1/2 method of selecting a chord" amounts to picking a pair (c,θ) uniformly in the rectangle [-1,1]x[0,pi]. Do we agree on that?

When you draw your picture to show "granularity", what you are doing is that you choose a θ, once and for all, and then you take 100 values of c that are evenly spaced in [-1,1].

What I'm suggesting is that you do the opposite: Choose a c, once and for all, and then take 100 values of θ that are evenly spaced in [0,pi].

In the end, this is exactly the same method, but you're not drawing the same picture. (In mathematical terms, you are just doing a projection on one of the coordinates)
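The (c,θ) description above can be sketched numerically. This is only a minimal illustration, assuming a unit disc and a midpoint grid of sample values; the helper names `chord_length`, `grid` and `long_fraction` are invented here and are not taken from the discussion. A chord at distance c from the centre has length 2·sqrt(1 - c²), so θ never affects the length. The sketch compares fixing θ and varying c, fixing c and varying θ, and sweeping the whole rectangle.

```python
import math

SQRT3 = math.sqrt(3.0)  # side of the inscribed equilateral triangle (unit disc)


def chord_length(c, theta):
    """Length of the chord whose midpoint lies at signed distance c from the
    centre, in direction theta (unit disc). theta does not affect the length."""
    return 2.0 * math.sqrt(1.0 - c * c)


def grid(lo, hi, n):
    """n evenly spaced cell midpoints in [lo, hi] (an assumed sampling grid)."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)]


def long_fraction(cs, thetas):
    """Fraction of the chords (c, theta) in cs x thetas longer than sqrt(3)."""
    pairs = [(c, t) for c in cs for t in thetas]
    return sum(chord_length(c, t) > SQRT3 for c, t in pairs) / len(pairs)


cs = grid(-1.0, 1.0, 100)
thetas = grid(0.0, math.pi, 100)

fixed_theta = long_fraction(cs, [0.0])        # one direction, 100 offsets
fixed_c_near = long_fraction([0.3], thetas)   # one offset below 1/2, 100 directions
fixed_c_far = long_fraction([0.8], thetas)    # one offset above 1/2, 100 directions
whole_rectangle = long_fraction(cs, thetas)   # all 100 x 100 pairs
```

Under these assumptions `fixed_theta` and `whole_rectangle` both come out at 0.5, while a single fixed c gives either 1.0 or 0.0 depending on which side of 1/2 it falls; averaging those values over c recovers the same 0.5.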

This is possibly where the meat of the issue is.

I agree that for any (c,θ) in [-1,1]x[0,π], there exists a unique chord that is at distance c from the centre, in direction θ.  To be absolutely clear, I am interpreting this to mean that you are talking about a chord that is offset from the locus by c at its midpoint and that, therefore, the direction mentioned is the direction from the locus to that midpoint.

This is not what I thought you meant before.  I thought you meant to pick a point at c from the locus (direction irrelevant), and then consider the chords that pass through that point with gradients defined by θ.  You'd agree that such a scheme, picking a single value of c, won't give you ALL the chords (certainly not if you pick any value of c less than R, being the radius of the disc), right?

However, you seem to misunderstand my intention.  I made clear (somewhere, I can dig it up if you insist) that I was notionally selecting a single value of θ (direction from locus to midpoint) only because that single value can represent all possible values of θ.  The same applies when selecting a single Point 1 on the circumference in the 1/3 method.

I fully expect that, to get ALL the chords, you’d have to consider all possible values of θ - in no way was I suggesting that I should "choose a value of θ, once and for all".

So, I understand that if someone foolishly suggested that we select a value of c, "once and for all", and then look at the chords at c given all possible values of θ (as a direction from the locus to the midpoint of a chord), then you'll never get ALL chords.  You'll get an infinite number of chords with precisely the same length but different gradients.

Perhaps I am still misunderstanding your point.  I think I must be, because I do not believe that you are this foolish (insert smile here to minimise any unintended offense).

I want to step back a bit to your question:

The "1/2 method of selecting a chord" amounts to picking a pair (c,θ) uniformly in the rectangle [-1,1]x[0,pi]. Do we agree on that?

I agree, with a minor reservation.  I'm a bit uncomfortable calling [-1,1]x[0,π] a "rectangle": that space represents a circle (hence my little joke in the title of this article).  However, I think I get what you mean - it's a useful way to visualise things for the purpose of considering a uniform distribution of values of c and θ.

What occurs to me is that this can be used in association with the 1/4 method.

My "fix" involved selecting a midpoint from this space (precisely like you seem to be suggesting), while the standard 1/4 method involves selecting from a reduced space.  I think it might be, notionally, a bit like this (think density rather than direct correspondence with values of θ):

I'm not suggesting that these are accurate representations of the shapes corresponding to the 1/3 and 1/4 methods, I just used a triangle for 1/3 and cut out circular chunks for 1/4 because it was convenient.  However, the concept does point towards the notion that the 1/2 and 1/4 methods are missing chords - and where they are missing from.

> because I am not focussed on how we select chords, I am focussed on ensuring that we have ALL chords (and where N is less than infinity, a representative sample of ALL chords).

Can you provide a single example of a chord that you can get with the 1/2 method, but that you cannot get with the 1/3 method?

See above.  Of course I can't point to a single example, which you would clearly realise, but I can (at least conceptually) show that there are fewer chords near the locus with the standard 1/3 and 1/4 methods than there are with the 1/2 method.

> Between -R and -R/2 and R/2 and R, there will be a decrease in the proportion of chords greater than sqrt(3)

You are apparently thinking that "c" should be taken in a predefined direction, and then choose another direction θ. It's not what I said. Just fix some c, once and for all, and then choose a bunch of θ, and then draw the chords corresponding to (c,θ).

So, when c is between R/2 and R (and between -R and -R/2), the proportion of chords greater than sqrt(3) is 0. So the final answer is 1/2. (Which is absolutely not surprising because it's exactly the same method)

See above.  I think I've already addressed your "once and for all" objection, perhaps once and for all (but I am not holding my breath).

I don't know how you end up with 1/2 with what you've said here, but I do agree that all my methods - the standard 1/2 method, the "corrected" 1/3 method and the "corrected" 1/4 method - are effectively (and exactly) the same method.

---

I note that there might have been confusion about my use of the word "fixed" when I mean "corrected".  When I used "fixed" previously, I did not mean "never to be changed" as in "fixed in stone".  I meant "fixed" as in "my keyboard is broken, I am going to have to get it either fixed or replaced".

1. > It seems to me that you are blending the discussion about the disc and the discussion about the interval.

Yes, that's exactly what I'm doing. They are problems of the same type. If there is "general understanding of what at random means" it should apply to both problems.

> but don't you agree that using such a distribution would not meet the general understanding of "at random"

That's Bertrand's paradox in a nutshell : there is no general understanding of "at random".

> you are talking about a chord that is offset from the locus by c at its midpoint and that, therefore, the direction mentioned is the direction from the locus to that midpoint.

Yes.

> This is not what I thought you meant before.

I thought so too, that's why I explained it again more precisely.

> You'll get an infinite number of chords with precisely the same length but different gradients.

Yes. That's my point. Why would it be foolish ? What makes this method for drawing a representative sample of chords more foolish than your method (which gets an infinite number of chords with precisely the same gradient but different lengths) ?
I honestly cannot see the difference A PRIORI.

> I'm a bit uncomfortable calling [-1,1]x[0,π] a "rectangle"

That's unfortunate. It's standard notation.

> show that there are fewer chords near the locus with the standard 1/3 and 1/4 methods than there are with the 1/2 method.

Yes. And I perfectly agree with this argument. My point is that requiring the probability distribution to give as many chords near the locus as near the boundary is a relatively "arbitrary" requirement.
In the interval example, with the method of selection by uniformly choosing the two endpoints (giving a result of 1/4), there are more subintervals near the middle of the interval.
The only possible "fix" is to choose a completely absurd way of selecting a subinterval : Pick a point P uniformly in the interval, then the selected subinterval is [P,P] (yes, you read that right). And yes, it's the ONLY possible way to avoid getting more subintervals near the middle. I think that we both agree that this is not what "at random" should mean in this context.
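The interval example can be checked with a short simulation. This is a hedged sketch, assuming a unit interval, a fixed random seed, and an invented helper `interval_experiment`; it estimates both the proportion of subintervals longer than L/2 and how often a subinterval's midpoint lands in the middle half of the interval.

```python
import random


def interval_experiment(n=200_000, length=1.0, seed=0):
    """Pick two endpoints uniformly in [0, length], n times.

    Returns (share of subintervals longer than length/2,
             share of subintervals whose midpoint lies in the middle half).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    long_count = 0
    central_count = 0
    for _ in range(n):
        a = rng.uniform(0.0, length)
        b = rng.uniform(0.0, length)
        if abs(a - b) > length / 2:
            long_count += 1
        midpoint = (a + b) / 2
        if length / 4 < midpoint < 3 * length / 4:
            central_count += 1
    return long_count / n, central_count / n


p_long, p_central = interval_experiment()
```

With endpoints chosen this way, `p_long` should sit close to 1/4 and `p_central` close to 3/4, so the midpoints are indeed concentrated towards the middle even though each endpoint is uniform.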

So you think that the requirement is natural in the chord problem, only because it gives a sensical result at the end.

As I said, take a slightly different problem, "pick a straight line segment at random inside the disc (not necessarily a chord)", and tell me A PRIORI if your requirement is sensical in this context.

To sum up :
I agree with almost all of your computations.
I'm not very fond of the way you present the results of your computations (saying that you have the "correct" answer to Bertrand paradox, that you "fix" the "wrong" answers, and so on ...).
And I certainly disagree with your apparent opinion that Bertrand's question can be answered unambiguously.

2. > Yes. And I perfectly agree with this argument. My point is that requiring the probability distribution to give as much chords near the locus than near the boundary is a relatively "arbitrary" requirement.

That wasn't entirely my point. What I mean is that the very fact that there are fewer chords near the locus indicates that chords are missing given my goal to identify ALL chords. I can't point to a precise example of a missing chord, but the results do indicate that a set of chords are missing. It might be more accurate (or not) to say that chords with a certain characteristic (close to locus) are less frequently instantiated with the 1/3 and 1/4 methods than with the 1/2 method. Perhaps you think that chords are being counted multiple times in the 1/2 method, in which case I could reverse your challenge and ask you to identify them.

> And yes, it's the ONLY possible way to avoid getting more subintervals near the middle.

I don't have a problem with more subintervals near the middle of the interval. It's what I would expect. I don't expect chords to cluster around the rim of the disc. I'm fully aware that what I expect doesn't dictate what is and what is not mathematically "real". (I don't know how many times I will have to make painfully obvious statements like this.)

> And I certainly disagree with your apparent opinion that Bertrand's question can be answered unambiguously.

In strict mathematical terms, that would appear not to be the case (because in strict mathematical terms the missing chords don't matter, or they are not missing, or their "missingness" is a meaningless concept) - and what you assert as my opinion is not my opinion, given that caveat.

A large part of my interest in the Bertrand Paradox is how it applies to the real world, and from what you seem to be saying, it doesn't. There are no infinite sets that we deal with in the real world - if we are splitting up an interval in the real world, we'd eventually reach a point at which we have the Planck length, below which further division appears to be meaningless. Therefore, if I were running a randomised drug efficacy trial, I would not have to worry overly about the mathematical basis of the randomisation due to any peculiarity of mathematics revealed by the Bertrand Paradox. I'd have to worry about biases in my patient selection process and the purity of control groups and so on, but there should be no worries that I should come to two or more perfectly valid yet different results regarding the effectiveness of the drug being trialled. "Megacorp Pharma announced today that the new cancer drug KillCan is 50% effective, 33% effective and 25% effective, depending on how you read the results and mathematicians confirm that all three results, and in fact any other results you want, are equally valid and correct. Soon to be available from your GP." That just won't happen. Or will it?

3. > Perhaps you think that chords are being counted multiple times in the 1/2 method, in which case I could reverse your challenge and ask you to identify them.

No individual chord is counted "multiple times" or is "missed" in any of the presented methods. The probability that a "random" chord is exactly the same as a given predefined chord is 0. So all three methods could be seen as "uniform" because the probability of each possible outcome is the same (it is 0).

> I don't have a problem with more subintervals near the middle of the interval. It's what I would expect. I don't expect chords to cluster around the rim of the disc.

"madness is doing the same thing and expecting different results" (sorry, that was too tempting)

I know that it's not exactly the same problem, and I don't think that you are mad, but I would really like to understand why you don't expect the same kind of property in both problems ?

And what about the third problem "pick a random segment inside the disc". What do you expect ?

> "Megacorp Pharma announced today that the new cancer drug KillCan is 50% effective, 33% effective and 25% effective, depending on how you read the results and mathematicians confirm that all three results, and in fact any other results you want, are equally valid and correct"

Wait what ? The different answers to Bertrand's question do not come from the fact that I can "read the results" differently. They come from the fact that I can read the QUESTION differently.

This makes me think of the famous completely unrelated example :

Imagine that a medical test for a rare disease comes back positive. You know that a test will give the correct answer with a 99% probability. What is the probability that you actually have the disease ?

If you assume that this is a well-defined question, then the only possible answer is 99%. Because if this is a well-defined question, then by your beloved principle of indifference, you should assume that the prior probability that you are sick is 1/2, and an easy computation gives the result.

But, in a real situation this is certainly not a well-defined question ! I'm not given the probability P(sick). So I can not answer the question. The actual "real-life" answer depends on the probability distribution of the disease in the population (which has no reason to be uniform).
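The dependence on P(sick) can be made concrete with Bayes' rule. This is a minimal sketch, assuming that "correct with a 99% probability" means the test is right 99% of the time for both sick and healthy people; the function name `posterior_sick` and the example prior of 1 in 1000 are illustrative choices, not data from the discussion.

```python
from fractions import Fraction


def posterior_sick(prior, accuracy=Fraction(99, 100)):
    """P(sick | positive test) by Bayes' rule.

    prior    -- assumed P(sick) before testing
    accuracy -- assumed P(test is correct), same for sick and healthy people
    """
    true_positive = accuracy * prior
    false_positive = (1 - accuracy) * (1 - prior)
    return true_positive / (true_positive + false_positive)


indifferent = posterior_sick(Fraction(1, 2))      # principle of indifference
rare_disease = posterior_sick(Fraction(1, 1000))  # hypothetical rare disease
```

With a prior of 1/2 the posterior is exactly 99/100, but with the hypothetical prior of 1/1000 it drops to 11/122, roughly 9%. The same test result gives very different answers depending on the prior that the question leaves unstated.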

In Bertrand paradox it's even worse, because there is no obvious notion of uniform on the set of chords.

1. > No individual chord is counted "multiple times" or "missed" ... probability of each possible outcome is ... 0.

This is where I find my concept of "granularity" useful. If we talk about an infinite set of chords, you are correct (or more accurately, "I agree with you", I don't know that you are correct. I know only that I agree with you and I know that my agreeing with you doesn't make either of us correct - only being correct does that).

But with granularity, choosing an arbitrarily large number N (and arbitrarily small divisions R/N, where R is the radius of the disc in question) allows us to see what the distribution of chords does as N approaches infinity. I think (in my little world, which is not to say that I am suggesting that you ought to think the same as me, heaven forbid) that this is a better way to approach things rather than thinking only of the case where N=∞ and, as you say, each individual chord has a likelihood of 0. But is it precisely 0? This is perhaps more of a philosophical question rather than a mathematical question (since we can just set the definition so that 1/∞=0), but it seems to me that really, if we are talking about N approaching infinity (since in a sense it cannot be reached) then the likelihood of each individual chord is infinitesimal rather than explicitly zero. I do understand that it's more useful/convenient to talk about "infinity" and "zero", but it's a bit of a spherical cow situation.
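The granularity idea can be sketched with exact arithmetic. This assumes one particular grid, offsets c = k·R/N for k = -N..N along a single direction, which may not match the grid used elsewhere in this series; the function names are invented for illustration. Each chord then carries weight 1/(2N+1), which shrinks towards 0 but never reaches it for finite N.

```python
from fractions import Fraction


def long_chord_share(n):
    """Share of grid chords strictly longer than R*sqrt(3).

    Offsets are c = k*R/n for k = -n..n (assumed grid). A chord is longer
    than R*sqrt(3) exactly when |c| < R/2, i.e. when 2*|k| < n.
    """
    ks = range(-n, n + 1)
    long_count = sum(1 for k in ks if 2 * abs(k) < n)
    return Fraction(long_count, 2 * n + 1)


def single_chord_weight(n):
    """Weight of one individual chord on the same grid."""
    return Fraction(1, 2 * n + 1)


shares = {n: long_chord_share(n) for n in (3, 100, 10_000)}
weights = {n: single_chord_weight(n) for n in (3, 100, 10_000)}
```

On this grid the share is 3/7 for N=3, 99/201 for N=100 and 9999/20001 for N=10000, approaching 1/2 as N grows, while the weight of any single chord falls from 1/7 to 1/20001.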

> "madness is doing the same thing and expecting different results"

I'd suggest that we both suffer from this, in our own ways. We just present recombinations from the set of same things to try to get a different result (a feeling that there is some form of comprehension from the other regarding the actual point being made).

> why (don't you) expect the same kind of property in both problems?

I actually do expect the same kind of property in both problems - the absence of clustering at the edges. For example, I'm aware that by walking away from cartesian co-ordinates in the 1/4 method and instead using polar co-ordinates, I get a nice smooth distribution in the [-1,1]x[0,pi] space (rectangle) - but at the cost of having, in terms of cartesian co-ordinates, a concentration of chords around the locus of the disc. But I expect that (an argument that I know to be of undisputed value to myself and only myself).

> And what about the third problem "pick a random segment inside the disc". What do you expect?

I'd expect to see more of them around the locus than around the rim of the disc. As to exact figures, hm, I'd be tempted to look at segments on a representative sample of chords ...

I don't know about you, but when I see results reported, I often check to see what the exact question was. This is often where the damned lies of statistics sneak in. But I agree, if the question is fundamentally unclear, and there is an attempt to provide an answer without trying to clarify, you're going to have problems.

> ... test ... correct ... with a 99% probability ... probability that you ... have the disease?

I actually saw this sort of thing not terribly long ago in a piece about how medical doctors don't understand statistics (sufficiently well). The issue with your question is that available data simply isn't provided (what you term "P(sick)"), so the answer to the question as asked will have to include reference to the unknown data (for example something like 0.99xP(sick)+0.01xP(not sick)).

This doesn't apply to the chord question (or at least not in the same way).

There is perhaps one last thing that I need to say, because I'm still not sure that you understand my problem with your arguments.

> As to exact figures, hm, I'd be tempted to look at segments on a representative sample of chords ...

"Representative" is a relative notion. It depends tremendously on the particular probability distribution of your chords/segments/whatever that is chosen.

I'll try to make a parallel between a real life example and the chord example :

Imagine that you take a sample of 1000 individuals : 200 whites, 200 blacks, 200 asians, 200 hispanics and 200 pacific islanders randomly chosen in each group. In everyday language, some people might say that this set is representative of the US population, because every group is represented.
Now imagine you are doing a statistical study on the percentage of the US population which is taller than 180cm. You cannot just take the percentage of your sample and give it as a result, right ? Because there is a bias.
Your sample is absolutely not representative of the distribution of each group inside the US. There are far more white people than pacific islanders. So to get a more accurate answer you have to "fix" the computation to give more weight to the white group and less to the pacific islander group.

I know you probably already understand that, but now rephrase the whole paragraph a little bit :

Imagine that you take a set of 1000 chords : you fix a direction θ, and you take 1000 values of c, evenly spaced between -1 and 1. In everyday language, one might say that this sample is representative of the chords, right ?
So imagine you are doing a statistical study on the percentage of chords which is longer than sqrt(3) (well, you don't have to imagine, that's exactly Bertrand question). You cannot just take the percentage of your sample and give it as a result, right ? because there COULD BE a bias.
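There is no reason to believe that your sample is representative of the actual distribution of chords inside the circle. There could be far more chords with c>0.5 than chords with c in [0,0.5]. So to get an accurate answer you have to "fix" the computation and give more weight to certain group of chords and less weight to some other.

But that's where the problem arises ! You actually said it yourself :

> The issue with your question is that available data simply isn't provided (what you term "P(sick)"), so the answer to the question as asked will have to include reference to the unknown data.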

It's exactly the same here !

The issue with Bertrand question is that available data simply isn't provided (what I call the probability distribution on the set of chords), so the answer to the question as asked will have to include reference to the unknown data.

And that's why I'm saying that Bertrand question is not well-defined. That's why some other people say that all answers are acceptable. Because some data is missing from the question. So if you choose a particular data, you get a particular answer.
Hence that's why I think it's relatively useless to lose time trying to answer it. Our time will be better spent by teaching people the basics of probability theory, so that they stop asking ill-defined questions ...

3. Again, part of my post disappeared. You have to include the following :

[... c>0.5 than chords with] c in [0,0.5]. So to get an accurate answer you have to "fix" the computation and give more weight to certain group of chords and less weight to some other.

But that's where the problem arises ! You actually said it yourself :
[The issue with your question ...]

Sorry, it's just that when I use inequalities, the website seems to believe that it's an HTML tag.

4. I get the point that you are making here (or at least I think I do). It's why I've talked about the "set of ALL chords".

When you say (paraphrased) "there could be far more chords with c in [R/2,R] than in [0,R/2]" you are, I presume, assuming a "proper" mathematical circle/disc - we are talking about Euclidean space and not talking about curved space, or anything tricky like that. If so, I'd have to ask, on what basis, other than your selection process for problems like this, can you suggest that we might have more possible chords (and thus more possible lines defined by extending those chords out to infinity) passing through the interval [R/2,R] than any other interval of the same length? The circle/disc under consideration is essentially undefined as far as location, size and rotation go, so we should (reasonably) be able to change the locus and not have our answer change on us - but what you are suggesting is that if we shift the locus up by R, and rotate the circle/disc by π, then we'll change the number of lines passing through the intervals [0,R/2] and [R/2,R]. Ditto if we expand our circle/disc by a factor of 2 while retaining the locus at the notional (0,0).

We could even have two overlapping circles/discs, both of radius R, one with a locus at (0,0), the other with a locus at (0,R). This would mean that you'd simultaneously have more lines passing through [R/2,R] (as defined by chords in the first circle/disc) and more lines passing through [0,R/2] (as defined by chords in the second).

This seems odd to me. Does it not seem odd to you?

5. > "set of ALL chords"

Wow, maybe I understand what you mean, but it would be odd.
We have a given circle, right ?
When you say the "set of ALL chords", are you including the chords that are NOT inside the given circle (but inside another circle somewhere else ...) ?
Was that your point all along for repeating "ALL chords" all the time ? It would make sense with the rest of the argument :

> on what basis can you suggest that we might have more possible chords (and thus more possible lines defined by extending those chords out to infinity) passing through the interval [R/2,R] than any other interval of the same length?

Since the beginning, you are thinking of chords as the intersection of a straight line with the disc. That's a great characterization and a good way to get chords. (but not the only one, as we both know)
So if I'm not mistaken, from your point of view, there is an existing set of all straight lines on the entire plane (like an infinite net), and you are just taking the intersection of this existing set of straight lines with a given disc. And you say that if you move the disc around, it will not cross the same straight lines, but the answer to Bertrand's question should remain the same. Am I correct to assume that this is more or less your reasoning ?
With this interpretation you are absolutely correct and agree with Jaynes' argument. This is a mathematically correct argument.
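The "net of lines" picture can be sketched as a simulation. This is only an illustration under stated assumptions: lines are drawn with a uniform direction φ in [0,π) and a uniform signed distance p from the origin within a window of half-width 5, and the function name `long_chord_fraction`, the window size and the seed are all invented here. Among the lines that hit a given disc, the code counts how many cut a chord longer than R·sqrt(3).

```python
import math
import random

SQRT3 = math.sqrt(3.0)


def long_chord_fraction(center, radius=1.0, half_width=5.0,
                        n_lines=200_000, seed=0):
    """Share of chords longer than radius*sqrt(3) among the random lines
    that cross the disc. The disc must lie inside the sampling window."""
    rng = random.Random(seed)
    cx, cy = center
    hits = 0
    n_long = 0
    for _ in range(n_lines):
        phi = rng.uniform(0.0, math.pi)            # direction of the line's normal
        p = rng.uniform(-half_width, half_width)   # signed distance from the origin
        # distance from the disc centre to the line x*cos(phi) + y*sin(phi) = p
        d = abs(cx * math.cos(phi) + cy * math.sin(phi) - p)
        if d < radius:
            hits += 1
            if 2.0 * math.sqrt(radius * radius - d * d) > radius * SQRT3:
                n_long += 1
    return n_long / hits


at_origin = long_chord_fraction((0.0, 0.0))
shifted = long_chord_fraction((0.0, 1.0))
doubled = long_chord_fraction((0.0, 0.0), radius=2.0)
```

Under these assumptions all three estimates come out close to 1/2: moving the disc or doubling its radius changes which lines it crosses but not the proportion of long chords.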

But that's not the only natural point of view on this problem.

See, I'm taking another characterization for chords. For me a chord is a segment between two points on the circle. So there is nothing "outside" my circle. I have no reason to extend a chord out to infinity. The chords are not intersections of lines with the disc, they are segments inside the disc ! There is no reason to consider objects (lines) that are not chords on the given circle, don't you think ?
So, If I change the locus of the circle in the plane, the chords are moving with it. If I double the size of my circle, then the chords inside it will double their size. If I rotate the circle, the chords will rotate with it. So the final answer to Bertrand question will not change at all.
And with that point of view, it's perfectly natural to have more chords close to the rim than close to the locus. You only think it's odd because you are thinking of an existing "net" of straight lines on the plane, and you place your circle on that existing net of lines. But from my point of view, there is no "net" of existing straight lines.

> We could even have two overlapping circles/discs, both of radius R, one with a locus at (0,0), the other with a locus at (0,R). This would mean that you'd simultaneously have more lines passing through [R/2,R] (as defined by chords in the first circle/disc) and more lines passing through [0,R/2] (as defined by chords in the second).

If I understand correctly, with your point of view, if you have two circles in the position you gave, there are as many chords in the interval [0,R] as in the interval [R,2R], right ? With your point of view, the fact that we have one, two or seven circles does not change the "density" of chords at all, is that correct ? This seems odd to me.
With my point of view, each circle has its own set of chords, so the "density" of chords will be higher in the intersection of both discs. So there will be "twice as many" chords in the interval [0,R] as in the interval [R, 2R], because there are two distinct sets of chords.

6. It seems to me that you are redefining the problem with the problem that is the Bertrand Paradox, away from "the term 'at random' is ill-defined" or "the term 'at random' is well understood, but the probability density is ill-defined" towards "we don't really know what a chord is" - you seem to think they are something you "get" (using some methodology) whereas I tend to think of them as being there inherently in the circle or on the disc and all we do is "identify" them (using some methodology) meaning that some methodologies are going to be better than others at identifying ALL of them. Even with the use of "get", I still think in terms of your methodology (the 1/3 method) not getting them ALL.

As to your apparent major misunderstanding of what I was saying (following a pretty good summary of what I have actually been saying ... up until "But that's not the only natural point of view on this problem") - I might leave that for the moment and put my response into an article - my vested interest in doing that relates to the opportunity to deploy an obvious pun.

4. > It seems to me that you are redefining the problem with the problem that is the Bertrand Paradox, away from "the term 'at random' is ill-defined" or "the term 'at random' is well understood, but the probability density is ill-defined" towards "we don't really know what a chord is"

I'm sorry but I have to be a little bit formal to explain exactly what is, in my humble opinion, Bertrand paradox. And perhaps you will understand that all my arguments are actually going in the same direction ...

Let C be a circle. Denote by Ω(C) the set of chords on the circle C. This is a well-defined set.
I have a random variable X : Ω(C) -> R+ , corresponding to the length of a chord.
Bertrand question is "what is the probability of the event (X >= Rsqrt(3) ) ?"

The question concerns the probability of some event. Which means that I need a probability measure on Ω(C). But nothing in the question tells me which probability measure to take on Ω(C).
So, I could just stop here and answer "I don't have enough information to answer the question".

But perhaps the probability measure is implied by the question ? Perhaps, the person asking the question thought there was an "obvious" probability measure on Ω(C) ?

As far as I can tell, there is no obvious probability measure on Ω(C). It's not a finite set, it's not a subset of a vector space, it's not a manifold, or a quotient of some Lie group, ...
My point is that Ω(C) is not the kind of set where we already have an obvious choice of probability measure (I'm using an authority argument here. As a mathematician, I know for a fact that there is nothing obvious about defining a "nice" probability measure on a given abstract set.)

So this is bad, we don't have an obvious probability measure on Ω(C). So I could just stop here and say "I don't have enough information to answer the question".

But maybe a chord in Ω(C) can be described in an obvious way by some parameters, and then I could take an obvious probability measure on the space of parameters, and this would give me an obvious probability measure on Ω(C) ?

That's great because there is an obvious bijection between Ω(C) and C^2 (chord -> endpoints )
But wait, there is also an obvious bijection between Ω(C) and [0,pi[x[-R,R] (chord -> (θ,c) )
Oh and there is also an obvious bijection between Ω(C) and D, where D is the disc (chord -> midpoint)

Note that a bijection is the same thing as a description (or parametrization), it gives me an unambiguous way to describe a given chord with some parameters. I think it's also what you mean by "identifying a chord"

And fortunately there are also obvious probability measures on [0,pi[x[-R,R], on C^2, and on D.
So you can get "obvious" probability measures on Ω(C), by simply taking the pushforward-measure of these uniform probability measures.

The problem is that different bijections/parametrizations/descriptions will give different probability measures on Ω(C).

For me, that's the point of Bertrand's paradox:
Two different ways of describing chords give two different parameter spaces and hence two different "obvious" probability measures, and in the end two different answers.
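The dependence on the parametrization can be checked numerically. The following is a rough Monte Carlo sketch, not anything taken from the discussion itself: the function names, the sample size and the seed are all assumptions made for illustration, and "uniform" is taken to mean the usual uniform measure on each parameter space (C^2, [0,pi[x[-R,R], and D).

```python
import math
import random

R = 1.0
THRESHOLD = math.sqrt(3) * R  # side length of the inscribed equilateral triangle


def chord_from_endpoints(rng):
    """Uniform measure on C^2: two independent uniform angles on the circle."""
    a = rng.uniform(0.0, 2.0 * math.pi)
    b = rng.uniform(0.0, 2.0 * math.pi)
    return 2.0 * R * abs(math.sin((a - b) / 2.0))


def chord_from_direction_offset(rng):
    """Uniform measure on [0,pi[ x [-R,R]: the direction theta does not
    affect the length, so only the signed offset c is drawn."""
    c = rng.uniform(-R, R)
    return 2.0 * math.sqrt(max(0.0, R * R - c * c))


def chord_from_midpoint(rng):
    """Uniform measure on the disc D: midpoint uniform by area."""
    r = R * math.sqrt(rng.random())  # sqrt makes the point uniform by area
    return 2.0 * math.sqrt(max(0.0, R * R - r * r))


def probability_long(sampler, n=200_000, seed=0):
    """Estimate P(X >= sqrt(3) * R) under the measure induced by `sampler`."""
    rng = random.Random(seed)
    hits = sum(sampler(rng) >= THRESHOLD for _ in range(n))
    return hits / n


if __name__ == "__main__":
    for sampler in (chord_from_endpoints,
                    chord_from_direction_offset,
                    chord_from_midpoint):
        print(sampler.__name__, probability_long(sampler))
```

Under these assumptions the three estimates should settle near 1/3, 1/2 and 1/4 respectively, even though each sampler can produce every chord.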

1. I have responded to your previous comment, as promised, here.

Sorry about the delay, I've had other things happening. As for this comment:

I don't understand the precise meaning of "I have a random variable X : Ω(C) -> R+ , corresponding to the length of a chord." I mean, I can guess, but perhaps there is some esoteric detail that I'd be missing in my guess. This might be of no import, because we might have a problem even before we get there. In the previous line, you wrote:

"Let C be a circle. Denote by Ω(C) the set of chords on the circle C. This is a well-defined set." I'm presuming that your "set of chords" here and my "set of ALL chords" are the same. If not, it's not a well-defined set, since we can disagree about what constitutes the "set of chords" (perhaps we could disagree about whether we can disagree). But moving on ...

You then leap into a "probability measure" question which leads into the "parameter space" question, which I think I have, at least in part, addressed in my response to your earlier comment (here). On the way there though, you pass through "chord -> endpoints" etc, which I don't think I disagree with. If anything, I think I disagree with "endpoints -> chord", but it depends on precisely what you mean by "->".

Put it this way: "the set of (ALL) chords implies some set of pairs of endpoints" - I am happy with that. What I am thinking is that that does not necessarily mean as a consequence: "some set of pairs of endpoints implies the set of (ALL) chords" or even "the set of (ALL) pairs of endpoints implies the set of (ALL) chords". The very fact that we reach different answers depending on our parameterisation seems to suggest that a claim that "the set of (ALL) chords implies the set of (ALL) pairs of endpoints" might be problematic, doesn't it?

5. > I think I disagree with "endpoints -> chord", but it depends on precisely what you mean by "->".

I mean precisely that we have a function from the set of pairs of endpoints to the set of chords (the obvious one, which associates with each pair of endpoints the unique chord between these points). And I claim that this function is a bijection.

Which means that you can get ALL chords from their endpoints. "ALL" as in "all of them". Not a single chord will be missing. It's a complete and perfect bijection (this is a redundantly redundant phrasing) between the set of pairs of endpoints and the set of chords.

If you don't understand that fact or disagree with it, then I'm seriously wondering what you understand of Bertrand's paradox. If you really think that the problem in Bertrand's paradox is that we are "missing" some chords in the 1/3 method, then you are so far from understanding it that it's a little bit scary.

> The very fact that we reach different answers depending on our parameterisation seems to suggest that a claim that "the set of (ALL) chords implies the set of (ALL) pairs of endpoints" might be problematic, doesn't it?

It's only problematic if you think that Bertrand's question is well-defined and/or has a unique answer, doesn't it ?

The very fact that you are asking this question seems to suggest that you didn't understand anything at all about the 15 different answers I already gave you. And also that you don't understand much about Bertrand's paradox, Jaynes' answer and probability in general. I'm sorry to be harsh but that's really what I get from your last two answers.

I was trying to give you the benefit of the doubt for too long, but it's a little bit discouraging ... I'm not sure I want to continue this discussion. You obviously don't want to learn the language of probability (or even basic mathematical language), and this makes every argument a nightmare :
When I try to be precise you say that you don't understand (e.g. "random variable"), and when I'm a little bit imprecise then you are nitpicking on my choice of words (the use of "get" in a previous answer) ...

So I quit ...

1. I'm sorry that this has been frustrating, but please understand that I find this a little frustrating at times too. For example, I said in the comment above "I don't understand the precise meaning of 'I have a random variable X : Ω(C) -> R+ , corresponding to the length of a chord.'" You came back implying that I did not understand "random variable". What? This wasn't the bit I didn't understand, so perhaps it's my fault because I was unclear. What I didn't understand, precisely, is the statement "X : Ω(C) -> R+". Let me try to flesh it out, and perhaps then you can confirm what I think or point out where I have erred.

To me "random variable X : Ω(C) -> R+" (in context) seems to mean "X is a random chord C taken from the set of chords Ω in a circle/on a disc with a radius of R". I have no idea why you have the + sign in there, nor how this expression "X : Ω(C) -> R+" says precisely and explicitly that we are talking about a circle of radius R, and that C is a chord.

It seems (to me) that this is just a shorthand way of expressing the notion that we have talked about, one that won't have an immediately obvious meaning to anyone who hasn't been involved in or following our discussion. For this reason, it smacks (to me) of obfuscation rather than clarification.

> It's only problematic if you think that Bertrand's question is well-defined and/or has a unique answer, doesn't it?

I don't think so. You yourself have implied that all the methods result in all chords, that there are no "missing chords". If we have all the chords, we can ask "what proportion of these chords are of length greater than √3R?" and we should not be able to come to different answers, because we have all the chords, right? The only way we could then arrive at a different answer when asking "what is the probability of choosing at random from the set of all chords a chord of length greater than √3R?" is if some chords are more likely to be chosen than others. As I understand it, you are not saying that.

How about another question: What is the average (mean) width of a circle? Does this question have three or more valid answers?

2. > "random variable X : Ω(C) -> R+" (in context) seems to mean "X is a random chord C taken from the set of chords Ω in a circle/on a disc with a radius of R". I have no idea why you have the + sign in there, nor how this expression "X : Ω(C) -> R+" says precisely and explicitly that we are talking about a circle of radius R, and that C is a chord.

See ... that's exactly what I mean ...
You don't even know what a random variable is. You have to realize that it's the standard formulation for modern probability theory, that you will most probably find on the first pages of any probability course (or on wikipedia : https://en.wikipedia.org/wiki/Random_variable#Definition).

> If we have all the chords, we can ask "what proportion of these chords are of length greater than √3R?" and we should not be able to come to different answers, because we have all the chords, right?

No. Not right.

This sentence settles it, you really don't understand Bertrand paradox at all. Because this is EXACTLY the point of Bertrand paradox. This is exactly why it has been called a "paradox". I cannot even understand what you think you are doing with these blog posts.

Do you have a general method to compute a proportion in an infinite set ?
I know how to compute a proportion in a finite set : Count the objects that have the desired property and divide by the total number of elements in your set.
I have absolutely no idea what it could mean in an infinite set ... but apparently you do, so I'm curious.

> What is the average (mean) width of a circle? Does this question have three or more valid answers?

The question is ill-defined, a crucial piece of information is missing. Exactly the same type of information as in Bertrand's question.

3. I think you might be suffering from an expertise problem. If this were all so simple and obvious (ie the meaning of X : Ω(C) -> R+, which is still not clear), then there would be no requirement for Doctors of Mathematics. We'd just flip open a probability text book to the first page and it would all become clear.

> This sentence settles it, you really don't understand Bertrand paradox at all. Because this is EXACTLY the point of Bertrand paradox. This is exactly why it has been called a "paradox"

I have, at the very least, a surface understanding of the Bertrand paradox. Perhaps there is something more to it that I am missing, but it would appear that it requires training to get to the level of comprehension.

> Do you have a general method to compute a proportion in an infinite set?

It might not be what you want to hear, but I think my concept of granularity does give me that. I've explained it before, I'm not going to waste characters to explain again.

> The question is ill-defined, a crucial piece of information is missing.

So you say. However, I do have a method, using only the information provided. We have a circle, the circle has a radius (R). The circle has a notional length, 2R. The average (mean) width will be the width of a rectangle of length 2R that has the same area as the circle. That makes the average (mean) width pi.R/2. I'm sure you can check this logic and see that my numerical answer is correct, based on the assumption. And guess what ... the average length of chords produced using the 1/2 method is ... pi.R/2. (Out of curiosity, I worked out the average length of chords using the 1/3 method as well. In that case it is 4.R/pi - I don't know why - but perhaps you can tell me and also explain the geometry that leads to such a result.)
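For what it's worth, both averages can be reproduced with a short calculation. This is a sketch under two assumptions that are not spelled out above: that the 1/2 method draws the distance d from the centre to the chord uniformly on [0, R], and that the 1/3 method draws the central angle θ between the two endpoints uniformly on [0, 2π]. The symbols d, θ and the barred ℓ are introduced here for illustration only.

```latex
\begin{align*}
% 1/2 method: chord at distance d from the centre has length 2\sqrt{R^2 - d^2}
\bar{\ell}_{1/2}
  &= \frac{1}{R}\int_0^R 2\sqrt{R^2 - d^2}\,\mathrm{d}d
   = \frac{2}{R}\cdot\frac{\pi R^2}{4}
   = \frac{\pi R}{2},\\[4pt]
% 1/3 method: chord subtending central angle \theta has length 2R\sin(\theta/2)
\bar{\ell}_{1/3}
  &= \frac{1}{2\pi}\int_0^{2\pi} 2R\sin\frac{\theta}{2}\,\mathrm{d}\theta
   = \frac{R}{\pi}\Bigl[-2\cos\frac{\theta}{2}\Bigr]_0^{2\pi}
   = \frac{4R}{\pi}.
\end{align*}
```

The first integral is the area of a half-disc divided by R, which is the same quantity as the "rectangle of length 2R with the same area as the circle" argument. In the second, the length is averaged over equal steps of angle rather than equal steps of distance, so the integrand is a sine rather than a semicircle, which is where the 4R/pi comes from.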

4. > However, I do have a method, using only the information provided.

I'm super sorry, I read your question too fast the first time and I misinterpreted it. So you can forget about the last line of my previous comment.

Now that I understand the question, I have an answer, completely different than yours :

First, you can see that the term "mean width" (https://en.wikipedia.org/wiki/Mean_width) has a precise meaning. Even if you don't understand the wiki page, it's just to prove that mathematicians use that term.
With this definition, the mean width of the circle is 2R. Because the circle is a https://en.wikipedia.org/wiki/Curve_of_constant_width

Now before saying anything, I understand that it was not what you meant when you talked about "mean width". And I understand your reasoning and computations, which seem to be correct.

But this actually gives me a perfect example for one of my arguments :

The meaning of the question was probably very clear in your mind.
The meaning of the question was also very clear in my mind (once I understood that you were not talking of a random circle). Yet it was a different meaning, because unfortunately the term "mean width" has a precise meaning for mathematicians.

Is my answer wrong ? No, my answer is the correct answer for the definition of "mean width" that I (and wikipedia) had in mind.

Is this a mathematical paradox ? Absolutely not. It's just an ambiguity of English/mathematical language. The same word can be interpreted in different ways. If you want to avoid this kind of problem, you have to ask more precise questions (or use the correct words).

Now, does this apply to Bertrand paradox and your countless arguments for the 1/2 method ? Hell Yes ! The ambiguous term is "pick at random".

Apparently, when you read "pick a chord at random" you have a very clear idea of what it means. And with that meaning in mind, the 1/2 answer is probably the correct one.

Unfortunately, the term "pick at random" has a meaning in mathematics, and the precise definition goes through random variables. And with this meaning, the answer depends on the choice of a probability distribution on the set of chords.

That's bad because it means that people that know a little bit of probability will probably have a different answer than yours because they interpret "at random" differently. That does not mean that you or they are wrong, it just means that you are not answering the same question.

Now the question that I invite you to think about is : "How is it possible that mathematicians never invented a general definition of "at random" that would give a unique answer to Bertrand question ?".

And I shall remind you of your own words :

> Perhaps there is something more to it that I am missing, but it would appear that it requires training to get to the level of comprehension.

So train !!

5. I do understand the argument that the mean width of a circle is 2R, although I would personally see that as being the answer to the question "what is the mean maximum width of shape S?" - a question that seems bizarre in terms of a circle, but perhaps is not so odd in terms of an irregular shape (I had to come back to this and point out that I wrote it before I looked at Wikipedia with its irregular shape).

However, this isn't where I was going with this question. I thought about putting something together to clarify what I meant and it seems that I may have to.

6. Ok, I've tried to explain - here - it's not rigorous, by the way, but hopefully you can see what I am getting at.
