In Part 1, I mentioned an interesting article – "Generosity leads to evolutionary success" – which is largely based on a paper by Alexander
Stewart and Joshua Plotkin, "From extortion to generosity, the evolution of zero-determinant strategies in the prisoner’s dilemma". The Stewart&Plotkin paper was, again largely,
a response to what strikes me as a somewhat more technical paper by
William Press and Freeman Dyson, “Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent” (this
latter paper certainly seems less accessible to a lay reader such as myself,
irrespective of how technical it is).
-----------------------------
Both
Press&Dyson and Stewart&Plotkin make reference to evolution, but when
they do so, they mean different things.
Press&Dyson
are talking about the evolution of a strategy by a single “evolutionary
opponent”, such that this opponent will move towards a strategy that maximises
their score. This can be equated to a
situation in which I as a stock trader might modify my buy and sell strategies
over a period of time until I consistently get the best possible income. Every now and then, I could use a slightly
different combination of decision parameters for a couple of sessions and
compare the outcome against previous sessions. If I do better with the new
combination of decision parameters, I can make this combination my new
strategy, but if I do worse, I can keep my original strategy. I’m not evolving, but my strategy is. I would be an “evolutionary trader”, rather
than an “evolving trader”.
Stewart&Plotkin,
however, are talking about the evolution of a population as one strategy
prevails over another. This would be as
if a group of traders were able to see how others in the group did with their
strategies and were willing to adopt the more successful strategies in place of
their own, or if traders with more successful strategies could employ more new
junior traders who would use the same successful strategies and later employ
yet more juniors. Over time, the
dominant strategies (most populous) would be those that are most successful.
I
did some very simple modelling of this latter approach and discovered something
that I found rather interesting.
Press&Dyson
provided a “concrete example” of extortion in which the extortionist (E) responds
to cooperation and defection on the part of their opponent (G) in a
probabilistic way (as discussed in Part 1, p1=11/13,
p2=1/2, p3=7/26 and p4=0).
To
maximise her score against an extortionate strategy like this, the G strategy
player must always cooperate (so q1=q2=q3=q4=1). Therefore, whenever G and E strategy players face
off against each other, the E wins.
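The size of E's advantage can be worked out exactly. The following is a minimal sketch, assuming the conventional payoff values T=5, R=3, P=1 and S=0 (they are not stated in this post) and assuming the four probabilities are ordered by previous outcome as cc, cd, dc, dd with E's own move first. The function name is purely illustrative.

```python
from fractions import Fraction as F

# Press & Dyson's example: probability that E cooperates after each
# previous outcome (E's own move first): cc, cd, dc, dd.
P_E = {"cc": F(11, 13), "cd": F(1, 2), "dc": F(7, 26), "dd": F(0)}

# ASSUMED conventional payoffs: temptation, reward, punishment, sucker.
T, R, P, S = 5, 3, 1, 0


def long_run_scores_vs_always_cooperate(p):
    """Long-run average scores (E, G) when G cooperates every round."""
    # G always cooperates, so only the outcomes "cc" and "dc" can occur.
    stay_cc = p["cc"]    # E cooperates again after mutual cooperation
    return_cc = p["dc"]  # E cooperates after having defected on G
    # Stationary share of mutual-cooperation rounds solves
    #   pi = pi * stay_cc + (1 - pi) * return_cc
    pi_cc = return_cc / (1 - stay_cc + return_cc)
    pi_dc = 1 - pi_cc
    score_e = pi_cc * R + pi_dc * T
    score_g = pi_cc * R + pi_dc * S
    return score_e, score_g


score_e, score_g = long_run_scores_vs_always_cooperate(P_E)
print(score_e, score_g)  # 41/11 21/11
```

Under these assumptions E averages about 3.73 per round and G about 1.91. E's surplus over the mutual-defection payoff (30/11) is exactly three times G's (10/11), which is the sense in which E "wins" even though G is doing the best she can.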
However,
in a mixed population in which a player might play against either a G strategy
player or an E, what Stewart&Plotkin found is that when a G strategy player
meets another G (assuming they retain their always-cooperate strategy), they’ll
reap sufficient rewards from mutual cooperation to offset the losses that
follow from playing against an occasional E strategy player. When an E strategy player meets another
E, by contrast, they are quickly locked into mutual defection and obtain a low score. Therefore, to get a good score, an E strategy
player needs to meet a G strategy player, while a G strategy player gains no
benefit from an E strategy player and benefits only from meeting another G
strategy player. For this reason, a
population of mostly E strategy players becomes self-limiting, while a
population of mostly G strategy players will grow.
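The three possible pairings can be compared with a short calculation. This is only a sketch under stated assumptions: payoffs of 5, 3, 1 and 0 (temptation, reward, punishment, sucker), both players cooperating in the first round, and the E probabilities ordered as cc, cd, dc, dd from each player's own point of view. None of those details comes from the papers as quoted here, and the function name is illustrative.

```python
from fractions import Fraction as F
from itertools import product

E = (F(11, 13), F(1, 2), F(7, 26), F(0))  # cooperate prob. after cc, cd, dc, dd
G = (F(1), F(1), F(1), F(1))              # always cooperate

# ASSUMED payoffs to a player for (own move, other's move).
PAYOFF = {("c", "c"): 3, ("c", "d"): 0, ("d", "c"): 5, ("d", "d"): 1}
INDEX = {("c", "c"): 0, ("c", "d"): 1, ("d", "c"): 2, ("d", "d"): 3}


def average_scores(a, b, rounds):
    """Exact expected per-round scores for strategies a and b.

    ASSUMES both players cooperate in the first round.
    """
    dist = {("c", "c"): F(1)}  # probability of each (a's move, b's move)
    total_a = total_b = F(0)
    for _ in range(rounds):
        # Score the current round.
        for (x, y), prob in dist.items():
            total_a += prob * PAYOFF[(x, y)]
            total_b += prob * PAYOFF[(y, x)]
        # Work out the distribution of outcomes in the next round.
        nxt = {}
        for (x, y), prob in dist.items():
            pa = a[INDEX[(x, y)]]  # a sees (own move, other's move)
            pb = b[INDEX[(y, x)]]  # b sees the mirrored outcome
            for nx, ny in product("cd", repeat=2):
                step = (pa if nx == "c" else 1 - pa) * (pb if ny == "c" else 1 - pb)
                if step:
                    nxt[(nx, ny)] = nxt.get((nx, ny), F(0)) + prob * step
        dist = nxt
    return total_a / rounds, total_b / rounds


for n in (3, 300):
    for name, (a, b) in {"G v G": (G, G), "E v G": (E, G), "E v E": (E, E)}.items():
        sa, sb = average_scores(a, b, n)
        print(f"{n:>3} rounds, {name}: {float(sa):.2f} / {float(sb):.2f}")
```

Two G players always score 3 each. Two E players slide into mutual defection, so over a long match their average falls towards the punishment payoff of 1, while over a very short match they have not yet had time to lock in and so are barely penalised.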
The
population of E strategy players will tend to grow well only at the border with
the population of G strategy players, which is part of why smaller populations
favour extortion. Smaller things have proportionally greater surfaces than
larger things, because the surface to volume ratio decreases as size increases
(this is why large creatures struggle in hot climates and
small creatures tend to struggle in cold ones).
I modelled
this in a spreadsheet (using the strategy adoption approach) and found that
generous strategies did indeed expand at the expense of extortionate
strategies, so long as a few provisos were met:
· the population had to be largish – I used N=400
· each pairing had to run more than two iterations of the Prisoners’ Dilemma (PD)
· there had to be some inclination to change strategy
This
last proviso might seem a bit strange, because it seems obvious that, to evolve,
a population must have at least a little inclination to change. What I found, though, was that the magnitude of
the inclination to change had a huge effect, not so much on the outcome as on
how rapidly the outcome manifested and to what extent. An inclination to change of 5% with three
iterations of the PD left approximately 5% of the players using the extortion
strategy after 50 rounds. With 300
iterations of the PD and the same 5% inclination to change, there were on
average slightly fewer players using the extortion strategy. Even with 3x10^30 iterations, with all else
held constant, there were still just under 5% of players being extortionate.
Make
that an inclination to change of 10% with 300 iterations and after about 50
rounds almost all extortionate players are gone. Increase the inclination to change to 20% and
the extortion strategy players are gone after 30 rounds.
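For anyone who wants to experiment, here is a rough sketch of a strategy-adoption model of this general kind. It is not a reproduction of the spreadsheet, so its numbers need not match those quoted above. Everything specific in it is an assumption: random pairing each round, payoffs of 5, 3, 1 and 0, both players opening with cooperation, a 50/50 starting mix, and an adoption rule in which a player who is inclined to change copies a randomly chosen player who scored higher. The names `play`, `simulate`, `inclination` and `start_extortion` are hypothetical.

```python
import random

E = (11 / 13, 1 / 2, 7 / 26, 0.0)  # cooperate prob. after cc, cd, dc, dd
G = (1.0, 1.0, 1.0, 1.0)           # always cooperate

# ASSUMED payoffs for (own move, other's move); 1 = cooperate, 0 = defect.
PAYOFF = {(1, 1): 3, (1, 0): 0, (0, 1): 5, (0, 0): 1}
STATE = {(1, 1): 0, (1, 0): 1, (0, 1): 2, (0, 0): 3}


def play(a, b, iterations, rng):
    """Total scores for one pairing; both open by cooperating (assumed)."""
    x = y = 1
    score_a = score_b = 0
    for _ in range(iterations):
        score_a += PAYOFF[(x, y)]
        score_b += PAYOFF[(y, x)]
        # Each player reacts to the previous outcome as seen from its own side.
        nx = 1 if rng.random() < a[STATE[(x, y)]] else 0
        ny = 1 if rng.random() < b[STATE[(y, x)]] else 0
        x, y = nx, ny
    return score_a, score_b


def simulate(n=400, iterations=300, inclination=0.10, rounds=50,
             start_extortion=0.5, seed=1):
    """Return the number of E players after each round."""
    rng = random.Random(seed)
    n_e = int(n * start_extortion)
    pop = [E] * n_e + [G] * (n - n_e)
    history = []
    for _ in range(rounds):
        rng.shuffle(pop)  # random pairing: neighbours in the list play
        scores = [0] * n
        for i in range(0, n - 1, 2):
            scores[i], scores[i + 1] = play(pop[i], pop[i + 1], iterations, rng)
        new_pop = list(pop)
        for i in range(n):
            # With probability `inclination`, look at one random player
            # and copy their strategy if they scored higher this round.
            if rng.random() < inclination:
                j = rng.randrange(n)
                if scores[j] > scores[i]:
                    new_pop[i] = pop[j]
        pop = new_pop
        history.append(sum(1 for s in pop if s == E))
    return history


# Short matches keep this demonstration quick; try iterations=300 as well.
print(simulate(iterations=3, inclination=0.05)[-1])
```

The adoption rule is the part most likely to differ from any particular spreadsheet, and the outcome is sensitive to it, so treat the output as a starting point for experiments rather than as confirmation of the figures above.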
I’d
interpret an overly high inclination to change as being bad, since it will wipe
out variation in strategies where variation might be necessary to adapt to
future environments which don’t favour a currently successful strategy.
The
second proviso, about how many iterations of the PD are required for the
generous strategy to prevail, corresponds well with the idea that our morality
breaks down when times are bad. In other
words, when things are peaceful and stable, then each member of a society will
interact repeatedly with various other members of the society. These repeated interactions mean that
generous win-win exchanges will predominate.
But when each interaction may well be the last, there are no iterations
to speak of, so extortion will be fostered.
I
was also interested in latent tendencies, by which I meant an inclination to be
extortionate or generous. Unfortunately,
I don’t have the time or resources to properly model it, but I suspect that if
members of a society have a certain inclination towards extortion strategies,
those strategies will tend to predominate given occasional “bottlenecks” or periods of “bad
times”. As discussed in Part 1, the feeling that things are bad can lead people to act
less generously. Some people, when times
seem bad, will flip right over from “less generous” to “outright extortionate”
and in general people will have a certain amount of tolerance with respect to the difficulty of
the times. Those who have little
tolerance will become extortionate when it is not appropriate (and thus become criminals) while those with too much tolerance will suffer at the hands
of others when things really are tough (or at the hands of criminals even in reasonably good times).
An
interesting question is how we could use this insight to better our societies. Personally I think we need to do at least two
things. Firstly, we need to do what we
are pretty sure will promote generosity – by encouraging people to see that we
are in a time of plenty and that we are part of a large, inclusive
population. Secondly, we need to
dissuade extortionate behaviour – by making the cost of not cooperating, of not
being generous, high enough to make cooperation the best option even in one-off
interactions.
Doing
this without damaging ourselves in the process is the difficult part.