In Part 1, I mentioned an interesting article – "Generosity leads to evolutionary success" – which is largely based on a paper by Alexander Stewart and Joshua Plotkin, "From extortion to generosity, the evolution of zero-determinant strategies in the prisoner’s dilemma". The Stewart&Plotkin paper was, again largely, a response to what strikes me as a somewhat more technical paper by William Press and Freeman Dyson, "Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent" (this latter paper certainly seems less accessible to a lay reader such as myself, irrespective of how technical it is).
Both Press&Dyson and Stewart&Plotkin make reference to evolution, but when they do so, they mean different things.
Press&Dyson are talking about the evolution of a strategy by a single “evolutionary opponent”, such that this opponent will move towards a strategy that maximises their score. This can be equated to a situation in which I as a stock trader might modify my buy and sell strategies over a period of time until I consistently get the best possible income. Every now and then, I could use a slightly different combination of decision parameters for a couple of sessions and compare the outcome against previous sessions. If I do better with the new combination of decision parameters, I can make this combination my new strategy, but if I do worse, I can keep my original strategy. I’m not evolving, but my strategy is. I would be an “evolutionary trader”, rather than an “evolving trader”.
Stewart&Plotkin, however, are talking about the evolution of a population as one strategy prevails over another. This would be as if a group of traders were able to see how others in the group did with their strategies and were willing to adopt the more successful strategies in place of their own, or if traders with more successful strategies could employ more new junior traders who would use the same successful strategies and later employ yet more juniors. Over time, the dominant (most populous) strategies would be those that are most successful.
I did some very simple modelling of this latter approach and discovered something that I found rather interesting.
Press&Dyson provided a “concrete example” of extortion in which the extortionist (E) responds to cooperation and defection on the part of their opponent (G) in a probabilistic way (as discussed in Part 1, p1=11/13, p2=1/2, p3=7/26 and p4=0).
To maximise her score against an extortionate strategy like this, the G strategy player must always cooperate (so q1=q2=q3=q4=1). Therefore, whenever a G strategy player and an E strategy player face off against each other, the E strategy player comes out ahead.
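As a rough check on that head-to-head result, here is a minimal sketch of the long-run average scores when E plays an always-cooperating G. It assumes the conventional payoff values T=5, R=3, P=1 and S=0, which are not stated in this post, and the function and variable names are mine. Because G never defects, only two outcomes ever occur (both cooperate, or E defects while G cooperates), so only p1 and p3 matter.

```python
from fractions import Fraction

# Assumed payoffs (not given in this post): the conventional values
# T=5 (temptation), R=3 (reward), P=1 (punishment), S=0 (sucker's payoff).
T, R, P, S = 5, 3, 1, 0

# E's probability of cooperating after CC and after DC (E's move listed first).
# p2 and p4 never come into play because G never defects.
p1 = Fraction(11, 13)
p3 = Fraction(7, 26)


def long_run_scores(p1, p3):
    """Average per-round scores (E, G) when G always cooperates."""
    leave_cc = 1 - p1   # chance of moving from CC to DC
    leave_dc = p3       # chance of moving from DC back to CC
    # Long-run share of rounds spent in each state of a two-state chain.
    share_cc = leave_dc / (leave_cc + leave_dc)
    share_dc = 1 - share_cc
    e_score = R * share_cc + T * share_dc
    g_score = R * share_cc + S * share_dc
    return e_score, g_score


e_score, g_score = long_run_scores(p1, p3)
print(e_score, g_score)  # 41/11 21/11
```

Under those assumed payoffs, E averages about 3.73 per round and G about 1.91, so E's surplus over the mutual-defection payoff of 1 is three times G's.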
However, in a mixed population in which a player might play against either a G strategy player or an E, what Stewart&Plotkin found is that when a G strategy player meets another G (assuming they retain their always cooperate strategy), they’ll reap sufficient rewards from mutual cooperation to offset the losses that follow from playing against an occasional E strategy player. When an E strategy player meets another E, on the other hand, they are quickly locked into mutual defection and obtain a low score. Therefore, to get a good score, an E strategy player needs to meet a G strategy player, while a G strategy player gains no benefit from an E strategy player and benefits only from meeting another G strategy player. For this reason, a population of mostly E strategy players becomes self-limiting, while a population of mostly G strategy players will grow.
The population of E strategy players will tend to only grow well at the border with the population of G strategy players, which is part of why smaller populations favour extortion – smaller things have proportionally greater surfaces than larger things, because the surface to volume ratio decreases as size increases (this is also why large creatures struggle to shed heat in hot climates and small creatures struggle to stay warm in cold climates).
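The three kinds of pairing described above can be compared with a short sketch that tracks the probability of each outcome round by round. It rests on assumptions of mine that the post does not state: the conventional payoffs T=5, R=3, P=1, S=0, and both players cooperating on the first move. The function name is hypothetical.

```python
# Assumed payoffs (not given in this post): conventional T, R, P, S values.
T, R, P, S = 5.0, 3.0, 1.0, 0.0

# Probability of cooperating after CC, CD, DC, DD (own move listed first).
E = (11 / 13, 1 / 2, 7 / 26, 0.0)   # the extortion example from Press&Dyson
G = (1.0, 1.0, 1.0, 1.0)            # always cooperate


def average_scores(x, y, iterations):
    """Expected per-round scores (x's, y's) over one game of the given length.

    States are (x's move, y's move) in the order CC, CD, DC, DD.
    Assumption: both players open by cooperating.
    """
    # y sees each state with the moves swapped, so CD and DC trade places.
    y_view = (y[0], y[2], y[1], y[3])
    x_pay = (R, S, T, P)
    y_pay = (R, T, S, P)
    dist = [1.0, 0.0, 0.0, 0.0]     # first round is CC by assumption
    x_total = y_total = 0.0
    for _ in range(iterations):
        x_total += sum(d * v for d, v in zip(dist, x_pay))
        y_total += sum(d * v for d, v in zip(dist, y_pay))
        nxt = [0.0, 0.0, 0.0, 0.0]
        for state, d in enumerate(dist):
            cx, cy = x[state], y_view[state]
            nxt[0] += d * cx * cy
            nxt[1] += d * cx * (1 - cy)
            nxt[2] += d * (1 - cx) * cy
            nxt[3] += d * (1 - cx) * (1 - cy)
        dist = nxt
    return x_total / iterations, y_total / iterations


for n in (3, 30, 300):
    print(n, average_scores(G, G, n), average_scores(E, G, n),
          average_scores(E, E, n))
```

Two G strategy players score 3 each in every round, whatever the game length. Because p4=0, mutual defection is a trap that two E strategy players cannot leave once they fall into it, so their average drifts down towards 1 as the games get longer.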
I modelled this in a spreadsheet (using the strategy adoption approach) and found that generous strategies did indeed expand at the expense of extortionate strategies, so long as a few provisos were met:
· the population had to be largish – I used N=400
· each pairing had to run more than two iterations of the Prisoner’s Dilemma (PD)
· there had to be some inclination to change strategy
This last proviso might seem a bit strange, because it seems obvious that, to evolve, a population must have at least a little inclination to change. What I found, though, was that the magnitude of the inclination to change had a huge effect, not so much on the outcome as on how rapidly the outcome manifested and to what extent. An inclination to change of 5% with three iterations of the PD left approximately 5% of the players using the extortion strategy after 50 rounds. With 300 iterations of the PD, with an inclination to change of 5%, there were on average slightly fewer players using the extortion strategy. Even with 3x10^30 iterations, with all else held constant, there were still just under 5% of players being extortionate.
Make that an inclination to change of 10% with 300 iterations and after about 50 rounds almost all extortionate players are gone. Increase the inclination to change to 20% and the extortion strategy players are gone after 30 rounds.
I’d interpret an overly high inclination to change as being bad, since it will wipe out variation in strategies where variation might be necessary to adapt to future environments which don’t favour a currently successful strategy.
The second proviso, about how many iterations of the PD are required for the generous strategy to prevail, corresponds well with the idea that our morality breaks down when times are bad. In other words, when things are peaceful and stable, each member of a society will interact repeatedly with various other members of the society. These repeated interactions mean that generous win-win exchanges will predominate. But when each interaction may well be the last, there are no iterations to speak of, so extortion will be fostered.
I was also interested in latent tendencies, by which I meant an inclination to be extortionate or generous. Unfortunately, I don’t have the time or resources to properly model it, but I suspect that if members of a society have a certain inclination towards extortion strategies, those strategies will tend to predominate given occasional “bottlenecks” or periods of “bad times”. As discussed in Part 1, the feeling that things are bad can lead people to act less generously. Some people, when times seem bad, will flip right over from “less generous” to “outright extortionate”, and in general people will have a certain amount of tolerance with respect to the difficulty of the times. Those who have little tolerance will become extortionate when it is not appropriate (and thus become criminals), while those with too much tolerance will suffer at the hands of others when things really are tough (or at the hands of criminals even in reasonably good times).
An interesting question is how we could use this insight to better our societies. Personally I think we need to do at least two things. Firstly, we need to do what we are pretty sure will promote generosity – by encouraging people to see that we are in a time of plenty and that we are part of a large, inclusive population. Secondly, we need to dissuade extortionate behaviour – by making the cost of not cooperating, of not being generous, high enough to make cooperation the best option even in one-off interactions.
Doing this without damaging ourselves in the process is the difficult part.