You and a stranger are imprisoned and put into separate rooms with no way to communicate with each other. Police officers bargain with you and the stranger separately and simultaneously.
You listen to the bargaining and conclude that you have these options:
– if you confess and the stranger confesses, you both are imprisoned for 3 years
– if you confess and the stranger does not confess, you are imprisoned for 1 year and the stranger for 10 years
– if you deny and the stranger confesses, you are imprisoned for 10 years and the stranger for 1 year
– if you and the stranger both deny, you both are imprisoned for 2 years
You are given the choice to confess or deny the crime. What would you do?
This information can be shown in a payoff matrix (as shown in the image above). From it we can see that the best joint outcome, for both you and the stranger, occurs when you both deny (2 years in prison each). However, your worst-case scenario also occurs when you deny (10 years in prison, if the stranger confesses).
On the other hand, if you confess, you get either the lowest possible sentence (1 year) or 3 years. Compare the options case by case: if the stranger confesses, confessing gets you 3 years instead of 10; if the stranger denies, confessing gets you 1 year instead of 2. Either way, confessing means fewer years in prison for you.
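The comparison above can be checked with a few lines of Python. This is only a sketch: the names `YEARS`, `CHOICES` and `best_reply` are made up for illustration, and the numbers are the sentences listed earlier.

```python
# Years in prison as (you, stranger), keyed by (your_choice, stranger_choice).
# The values are the sentences from the list of options above.
YEARS = {
    ("confess", "confess"): (3, 3),
    ("confess", "deny"): (1, 10),
    ("deny", "confess"): (10, 1),
    ("deny", "deny"): (2, 2),
}
CHOICES = ("confess", "deny")


def best_reply(stranger_choice):
    """Return the choice that minimises YOUR sentence for a given stranger move."""
    return min(CHOICES, key=lambda mine: YEARS[(mine, stranger_choice)][0])


for theirs in CHOICES:
    print(f"If the stranger chooses to {theirs}, your best reply is to {best_reply(theirs)}")
```

Your best reply is to confess in both cases, which is what makes confessing a dominant strategy here.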
Yet this seems somewhat stupid when you both know that the best joint outcome is when you both deny. However, since there is no trust built between you and the stranger, rational thinking leads to the Nash equilibrium state. How? Well, because you choose according to the possible choices of the other party. As you don’t know what the stranger will do, and confessing leaves you better off whichever way they choose, it is always best for you to confess.
(What is the Nash Equilibrium? A stable state in a system that involves several interacting participants, in which no participant can gain by a change of strategy as long as the strategies of all the other participants remain unchanged (Princeton definition).)
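To see the definition at work, here is a small sketch that tests every pair of choices and keeps the ones where neither prisoner can shorten their own sentence by switching alone. The helper name `is_nash` and the table `YEARS` are illustrative assumptions; the numbers come from the options listed earlier.

```python
from itertools import product

# Years in prison as (you, stranger), keyed by (your_choice, stranger_choice).
YEARS = {
    ("confess", "confess"): (3, 3),
    ("confess", "deny"): (1, 10),
    ("deny", "confess"): (10, 1),
    ("deny", "deny"): (2, 2),
}
CHOICES = ("confess", "deny")


def is_nash(mine, theirs):
    """True if neither player gets fewer years by changing only their own choice."""
    my_years, their_years = YEARS[(mine, theirs)]
    i_cannot_improve = all(YEARS[(alt, theirs)][0] >= my_years for alt in CHOICES)
    they_cannot_improve = all(YEARS[(mine, alt)][1] >= their_years for alt in CHOICES)
    return i_cannot_improve and they_cannot_improve


equilibria = [pair for pair in product(CHOICES, repeat=2) if is_nash(*pair)]
print(equilibria)
```

With these sentences the only pair that survives is both prisoners confessing, even though both denying would leave each of them better off.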
Imagine you are doing this an infinite number of times. What is the best strategy to employ? Well, relate it to real life. Start off by cooperating (deny, so that you and the stranger have a chance of getting the optimal outcome), but if they do not cooperate, always confess from then on.
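The strategy just described (cooperate until the other side confesses once, then confess forever) is often called "grim trigger". Below is a rough simulation of it. It runs a finite number of rounds as a stand-in for the infinite game, and the function names `grim_trigger`, `always_confess` and `play` are assumptions made for this sketch.

```python
# Years in prison as (player A, player B), keyed by (A's choice, B's choice).
YEARS = {
    ("confess", "confess"): (3, 3),
    ("confess", "deny"): (1, 10),
    ("deny", "confess"): (10, 1),
    ("deny", "deny"): (2, 2),
}


def grim_trigger(opponent_history):
    """Deny until the opponent has confessed at least once, then always confess."""
    return "confess" if "confess" in opponent_history else "deny"


def always_confess(opponent_history):
    """Confess every round, whatever the opponent has done."""
    return "confess"


def play(strategy_a, strategy_b, rounds):
    """Play repeated rounds and return the total years served by A and by B."""
    moves_a, moves_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        # Each player only sees the other's past moves, not the current one.
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        years_a, years_b = YEARS[(a, b)]
        total_a += years_a
        total_b += years_b
        moves_a.append(a)
        moves_b.append(b)
    return total_a, total_b


print(play(grim_trigger, grim_trigger, 10))
print(play(grim_trigger, always_confess, 10))
print(play(always_confess, always_confess, 10))
```

Over 10 rounds, two grim-trigger players serve 20 years each, while two players who always confess serve 30 years each. Against an opponent who always confesses, grim trigger loses only the first round and then matches them.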
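So, now that we understand the prisoner’s dilemma, let’s try to generalise it. The four situations produce four different payoffs: Temptation (1 year, when you confess and the stranger denies), Reward (2 years, when you both deny), Punishment (3 years, when you both confess), and Sucker’s payoff (10 years, when you deny and the stranger confesses). Since fewer years in prison is the better outcome, T>R>P>S for a Prisoner’s Dilemma.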
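As a quick check of the ordering, the sketch below uses the usual labelling (Temptation for confessing against a denier, Reward for both denying, Punishment for both confessing, Sucker’s payoff for denying against a confessor). It treats each payoff as minus the number of years in prison, so that a bigger number is a better outcome; that sign convention and the function name `is_prisoners_dilemma` are assumptions for illustration.

```python
def is_prisoners_dilemma(t, r, p, s):
    """Check the defining ordering T > R > P > S, where higher payoffs are better."""
    return t > r > p > s


# Payoff = minus the years in prison (assumption: fewer years is a higher payoff).
T = -1   # Temptation: you confess, the stranger denies
R = -2   # Reward: you both deny
P = -3   # Punishment: you both confess
S = -10  # Sucker's payoff: you deny, the stranger confesses

print(is_prisoners_dilemma(T, R, P, S))
```

If mutual denial carried a longer sentence than mutual confession, the ordering would fail and the game would no longer be a Prisoner’s Dilemma.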
Now, you may be wondering what the point of all this is, other than games. In fact, its applications include arms races between countries, political science, evolutionary biology, economics, and philosophy. It is the decisions and strategies in games that, when represented mathematically, allow us to understand decision making.
Ada Knowe 🙂