The Wayback Machine - https://web.archive.org/web/20101029122340/http://barefootbum.blogspot.com/search/label/game%20theory
Showing newest posts with label game theory. Show older posts

Tuesday, October 26, 2010

Game theoretic paradoxes

Arrow's Impossibility Theorem
Liberal Paradox

Intuitively, I don't find these two paradoxes especially surprising: we know from the Prisoner's Dilemma that local game-theoretic decision procedures don't necessarily lead to global Pareto optimality. Which is to say, I suspect, any social system must employ higher-level, abstract elements such as constitutions, contracts, and "altruism" or friendliness (where individuals' preferences substantially and directly positively value other individuals' well-being). Furthermore, since there are no deterministic solutions at any level, these abstract elements must evolve rather than being imposed analytically.

Saturday, July 10, 2010

Kantian moral sense and social evolution

A social "game" such as driving is a Prisoner's Dilemma game: If everyone drives safely, everyone is better off, but if everyone else is driving safely then there is an additional benefit to driving dangerously (and the costs of driving dangerously are externalized to the other drivers). Even if they prefer to drive safely, and even if they drive safely because that's supposedly the "right thing to do" in a Kantian sense, the negative consequences of driving safely while others are driving dangerously with no immediate consequences tend to select against safe driving.

Adding surveillance and enforcement (traffic tickets, governors on cars) changes the driving game to a win-win game: even those who would prefer to drive dangerously drive safely to avoid the immediate negative consequences, which should outweigh the positive benefits (e.g. getting to one's destination more quickly).
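The shift from a Prisoner's Dilemma to a win-win game can be sketched with a toy payoff table. This is only an illustration: the payoff numbers and the `fine` parameter are assumptions chosen to match the description above, not measurements of anything.

```python
SAFE, DANGEROUS = "safe", "dangerous"

# Hypothetical payoffs (higher is better). PAYOFF[(mine, theirs)] is my payoff
# when I play `mine` and the other drivers play `theirs`.
PAYOFF = {
    (SAFE, SAFE): 3,            # everyone is better off
    (SAFE, DANGEROUS): 0,       # I am slowed down and still at risk
    (DANGEROUS, SAFE): 5,       # I exploit everyone else's safe driving
    (DANGEROUS, DANGEROUS): 1,  # everyone is worse off
}

def payoff(mine, theirs, fine=0):
    """My payoff, minus an assumed enforcement penalty if I drive dangerously."""
    base = PAYOFF[(mine, theirs)]
    return base - fine if mine == DANGEROUS else base

def best_reply(theirs, fine=0):
    """The choice that maximizes my payoff given what the others do."""
    return max((SAFE, DANGEROUS), key=lambda mine: payoff(mine, theirs, fine))

def dominant_strategy(fine=0):
    """Return the choice that is best whatever the others do, or None."""
    replies = {best_reply(theirs, fine) for theirs in (SAFE, DANGEROUS)}
    return replies.pop() if len(replies) == 1 else None

print(dominant_strategy(fine=0))  # dangerous
print(dominant_strategy(fine=3))  # safe
```

With these assumed numbers, any fine larger than 2 (the gain from exploiting safe drivers) is enough to make safe driving the best choice regardless of what others do.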

On the one hand, adding surveillance and enforcement would seem to undermine developing a Kantian moral sense, i.e. driving safely because it's the "right thing to do" rather than because it's beneficial, or driving safely because one directly prefers the mutual benefit to an exploitative benefit. Once we implement surveillance and enforcement, we cannot determine if any individual is driving safely for the "right" reasons (duty or mutual benefit) or the "wrong" reasons (avoidance of immediate consequences). In one sense, we simply don't care why anyone is driving safely; we'd rather have people driving safely for the "wrong" reasons than not driving safely at all. But in another sense, a surveillance and enforcement mechanism tends to develop a Kantian moral sense by selection.

Assume that one's preferences about driving exhibit heritable variation. Therefore, there will be three "memes" present in the population with some distribution:
  1. Directly prefers to drive safely (Kantian good)
  2. Directly prefers to drive unsafely, but indirectly prefers to drive safely because of enforcement (selfish good)
  3. Directly prefers to and actually does drive unsafely (bad)

It should be clear that variation in these memes is step-wise: It takes one "variation" to move from 1 to 2, 2 to 3 (or the reverse direction); it takes two "variations" to move from 1 to 3 or 3 to 1.

Without enforcement, there is of course no case 2, and thus there is no direct selection pressure against preferring to drive dangerously. Furthermore, since there is no direct selection pressure against driving dangerously, there will be selection pressure against driving safely: if there are enough people driving dangerously, driving safely just slows you down without reducing the risk of accident or injury. (Driving dangerously externalizes the risk of accident to all the drivers, not just oneself; accident or injury thus does not differentially select against driving dangerously.) Since there's a differential selection pressure against safe driving, we could expect that without enforcement almost everyone will drive dangerously. Anyone who has driven in a country without strict traffic enforcement will immediately empirically confirm this prediction.

Adding enforcement, then, creates an immediate selection pressure against actually driving dangerously (assuming that traffic tickets exert a memetic selection pressure), and one must of course first prefer to drive dangerously to actually drive dangerously. Therefore, the meme for preferring to and actually driving dangerously will be selected against.

Directly preferring to but not actually driving dangerously will not, of course, be selected against. However, this meme will vary in two directions: directly preferring to drive safely, and preferring to and actually driving dangerously. Assuming that variation is uniform (the probabilities of each meme varying to either adjacent meme are the same), then we will end up with an equal number of Kantian and selfish good drivers, with a smaller number of bad drivers.

If the variation is even a little bit non-uniform (i.e. selfish good drivers are more likely to become Kantian good drivers than bad drivers, and Kantian good drivers are less likely to become selfish good drivers) then the prevalence of Kantian good drivers becomes proportionally higher. We can expect non-uniform variation, because internal rational consideration itself acts as a selection mechanism.
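The selection argument can be explored with a small simulation. This is a rough sketch under loud assumptions: discrete generations, a fixed `penalty` that shrinks the share of bad drivers each generation, and made-up variation rates between adjacent memes. None of these numbers come from data, and the function names are hypothetical.

```python
def next_generation(pop, k_to_s, s_to_k, s_to_b, b_to_s, penalty):
    """One round of selection followed by step-wise variation.

    pop is (kantian, selfish_good, bad) as population shares.
    """
    kantian, selfish, bad = pop
    # Selection: enforcement penalizes only those who actually drive dangerously.
    bad = bad * (1.0 - penalty)
    # Variation: each meme can only change into an adjacent meme.
    new_kantian = kantian * (1 - k_to_s) + selfish * s_to_k
    new_selfish = kantian * k_to_s + selfish * (1 - s_to_k - s_to_b) + bad * b_to_s
    new_bad = selfish * s_to_b + bad * (1 - b_to_s)
    total = new_kantian + new_selfish + new_bad
    return (new_kantian / total, new_selfish / total, new_bad / total)

def long_run_shares(k_to_s=0.05, s_to_k=0.05, s_to_b=0.05, b_to_s=0.05,
                    penalty=0.5, generations=5000):
    """Iterate from equal shares until the distribution has settled."""
    pop = (1 / 3, 1 / 3, 1 / 3)
    for _ in range(generations):
        pop = next_generation(pop, k_to_s, s_to_k, s_to_b, b_to_s, penalty)
    return pop

uniform = long_run_shares()                        # uniform variation
biased = long_run_shares(k_to_s=0.03, s_to_k=0.08) # variation favors Kantian
print("uniform:", [round(x, 3) for x in uniform])
print("biased: ", [round(x, 3) for x in biased])
```

In this toy model, enforcement leaves bad drivers as the rarest type, and Kantian drivers are at least as common as selfish good drivers; biasing the variation toward the Kantian meme raises the Kantian share further. The exact split depends entirely on the assumed rates.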

In a non-Prisoner's Dilemma type game, however, selection pressures do not act so obviously.

In win-win games (where there's no tension between mutual and selfish benefit) we cannot distinguish between a Kantian and selfish moral good; everyone will do the right thing, and we can't determine if they do so because it's "the right thing to do" or because it's in their immediate "selfish" benefit. In lose-lose games, a preference for mutual benefit does not apply, and it's difficult to make the case that everyone always losing should ever be a Kantian moral duty.

(Consider, for example, a proposed Kantian moral duty to express one's sexuality only by heterosexual intercourse. The detriment to some homosexual person who suppresses his natural sexuality to conform to this Kantian ideal seems obvious, and not only is there no benefit to me — I don't care how or with whom some stranger has sex — but because I value other people's happiness, this ideal poses a loss to me. The game is lose-lose; the opposite game — have safe, sane and consensual sex in any manner and with anyone you desire — is win-win. One reason I dislike Kantian ethics (even though it can be "fixed up" by reinterpreting a Kantian moral duty as preferring a mutual benefit over an unstable exploitative benefit) is precisely that the discourse of the "right thing to do" independent of any benefit lends some credence to the establishment of lose-lose games as Kantian duties. Indeed if the "right thing to do" really were independent of any sense of benefit, then there must be at least one moral duty beneficial to no one.)

Where we get into some complexity in an evolution/selection analysis is where (as in academic ethics, or professional football) we have a more abstract game which decides between two different zero-sum games. In this case, we must not just select against those who play the "wrong" game, but also not select as strongly against those who play the "right" zero-sum game and lose. Or, better yet, transform the game into one of mutual benefit, or create a more abstract game where the mutual benefits outweigh the losses from the less abstract game.

Surveillance and moral development

Bruce Schneier directs us to Emrys Westacott's Philosophy Now article: Does Surveillance Make Us Morally Better?. Westacott's article displays the usual confusion and problems with a Kantian approach to morality.

Westacott describes his interpretation of Kantian morality:
According to Kant, our actions are right when they conform to the moral rules dictated to us by our reason, and they have moral worth insofar as they are motivated by respect for that moral law. In other words, my actions have moral worth if I do what is right because I want to do the right thing. If I don’t steal someone’s iPod (just another kind of Apple, really) because I think it would be wrong to do so, then I get a moral pat on the back and am entitled to polish my halo. If I don’t steal the iPod because I’m afraid of getting caught, then I may be doing the right thing, and I may be applauded for being prudent, but I shouldn’t be given any moral credit.
On this account, becoming morally better does not entail learning to do the right thing (whatever that might happen to be) but learning to have the correct motivation for doing the right thing.

Westacott gives us two scenarios to compare and contrast the value of surveillance. First, he presents an elaborate scenario escalating surveillance of driving habits, ending with practically indefeasible surveillance and enforcement of traffic laws. He finds the outcome amenable,
At the end of the process, there are no more tearaways or drunk drivers endangering innocent road users. Driving is more relaxing. There are fewer accidents, less pain, less grief, less guilt, reduced demands on the health care system, lower insurance premiums, fewer days lost at work, a surging stock market, [not to mention whiter teeth and relief from the heartbreak of psoriasis] and so on.

Westacott, however, worries that "increased surveillance may carry certain utilitarian benefits, but the price we pay is a diminution of our moral character. ... [S]urveillance... stunts our growth as moral individuals." "We give up pursuing the holy grail of Kant’s ideal, and settle for a functional but uninspiring pewter mug." He realizes, however, that these worries are, at least in this case, probably misguided: The "inconceivability of most kinds of wrongdoing is a platform we want to be able to take for granted, and surveillance is a legitimate and effective means of building it. So, far from undermining the saintly ideal, surveillance offers a fast track to it."

Westacott wants to dig deeper, though, and presents an alternative scenario where the pragmatic outcome is not so clear, comparing two colleges with different responses to academic cheating:
For instance, imagine you are visiting two colleges. At Scrutiny College, the guide proudly points out that each examination room is equipped with several cameras, all linked to a central monitoring station. Electronic jammers can be activated to prevent examinees from using cell phones or Blackberries. The IT department writes its own cutting-edge plagiarism-detection software. And there is zero tolerance for academic dishonesty: one strike and you’re out on your ear. As a result, says the guide, there is less cheating at Scrutiny than on any other campus in the country. Students quickly see that cheating is a mug’s game, and after a while no-one even considers it.

By contrast, Probity College operates on a straightforward honour system. Students sign an integrity pledge at the beginning of each academic year. At Probity, professors commonly assign take-home exams, and leave rooms full of test takers unproctored. Nor does anyone bother with plagiarism-detecting software such as Turnitin.com. The default assumption is that students can be trusted not to cheat.

Which college would you prefer to attend? Which would you recommend to your own kids?
Presumably, Westacott believes we would endorse Probity College. He offers two additional scenarios (surveillance at work and surveillance of one's children) which make the same point: we intuitively believe that there are situations where inculcating the right "moral" attitude is much more important than actually enforcing the correct behavior.

As long-time readers will know, I've discussed some deep problems with this Kantian account of morality.

In what sense is the right thing to do different from the beneficial (at least in some sense of "beneficial") thing to do? If these two accounts really were different, then there must be something that is right without being beneficial in any sense, an attitude I flatly reject on humanist grounds: the humanist good by definition is what is in some sense beneficial to human beings. If the right thing to do is equivalent to some sense of benefit, then how can we determine whether anyone is acting because it's the right thing to do rather than because of the benefit? And even if someone were to do the right thing because it's the right thing to do, there must be some subjective benefit: they are satisfying their desire to do the right thing because it is right.

Of course, a bit of charity can fix some problems in the Kantian view. Specifically, one could interpret the view as deprecating certain kinds of benefits, such as short-term, individualistic benefits and the avoidance of negative consequences, while promoting other kinds of benefits, such as long-term mutual benefit and the emotional benefit of adherence to duty. Neither Kant nor his interpreters are completely stupid (Kant himself made an important contribution to scientific cosmology). The problem is that it's just as much work constructing an exegesis that makes Kant accurate as it would be to construct a more accurate moral philosophy (while still acknowledging Kant's contributions as important groundwork).

Westacott commits an intellectual sin all too common in philosophy: he presents a dichotomy without giving us much of a framework for resolving that dichotomy. Indeed, the only conclusion he draws is "not just that Kant may have a point, but that most of us implicitly recognize this point." But why does academic surveillance intuitively retard our moral development, while traffic surveillance not only fails to retard our "moral development" but actually promotes it? As Hamlet noted,
Assume a virtue, if you have it not.
That monster, custom, who all sense doth eat,
Of habits devil, is angel yet in this,
That to the use of actions fair and good
He likewise gets a frock or livery,
That aptly is put on. Refrain to-night,
And that shall lend a kind of easiness
To the next abstinence: the next more easy;
For use almost can change the stamp of nature,
And either curb the devil, or throw him out
With wondrous potency.
Why should it be good to assume the virtue of safe driving while bad to assume the virtue of academic probity?

One approach is (unsurprisingly) game theory. Safe/Dangerous driving is a true Prisoner's Dilemma/Snowdrift/Chicken game. If everyone drives safely, then everyone is better off: we have laminar traffic flow and a low risk of death or injury from accidents. If everyone drives dangerously, everyone is worse off: we have turbulent traffic flow and a relatively higher risk of death and injury. If everyone else is driving dangerously, there's little benefit to driving safely: other drivers' habits still make traffic turbulent, and one is typically more at risk from other drivers' behavior than from one's own. (One may furthermore even suffer a net loss: time is valuable, and driving safely in a dangerous environment can slow one down considerably.) If everyone else is driving safely, there's an individual benefit to driving dangerously: one can exploit the overall laminar traffic flow to one's own time benefit. Since other drivers' behavior determines risk, the additional risk is mostly externalized to others.

Furthermore, the benefits of everyone driving safely are clear and nearly universal. Even those who would prefer to drive dangerously while everyone else drives safely know that they are better off driving safely than they would be if everyone drove dangerously.

Kant does indeed have at least the beginning of a point: I would approve more of a person who drives safely because they value the mutual benefit of a safe and efficient traffic system than I do of a person who doesn't care about the mutual benefit and merely drives safely to avoid the penalties of enforcement. On the other hand, I have the pragmatic problem of trusting other people to actually drive safely, and convincing them to trust me to drive safely. I'd like to know that people are virtuous, but I can't expect them to be suckers; I can't expect them to allow their feelings of virtue to make them targets of exploitation. I can't trust someone who merely claims, however strenuously, that they do indeed have a Kantian motive; a person who does not have a Kantian motive would certainly lie about having one. Paradoxically, I can effectively persuade people that I have a Kantian motive by supporting a non-Kantian enforcement mechanism: only someone with a Kantian motive has nothing to lose by enforcement, and I'm happy if someone without a Kantian motive "insincerely" supports enforcement. Contrariwise, if there is no enforcement, I won't drive safely even if I do have a Kantian motive (at least in the sense of valuing the mutual benefit) because the mutual benefit will not occur just because I personally drive safely.

Compare and contrast the driving scenario with the academic ethics scenario. From the perspective of the individual students, academic ethics is not a Prisoner's Dilemma/Snowdrift game, it is a zero-sum game. Some students will get A's at the expense of those who get B's, at the expense of those who get C's, D's and F's. Those who get lower grades will have a lower economic, academic and social reward, and those who actually fail will incur sometimes enormous economic expense (pass or fail, tuition is not refunded and one still has to pay back one's student loans). All cheating does is change the distribution of the benefits and expenses, not their overall magnitude. Of course there are larger social benefits to having an honest academia, but these social benefits are cold comfort to a failed student repaying twenty thousand dollars in student loans with a low-wage, low-status job.
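A tiny sketch makes the redistribution point concrete. It assumes, purely for illustration, a strict curve with one reward slot per rank; the student names, scores and reward values are invented, and a fixed total like this is strictly "constant-sum", which is equivalent to zero-sum up to a shift.

```python
# Hypothetical reward for finishing 1st, 2nd, ... 5th on a strict curve.
REWARDS = [4, 3, 2, 1, 0]

def rewards_by_rank(scores):
    """Map each student to a reward according to rank (highest score first)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {student: REWARDS[i] for i, student in enumerate(ranked)}

honest = {"ann": 90, "bob": 80, "cat": 70, "dan": 60, "eve": 50}
cheated = dict(honest, eve=85)  # eve cheats her way to a higher score

before = rewards_by_rank(honest)
after = rewards_by_rank(cheated)

print(sum(before.values()), sum(after.values()))  # 10 10
print(before["eve"], after["eve"])                # 0 3
print(before["bob"], after["bob"])                # 3 2
```

Under these assumptions the cheater's gain is exactly offset by the losses of the students she displaces; the total reward handed out does not change.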

Furthermore, I would speculate that cheating or the desire to cheat is more prevalent at the lower end of the spectrum of academic performance. People who would otherwise fail are tempted to cheat their way to a C; those who would normally get A's and B's would seem less interested in cheating. A and B students seem inclined to go into professions where actual competence matters. A C student who cheats his way to an A would be quickly found out when his competence fails; he is better off in a selfish, non-Kantian sense with an honest C than a dishonest A. Similarly, even if someone who could honestly earn a B cheats his way to an A, he might still fail to develop even B-level competence; his dishonest A will be worse than his honest B. If there's rampant cheating among the D and F students, only the honest C students have a perverse incentive to cheat to get the C they honestly deserve.

The difference between C students and D/F students (and those who don't go to college) is primarily an arbitrary status difference: graduates had the financial and social wherewithal to actually pay for college and spend four years not working. The actual competence gained by just passing (primarily self-discipline and basic literacy) can be more easily and more inexpensively gained and demonstrated in other ways. Likewise, the difference between A and B students is primarily a status difference: both, if honest, are adequately competent, and I doubt (but I might be mistaken) that there is much empirical distinction between the post-academic performance of A and B students.

If the distinction is primarily arbitrary status, not actual competence, then there is little immediate reason to prefer those who gain status by one means or another; being a high status graduate through cheating is just as good (in an immediate sense) as being an "honestly" high status graduate; since status is not correlated with competence, we could just as easily use height or hair color. Insofar as immediate competence matters, academic honesty is self-enforcing (or has external enforcement); we do not need to appeal to a Kantian moral sense just as we do not need to appeal to a Kantian moral sense to not drop bowling balls on our feet.

It's notable that individual performance-simulating cheating in the private workplace is nearly non-existent; "cheating" there is mostly using work time for non-work-related activities. In the workplace, performance is directly measurable; if the job gets done, it's done, and employers are typically unconcerned with how it got done. If you have a task that has already been done elsewhere, it is perfectly acceptable to simply pay for and import the fully-complete task in toto. Likewise, collaboration — usually a big "no-no" in academia — is actively encouraged in the workplace: the point is not to measure the individual's performance, but to get the job done.

Westacott presents honor codes as somehow more Kantian, but this position is suspect. Honor codes do entail some surveillance, and entail consequences if cheating is somehow discovered. Indeed, honor codes simply move some of the surveillance to the student body itself, and rely on the superior students' immediate self-interest in reporting cheating: an A student has nothing to gain by even tolerating others' cheating, much less assisting it. We can no more determine under an honor code than under strict surveillance whether the motivation for compliance is due to a Kantian moral sense or avoidance of the immediate consequences of cheating. We're inculcating a Kantian moral sense (in the sense of doing something by sheer duty) by making the student body responsible not for compliance but implementation.

As I mentioned earlier, there are larger social considerations to inculcating a sense of honesty in not only college students, but also the general population. But simply expecting a Kantian moral sense — in either the sense of preferring mutual benefit or acting from a sense of duty — without taking any direct, immediate steps to physically inculcate that sense seems to rely on magical thinking; when effective, a Kantian moral sense always relies on some method, which might be covert, of direct enforcement.

There's another dimension to the issue specifically of surveillance and enforcement that Westacott's additional examples put in a sharper light.
Or compare two workplaces. At Scrutiny Inc., all computer activity is monitored, with regular random audits to detect and discourage any inappropriate use of company time and equipment, such as playing games, emailing friends, listening to music, or visiting internet sites that cause blood to flow rapidly from the brain to other parts of the body. At Probity Inc., on the other hand, employees are simply trusted to get their work done. Scrutiny Inc. claims to have the lowest rate of time-theft and the highest productivity of any company in its field. But where would you choose to work?

One last example. In the age of cell phones and GPS technology, it is possible for a parent to monitor their child’s whereabouts at all times. They have cogent reasons for doing so. It slightly reduces certain kinds of risk to the teenager, and significantly reduces parental anxiety. It doesn’t scar the youngster’s psyche – after all, they were probably first placed under electronic surveillance in their crib when they were five days old! Most pertinently, it keeps them on the straight and narrow. If they go somewhere other than where they’ve said they’ll go, or if they lie afterwards about where they’ve been, they’ll be found out, and suffer the penalties – like, their cell phone plan will be downgraded from platinum to regular (assuming they have real hard-ass parents). But how many parents really think that this sort of surveillance of their teenage kids is a good idea?
The overriding issue here has nothing whatsoever to do with inculcating any sort of Kantian moral sense. The issue, rather, is whether or not employers and parents can be trusted to use surveillance and enforcement only for a mutual benefit including their employees and children. Neither employees nor children are the slaves of their employers or parents, and excessive surveillance compromises their primary benefits of autonomy and privacy.

It is also not at all clear whether strict surveillance in the workplace, despite claims to the contrary, actually improves productivity, which is difficult to measure directly. I know from direct experience that strict workplace surveillance is often used to reinforce status distinctions and social dominance relations between management and workers; workplaces with strict social hierarchies are not necessarily more productive than those with a more collaborative and equalitarian atmosphere.

Rather than objecting to surveillance because it fails to develop or hinders development of a Kantian moral sense, we object to surveillance in these examples because it's just bad in itself.

We can draw the larger conclusion that when strict surveillance and enforcement of some behavior appears intuitively objectionable, we have not fully understood the game in which the surveillance and enforcement is taking place.

In the case of traffic enforcement, driving safely itself has clear and unambiguous instrumental utility, and the surveillance and enforcement acts predominantly to create a mutually beneficial outcome that would be impossible or unstable without the surveillance and enforcement. The surveillance and enforcement do not act to inculcate a Kantian moral sense, they act rather to protect those who (somehow) develop a Kantian moral sense — in the sense of directly preferring a mutual benefit to exploiting others — and ensure they are not exploited or made suckers.

In the case of academic honesty, among the students there is not a Prisoner's Dilemma situation: academic honesty enforces one particular zero-sum game over another, and the larger social benefit of the "honesty" game does not (under present circumstances) outweigh the direct negative consequences for those students who lose that game. Surveillance and enforcement do not protect those who develop any sort of Kantian moral sense, since honest failures suffer negative consequences just as severe as detected cheaters.

Once we understand what social "game" is being played and how it is being played, we can construct systems of surveillance and enforcement that use immediate self-interest to select against truly undesired outcomes; where the desired outcome is of mutual benefit to all parties, a Kantian moral sense will develop automatically. Where a Kantian moral sense does not develop automatically, either there is no Kantian moral sense to develop (no mutual benefit) or the game has been set up or played irrationally or for covert purposes.

Friday, April 09, 2010

The economics of dueling

The economics of dueling [pdf]:
Recent historical research indicates that ritualistic dueling had a rational basis. Basically, under certain social and economic conditions, individuals must fight in order to maintain their personal credit and social standing. We use a repeated two-player sequential game with random matching to show how the institution of dueling could have functioned as a costly but incentive-compatible means by which individuals could demonstrate their good faith dealings by defending their "honor".
Fundamentally, the authors show how a Prisoner's Dilemma-type game is transformed to a win-win (overall) game through the use of coercion.

While dueling is technically not a "state-imposed" solution, the practice requires larger institutionalized social constructions. Specifically, legal enforcement against ordinary murder must be knowingly (although implicitly) relaxed.

Also, dueling is viable (in a game-theoretic sense) only in fairly restrictive circumstances. Outside the parameters the authors describe, other solutions are more effective and less costly.

(via Bruce Schneier)

Monday, March 02, 2009

Taxation and the Prisoner's Dilemma

Taxation, where the cost of some activity is spread out proportionally instead of by specific use, is a perfect example of a Prisoner's Dilemma situation.

If I and my neighbor are subject to taxation, the "rational" solution for either of us is to have the other pay his taxes and avoid our own; we get the benefit of what the taxes pay for without personally incurring the costs. If my neighbor pays his taxes, I'm better off not paying my own; if he doesn't pay his taxes, I'm still better off not paying my own. On the other hand, we're both better off if we both pay our taxes than if neither of us does. That's a textbook example of a Prisoner's Dilemma in real life.

Hence I talk in Supply-side and demand-side communism about using the coercive power of the state to fulfill people's needs for survival: the government taxes everyone to pay for everyone's basic needs. The coercive power of the state is used not to make an individual do what is not in her best interest, but rather to counteract the Nash equilibrium and ensure for each person that his neighbor is not taking a "free ride" and acting in an exploitative manner. Without coercion to ensure fairness, everyone's "rational" decision would be to not pay taxes, to their mutual detriment.
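The effect of coercion on the equilibrium can be checked by brute force on a two-neighbor toy game. The numbers here (`tax`, `benefit_per_payer`, `penalty`) are illustrative assumptions, and the helper names are hypothetical; the sketch simply enumerates the four outcomes and keeps those where neither neighbor gains by switching.

```python
from itertools import product

PAY, EVADE = "pay", "evade"

def tax_payoffs(me, neighbor, tax=3, benefit_per_payer=2, penalty=0):
    """Return (my payoff, neighbor's payoff) under assumed numbers."""
    payers = [me, neighbor].count(PAY)
    shared = benefit_per_payer * payers  # public benefit enjoyed by both

    def net(choice):
        # Payers bear the tax; evaders bear only the enforcement penalty.
        return shared - (tax if choice == PAY else penalty)

    return net(me), net(neighbor)

def other(choice):
    return EVADE if choice == PAY else PAY

def pure_nash_equilibria(**params):
    """List outcomes where neither neighbor gains by unilaterally switching."""
    equilibria = []
    for me, neighbor in product((PAY, EVADE), repeat=2):
        mine, theirs = tax_payoffs(me, neighbor, **params)
        i_gain = tax_payoffs(other(me), neighbor, **params)[0] > mine
        they_gain = tax_payoffs(me, other(neighbor), **params)[1] > theirs
        if not (i_gain or they_gain):
            equilibria.append((me, neighbor))
    return equilibria

print(pure_nash_equilibria())           # [('evade', 'evade')]
print(pure_nash_equilibria(penalty=2))  # [('pay', 'pay')]
```

With these assumed values, both paying yields (1, 1) and both evading yields (0, 0), yet without a penalty the only equilibrium is mutual evasion. A penalty of 2 moves the equilibrium to mutual payment without making either neighbor worse off than in the unenforced equilibrium.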

Friday, March 28, 2008

Simplicity

In comments, Alonzo Fyfe asserts he can avoid the complications of game theory. I think he can only hand wave around them.

He charges that "PD situations are highly contrived and ignore many real-world facts that are morally relevant..."

Well, yes. The analysis of a simple game is not the analysis of a complex game. Game theory is a rich and varied field with volumes of serious academic work. As a philosopher, though, I'm entitled to look at how the essential features of some simple games illuminate our fundamental understanding of ethics.

Alonzo might as well argue against the "oversimplification" of gravitation. It is certainly the case that just m1*m2 / d^2 ignores many complications in actually describing the motions of planets. But without a sound fundamental understanding of various ideal cases, we can't get very far in making sense of complex phenomena.

The specific cases Alonzo gives, "variable pay-offs, the possibility of anonymous defection, the possibility of deception, and the possibility of affecting desires," are all (except perhaps the last, which I shall address in a moment) easily handled by more complicated game-theoretical analysis. But these more complicated analyses don't change anything about the fundamental way we interpret game theoretic analysis in an ethical sense.

He asks, "What happens if we raise our children so that they simply acquire a desire for cooperation or an aversion to defection?"

That seems like an overly simplistic strategy, leading to a susceptibility to exploitation. More importantly, even if children were infinitely labile (which they're not) it begs the question: why should we raise our children thus? Why is cooperation better than defection?

Should we not give our children a sound theoretical understanding of what's going on, so they can analyze and respond to complicated situations where simplistic strategies will not suffice?

As Alonzo mentioned earlier, we can indeed change desires, to a certain extent. But which desires should we change? Why? How can we justify making those changes? These are all questions that game theory and meta-game theory attempt to answer in a consistent manner.

His next objection,
If we look at your original account from Wednesday's post, and raise children so they assign 2 units of value to cooperation itself, then the value of cooperation increases from 3.3 to 5.3, and exceeds the value of defection. We solve the same problem without any of the complexities of game theory.
is difficult to understand. Alonzo has to use game theory to perform this analysis and reach the conclusion that we should change the game by raising children in a particular way. He has employed game theory, not avoided it.

Real-life examples

John Morales asks for some real-life examples of game theory in ethics.

In many instances where game theory intersects real life, we just play the game according to the (local) Nash equilibrium. The phrase "All's fair in love and war" says that one is free to pursue the strategy that will bring the greatest immediate individual benefit. We construct specifically ethical systems when, for one reason or another, we have to go "outside" the game to achieve what we intuitively feel is the best overall outcome.

For example, I can go into a restaurant, eat a meal, and then be presented with the bill. This is an example of a related game, the asymmetric closed bag exchange. Regardless of whether or not I'm actually served a good meal, I am always "better off" not paying (I get to eat the meal and keep my money). Paying before I eat (like at McDonald's) just changes the asymmetry; whether or not I pay, it's always "better" for the restaurant to not feed me (defect) once I've paid (cooperated).

The Pareto optimum (and usually the global maximum), though, is for the restaurant to serve me a good meal and for me to pay.

In a small community, we can play tit-for-tat. If I don't pay on Monday (he cooperates, I defect), the restaurant won't serve me again until I pay without eating (he "defects", I cooperate). However, a rational person with foresight will simply see the outcome of the repeated iterations. We call this foresight the ethical evaluation that you should pay for your meal.

In a larger community, where there are more non-communicating restaurants than I can eat meals, tit-for-tat doesn't work; I can play as many one-shot games as I like without fear of reprisal. So we make laws which follow from our idealized tit-for-tat strategy (i.e. good laws follow from good ethics).
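
A rough sketch of why community size matters. The numbers are assumptions borrowed from the symmetric Prisoner's Dilemma matrix (3 for a paid meal, 5 for a free one, 1 once nobody will serve me), and the function names are hypothetical. Each restaurant remembers only its own history, and the freeloader never pays.

```python
PAID_MEAL = 3   # I pay, they serve (assumed payoff)
FREE_MEAL = 5   # I skip the bill, they served me (assumed payoff)
SHUT_OUT = 1    # I won't pay, they won't serve (assumed payoff)


def freeloader_total(meals, restaurants):
    """Total payoff for a diner who never pays and tries unburned restaurants first."""
    burned = set()
    total = 0
    for _ in range(meals):
        fresh = next((r for r in range(restaurants) if r not in burned), None)
        if fresh is not None:
            burned.add(fresh)   # this restaurant will refuse me from now on
            total += FREE_MEAL
        else:
            total += SHUT_OUT   # every restaurant has already been stiffed
    return total


def honest_total(meals):
    """Total payoff for a diner who always pays."""
    return PAID_MEAL * meals


print(honest_total(10))          # 30
print(freeloader_total(10, 2))   # 18
print(freeloader_total(10, 10))  # 50
```

With these assumed payoffs, freeloading loses in a two-restaurant town and wins once there are at least as many restaurants as meals, which is the gap that laws are meant to close.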

But we can observe that the law is relatively easy to circumvent: There isn't a police officer standing at the door to every restaurant. Instead, we cultivate in ourselves ethical habits. In this case, the thinking is one level more abstract: If too many people in general were to eat without paying, no one (myself included) could eat at restaurants, so we police ourselves.

There are other examples. I can work hard (cooperate) or slack off and just look busy (defect); my company can give me a raise next year (cooperate) or stiff me (defect). As an exercise, use game theory to relate the Communist slogan, "From each according to his ability, to each according to his need," with the cynical Soviet observation, "we pretend to work, and they pretend to pay us."

A lot of human behavior can be modeled just by reducing it to pure game theory and locally rational choice. But as the Prisoner's Dilemma shows, some situations are not so easy to reduce, even in theory. It is precisely those Prisoner's Dilemma and similar games which cause us to go outside the game and create ethics and laws.

Thursday, March 27, 2008

Ethics and game theory

Atheist ethicist Alonzo Fyfe weighs in with some objections to my earlier post on the Prisoner's Dilemma.
I am afraid that I do not find the prisoner's dilemma to make ethics interesting. In fact, I seldom find its relevance to ethics.
I spoke somewhat loosely in my earlier post. To be more precise, game theory in general is what makes ethics interesting, specifically those elements of game theory where Pareto optima and Nash Equilibria conflict or are undefined or ambiguous. The Prisoner's Dilemma is the best known of these sorts of games, and we can generalize the analysis of the Prisoner's Dilemma to related games, such as the Stag Hunt.

I'm not, of course, the only person interested in the connection between ethics and game theory. From the Wikipedia article:
While it is sometimes thought that morality must involve the constraint of self-interest, David Gauthier famously argues that co-operating in the prisoners dilemma on moral principles is consistent with self-interest and the axioms of game theory. It is most prudent to give up straightforward maximizing and instead adopt a disposition of constrained maximization, according to which one resolves to cooperate with all similarly disposed persons and defect on the rest. In other words, moral constraints are justified because they make us all better off, in terms of our preferences (whatever they may be). This form of contractarianism claims that good moral thinking is just an elevated and subtly strategic version of plain old means-end reasoning. Those that defect can be predicted because people are not completely opaque.

Douglas Hofstadter expresses a strong personal belief that the mathematical symmetry is reinforced by a moral symmetry, along the lines of the Kantian categorical imperative: defecting in the hope that the other player cooperates is morally indefensible. If players treat each other as they would treat themselves, then off-diagonal results cannot occur.
I find myself in almost complete agreement with this point of view.

Game theory takes the magic out of ethics. Instead of talking about some mysterious, mystical ethical realism, a game-theoretic ethics makes ethics about something physical — or at least no more unphysical than consciousness in general. Ethics is about the mind directly, about our subjective preferences and desires and about the complex emergent properties of the subjective preferences interacting in a society. Even better, games such as the Prisoner's Dilemma show us that a game-theoretic understanding is not an over-simplification of our ethical intuitions.

Alonzo complains that the Prisoner's Dilemma is "contrived".
It is a highly contrived situation that some skillful interrogators may put into practice to extract confessions from the accused [I'm curious if Alonzo actually read the Wikipedia article], but it does not describe a real-world situation.
But it is not so much contrived as it is stripped to its essence. In a very simple game, which can be described precisely in just a few sentences, we find profound emergent properties that force us to reconsider our notions of rationality itself. Any philosopher interested in something other than bloviation and obfuscation should, I think, admire the simplicity, clarity and depth of the Prisoner's Dilemma.

Wednesday, March 26, 2008

The Prisoner's Dilemma

The Prisoner's Dilemma is what makes ethics interesting. It's an apparent paradox. Given some game with the payoff matrix:
                Cooperate    Defect
    Cooperate   3,3          0,5
    Defect      5,0          1,1

A rational person would prefer the Cooperate/Cooperate outcome [Pareto optimum] (and gain 3) to the Defect/Defect outcome (and gain 1). However, it is the dominant strategy to defect [Nash Equilibrium]: for either of my opponent's strategies, defection always has the higher payoff to me. If my opponent defects, then I win 1 if I defect instead of 0 if I cooperate; if my opponent cooperates then I win 5 if I defect instead of 3 if I cooperate. Of course, defection is my opponent's dominant strategy as well. So on one analysis, Cooperate/Cooperate is the preferred outcome; on another analysis, Defect/Defect is the preferred outcome.
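
The two analyses can be checked mechanically. This is a minimal sketch, assuming each cell of the matrix lists the row player's payoff first; the function names are my own and not from any library.

```python
# Payoffs as (row player, column player), copied from the matrix above.
C, D = "Cooperate", "Defect"
MOVES = (C, D)
PAYOFFS = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def dominant_move():
    """Return the row player's strictly dominant move, or None if there is none."""
    for mine in MOVES:
        rivals = [m for m in MOVES if m != mine]
        if all(PAYOFFS[(mine, theirs)][0] > PAYOFFS[(other, theirs)][0]
               for other in rivals for theirs in MOVES):
            return mine
    return None


def nash_equilibria():
    """Outcomes where neither player gains by switching moves unilaterally."""
    found = []
    for row in MOVES:
        for col in MOVES:
            row_stays = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(alt, col)][0] for alt in MOVES)
            col_stays = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, alt)][1] for alt in MOVES)
            if row_stays and col_stays:
                found.append((row, col))
    return found


def pareto_dominates(a, b):
    """True if outcome a is at least as good as b for both players and better for one."""
    pa, pb = PAYOFFS[a], PAYOFFS[b]
    return all(x >= y for x, y in zip(pa, pb)) and any(x > y for x, y in zip(pa, pb))


print(dominant_move())                   # Defect
print(nash_equilibria())                 # [('Defect', 'Defect')]
print(pareto_dominates((C, C), (D, D)))  # True
```

So the only equilibrium is Defect/Defect, even though both players do strictly better at Cooperate/Cooperate.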

There are two ways to resolve the Prisoner's Dilemma for the Pareto optimum. The first is to iterate the game an indefinite number of times and play "tit-for-tat". But it's not always possible to iterate a game indefinitely.
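
Here is a small sketch of the iterated game with the same payoffs. The fixed round count is only for illustration (the argument above depends on the number of rounds being indefinite), and the strategy functions are hypothetical names of my own.

```python
# Payoffs as (first player, second player), using the matrix above.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect regardless of what the opponent has done."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return the two total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        gain_a, gain_b = PAYOFFS[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30)
print(play(always_defect, tit_for_tat, 10))  # (14, 9)
```

The defector wins the first round and then collects only the mutual-defection payoff, ending up well behind two players who cooperate throughout.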

The second is to change the game by changing the payoff matrix, usually by threatening external punishment for defection. For example, if the townspeople agree to get together and beat either of us senseless if either cheats in a closed bag exchange, then we've changed the payoff for defection from 0,5 to 0,-1000, making cooperation the dominant strategy, and making one choice unambiguously rational (the Pareto optimum is also the Nash Equilibrium).
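
A sketch of that change, assuming the punishment is a flat penalty subtracted from anyone who defects (including both players in the Defect/Defect cell, which the example above doesn't spell out). PENALTY is sized so that the lone defector's 5 becomes -1000.

```python
# Payoffs as (row player, column player), using the original matrix.
BASE = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
PENALTY = 1005  # assumption: sized so the lone defector's 5 becomes -1000


def with_punishment(payoffs, penalty):
    """Subtract the penalty from the payoff of every player who defects."""
    return {
        (row, col): (pr - penalty * (row == "D"), pc - penalty * (col == "D"))
        for (row, col), (pr, pc) in payoffs.items()
    }


def dominant_move(payoffs):
    """Return the row player's strictly dominant move, or None."""
    for mine, other in (("C", "D"), ("D", "C")):
        if all(payoffs[(mine, t)][0] > payoffs[(other, t)][0] for t in ("C", "D")):
            return mine
    return None


punished = with_punishment(BASE, PENALTY)
print(punished[("C", "D")])     # (0, -1000)
print(dominant_move(BASE))      # D
print(dominant_move(punished))  # C
```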

However, if you can change the game in one way, you can change it in other ways. We can just as easily change the game so that it's asymmetric, making it for instance always rational for brown-eyed people to cooperate and blue-eyed people to defect by punishing only brown-eyed people for defecting. Changing the low-level game just makes another Prisoner's Dilemma at a more abstract level: Cooperate becomes "make the game symmetric" (or make good laws/obey the law); "Defect" becomes "make the game asymmetric" (or make bad laws/disobey the law); the Nash Equilibrium is for the authority to make bad laws and the subjects to disobey them.

For this reason, purely authoritarian approaches to solving the Prisoner's Dilemma always fail. The authority will try to change the game for its exclusive benefit, sooner or later depending on how rational and intelligent the authority is; the subjects will then resist the authority. Authoritarian solutions to the Prisoner's Dilemma are dynamically unstable.

To a certain extent, religion can coerce the Cooperate/Cooperate outcome by changing the game: God will punish you for defecting and/or reward you for cooperating (and perhaps reward you more if your opponent defects). This is the sense I spoke of earlier of how religion could have an overall beneficial effect. The effect is weak; religion is a dynamically unstable authoritarian strategy.

(Pseudo-authoritarian solutions are possible, but only where you have competing authorities, who themselves play tit-for-tat. However, within-authority conflicts will always revert to the Nash Equilibrium.)

Democracy is a clever way of solving the Prisoner's Dilemma in a dynamically stable way. The people and the government play an abstract-level game of tit-for-tat (we need to make sure that only one abstract-level authority/submission game is iterated) and can thus implement any number of concrete-level Pareto optima where direct iteration is impractical. Of course, like any system in dynamic equilibrium, democracy is susceptible to sufficiently large "random" forces to push it to a state where the equilibrium cannot be maintained. If the government becomes too powerful, the people cannot "punish" the government sufficiently and the government can "defect" with impunity; likewise a too-weak government invites the people to "defect" into anarchy and chaos.