by Gerald Gaus
Notice that Hong and Page’s conditions (i) and (ii) are essentially the conditions that, on other grounds, I have argued are presupposed by ideal theory: the landscape is not too smooth and not too rugged (§II.2.4). Too smooth and the optimization problem can be solved by simply “climbing” the gradient; Sen’s climbing model is perfectly adequate to this task. Too rugged, and we are all “dumb.” Given this convergence of the conditions of the proof with the underlying problem ideal theory presupposes, the proof appears manifestly apropos to the modeling of ideal theory.
(iii) The third condition supposes that in the relevant group of problem solvers, the only element of the domain that is an optimum for each and every member of the group is the global optimum. Think back to our searching for the best rights-protecting state (§III.1.2). By the time we included our third perspective, the team shared three optima: the global optimum (the Czech Republic), and two lower optima (Brazil and Serbia). Our team with three perspectives thus did not meet condition (iii), and that is why it is not certain they would find the global optimum. If we add other perspectives in which neither Brazil nor Serbia is an optimum, then the handing-off-the-baton dynamic will inevitably lead to the Czech Republic, as it would be the only optimum shared by all perspectives.
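The handing-off-the-baton dynamic and the role of condition (iii) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the six worlds, their scores, the three orderings, and the helper names `climb`, `local_optima`, and `relay` are hypothetical and are not taken from Hong and Page or from the rights-protection example. Each perspective is modeled as a one-dimensional ordering of the same worlds, all perspectives agree on the scores, and each climbs only to adjacent worlds on its own ordering.

```python
from typing import Dict, List, Set

# Hypothetical data: six worlds with shared justice scores (the y-axis).
SCORE: Dict[str, int] = {"A": 1, "B": 3, "C": 2, "D": 5, "E": 4, "F": 6}

# Each perspective is a similarity ordering of the same worlds (its x-axis).
P1: List[str] = ["A", "B", "C", "D", "E", "F"]
P2: List[str] = ["B", "D", "A", "F", "C", "E"]
P3: List[str] = ["C", "D", "F", "A", "E", "B"]


def climb(order: List[str], score: Dict[str, int], start: str) -> str:
    """Move to the better adjacent world until no neighbor scores higher."""
    i = order.index(start)
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(order)]
        best = max(neighbors, key=lambda j: score[order[j]])
        if score[order[best]] <= score[order[i]]:
            return order[i]  # stuck at a local optimum of this perspective
        i = best


def local_optima(order: List[str], score: Dict[str, int]) -> Set[str]:
    """Worlds from which this perspective cannot climb any further."""
    return {w for w in order if climb(order, score, w) == w}


def relay(perspectives: List[List[str]], score: Dict[str, int], start: str) -> str:
    """Pass the baton around the team until no member can improve on it."""
    current, improved = start, True
    while improved:
        improved = False
        for order in perspectives:
            reached = climb(order, score, current)
            if score[reached] > score[current]:
                current, improved = reached, True
    return current


# Two perspectives share a non-global optimum (D), so the team can stall there.
print(local_optima(P1, SCORE) & local_optima(P2, SCORE))
print(relay([P1, P2], SCORE, "A"))
# With P3 added, F is the only optimum shared by all three perspectives.
print(relay([P1, P2, P3], SCORE, "A"))
```

In this toy case the relay can halt only at a world that every member treats as a local optimum, which is why a team satisfying condition (iii) ends at the global optimum from any starting world, while a team that shares a lower optimum may not.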
(iv) This brings us to the fourth condition: our collection of ΣV perspectives, seeking to solve the problem, must be drawn from a large and diverse set of perspectives, in particular a diverse set of similarity orderings. ΣV itself must be a goodly sized group; the more difficult the problem, the larger the group should be. This fourth condition allows that the group of problem solvers certainly need not be the whole population (the transaction costs of joint problem solving in such a large group would be very high), but our collection of ΣV perspectives must still be diverse and “contain more than a handful of problem solvers.”14
Given conditions (i)–(iv), Hong and Page derive the “Diversity Trumps Ability Theorem”: in our terms, a randomly selected group of ΣV perspectives will outperform a more homogeneous group composed of the “best problem solvers” (those perspectives with the smoothest optimization landscapes). Moreover, Hong and Page’s simulations show that even if these conditions are not perfectly met, “a random collection of agents drawn from a large set of limited-ability agents typically outperforms a collection of the very best agents from that same set.”15 For our purposes the crucial point is not the formal theorem’s specification of the conditions under which randomly selected ΣV perspectives are guaranteed to beat the best perspectives, but the dynamic that drives diverse groups to generally outperform homogeneous perspectives, even very good ones.
When I analyzed the elements of a perspective in chapter II (§1), some may have thought that similarity and distance measures would be difficult to devise, and so are controversial. The Hong-Page theorem indicates that this is all well and good. If adherents of an ideal theory could agree on the evaluative standards, relevant world features, and mapping relations, but came to different conclusions about the similarity ordering and the distance metric, they could more effectively locate the ideal than if they had agreed on all five elements. The core lesson to be learned is that different ways of looking at an optimization problem are more effective than looking at it in the same way, even if that is the best way.
2 DILEMMAS OF DIVERSITY
Hong and Page’s work on diversity is important and remarkable; I certainly do not wish to disparage it. Indeed, political philosophers should pay it much more attention. Landemore’s excellent work has led the way, making a powerful case for the applicability of Hong and Page’s work to political problem solving.16 Nevertheless, as I shall show in this section, the Hong-Page analysis relies on critical additional assumptions that are not always made explicit; when we interrogate these assumptions, we shall find that its implications for ideal theory are far less sanguine than they first appear.
2.1 The Neighborhood Constraint (Again)
The most obvious limitation of the Hong-Page theorem’s results for thinking about ideal theory is that the theorem does not recognize a Neighborhood Constraint (§II.3). As was explicit in our example of the search for the best rights-protecting state, each perspective has full knowledge of the justice scores of each element in the domain. We supposed that each perspective, when some world is brought to its attention, knows how that world scores in terms of justice (the y-axis). And this is critical to the analysis: when the per capita GDP perspective proclaims that it is stuck at Moldova with a score of 9, the economic liberty perspective locates Moldova on its perspective, agrees that it is a 9, sees that it is not a local optimum, and so can carry the baton further than 9, and (as it turns out) can go up to Macedonia, which scores 11. Now if the set of worlds to be searched is itself a neighborhood on which all perspectives converge, then within that neighborhood the Hong-Page theorem is applicable to our problem; it nicely shows how diverse ΣV perspectives can better explore a common neighborhood. But the rub here is that diverse perspectives tend to disagree on the neighborhood, which is precisely why they can help each other. A neighborhood of the domain {X} is a function of a perspective’s similarity ordering (SO) and its distance metric (DM). To put the point somewhat simplistically: it is precisely because diverse ΣV perspectives concur on the ordering of the elements on the y-axis (the justice scores) while disagreeing on the ordering of the x-axis that the diverse group can climb up the y-axis. But the diversity of the x-axis, unless very constrained, inevitably produces a diversity of neighborhoods.
If we again go back to our perspectives on rights protection, on the GDP perspective Brazil and Romania are neighbors, while on the economic liberty perspective they are far apart; it is this very diversity of neighborhoods that drives the result, but which severely limits its applicability to ideal theory. Let us call this:
The Neighborhood Diversity Dilemma: Diversity of ΣV perspectives improves the search within a neighborhood, but as we increase diversity of ΣV perspectives, they disagree about what our current neighborhood is.
Given that the heart of the Hong-Page theorem is the benefits of high diversity, but high diversity almost surely means the perspectives disagree on the neighborhoods, its applicability to our problem seems limited indeed.
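The Neighborhood Diversity Dilemma can be illustrated with a small Python sketch. It rests on loud simplifying assumptions: the eight worlds and three orderings are hypothetical, the distance metric is taken to be simple rank distance along each perspective’s ordering, and a neighborhood is every world within a fixed radius of the current world. The helper names `neighborhood` and `common_neighborhood` are introduced here only for illustration.

```python
from typing import List, Set

# Hypothetical similarity orderings of the same eight worlds.
P1: List[str] = ["A", "B", "C", "D", "E", "F", "G", "H"]
P2: List[str] = ["C", "G", "D", "A", "E", "H", "B", "F"]
P3: List[str] = ["E", "H", "A", "D", "G", "B", "F", "C"]


def neighborhood(order: List[str], world: str, radius: int) -> Set[str]:
    """Worlds within `radius` positions of `world` on one ordering
    (rank distance stands in for the perspective's distance metric)."""
    i = order.index(world)
    return set(order[max(0, i - radius): i + radius + 1])


def common_neighborhood(perspectives: List[List[str]], world: str, radius: int) -> Set[str]:
    """Worlds that every perspective places in the neighborhood of `world`."""
    return set.intersection(*(neighborhood(o, world, radius) for o in perspectives))


# The shared neighborhood of the current world "D" shrinks as perspectives are added.
for team in ([P1], [P1, P2], [P1, P2, P3]):
    print(len(team), sorted(common_neighborhood(team, "D", radius=2)))
```

In this toy case one perspective sees five worlds in the neighborhood of D, two perspectives agree on three, and all three agree only on D itself; the same diversity of orderings that would help the search leaves almost no neighborhood on which the searchers concur.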
2.2 The Theorem and Actual Politics
It needs to be stressed that this does not imply that the Hong-Page theorem is of limited applicability in all political contexts, such as collective deliberation. If a group concurs on the domain of options and the scores of each option are known by (or agreed to by) all, Hong and Page’s analysis gets real traction in explaining why collective decision making is apt to outperform individual judgments—even expert ones—in actual deliberative contexts. Landemore and Page have recently argued that consensus in identifying the best solution to a problem is a plausible assumption in many political contexts. “We assume then that participants have already reached consensus on the criteria for evaluation [ES] and how those criteria will be weighted [MP (part ii)].”17 Note that they suppose agreement on some of the elements of a perspective as understood in this work, the evaluative standards and the mapping function’s weighting task.18 Given this, they argue that we should expect consensus on what constitutes the best solution (the global optimum); indeed, given how we have understood a perspective here, this may seem to follow.
The problem, however, is that agreement on simply the evaluative standards and the weighting procedure will produce agreement in the overall evaluation of options only if the evaluation does not depend on predictive modeling of how the features of the option will actually function together. As Landemore and Page note—and as we have seen in §II.4.2—in predictive tasks disagreement may lead to better results than consensus.19 If, however, Alf’s conclusion about the ultimate value of an option depends on predictions about how that option will function, and so how well its functioning will meet the shared evaluative standards, Betty will always agree with him on the global optimum (and the value of less-than-optimum solutions) only if she also shares his predictive models.
To see this better, consider Landemore’s example in which she postulates a problem for the French government in selecting a city for an experimental program.20 “Three députés are deliberating, one from Calvados, one from Pas de Calais, one from Corrèse. They are aware of different possible solutions …, each of which have a different value for the experiment. On a scale of 0 to 10, a city with a value of 10 has the highest objective value for the experiment. Each of the cities that a given député might offer count as a local optimum. … The goal is for the group to find the global optimum, that is, the city with the highest objective value.”21 The députés and their perspectival optima are summarized in figure 3-4. We can see that, as required by assumption (iii) of the Hong-Page analysis, the only optimum shared by all three is Caen, the global optimum. On something like Landemore’s version of the story, as given in figure 3-4, Alfred might get stuck at his local optimum at Marseille (marked in the figure); Betty can carry the baton to Paris, but she gets stuck at a local optimum there; Charles can take it to Grenoble before halting at a local optimum, but Alfred can take the baton back, and arrive in Caen.
Figure 3-4. Landemore’s example
The important assumption here is not simply that the députés agree on their evaluative standards and trade-off rates, but that all three députés, when they run their predictive models of how well each city will serve these agreed-on weighted evaluative standards, concur on the best city. This would be the case only if, as I have said, they all share an evaluation normalized perspective ΣV, which includes the same predictive models, and so once a particular solution is pointed out, all members of ΣV concur on the predicted value of the option. This is why, I think, the example looks a bit contrived as a case of actual politics. What we would expect is variance of predicted outcomes when they apply different predictive models. Even if they share the same fundamental values, then, people who employ different predictive models are apt to disagree on the value of options.22 Their different predictive models will lead them to disagree on the overall value of the options—their ordering along the y-axis, not just how they array them along the x-axis. The assumption that they all share the ΣV perspective, and so entirely agree about the scoring of each option in the domain, turns out to be very strong indeed, at least in many instances of actual politics.
At this point we might invoke the Diversity Prediction Theorem (§II.4.2), and hold that the députés could pool their predictive models. As Landemore and Page consider, the députés might engage in a high-level deliberation about the comparative benefits of their different models, which can improve the toolkits of each. But Landemore and Page are hesitant about recommending a procedure that leads to consensus on predictive models: “it need not be more advantageous to reach a greater consensus following deliberation in the predictive context.”23 As they show, if each predictor moved a step closer to the average prediction, no gains in group prediction would be generated.24 This is a consequence of the Diversity Prediction Theorem; even if Alfred’s moving toward the average prediction increases his predictive reliability, the group as a whole has lost an element of predictive diversity, and given that a unit of predictive diversity is equal to a unit of predictive accuracy, the average predictive performance of the group will not be increased.25 We might imagine a deliberation in which the députés would first make their models explicit, discuss how they work, pool them for some purposes, but also continue using their original, divergent, predictive models. Thus in one sense they agree (insofar as they are using the pooled results), and in another sense they disagree (because they continue to use their diverse predictive models), as to which city would work out best for the experiment. It is hard to see how this would work in practice, much less as an approximation of actual politics.
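The arithmetic behind this consequence of the Diversity Prediction Theorem can be checked with a short Python sketch. The identity used is the standard statement of the theorem: the squared error of the group’s average prediction equals the average individual squared error minus the variance of the predictions around their mean. The three predictions, the true value, and the half-step toward the mean are hypothetical numbers chosen for illustration, and the function names are introduced only for this sketch.

```python
from typing import List


def mean(xs: List[float]) -> float:
    return sum(xs) / len(xs)


def collective_error(preds: List[float], truth: float) -> float:
    """Squared error of the group's average prediction."""
    return (mean(preds) - truth) ** 2


def average_individual_error(preds: List[float], truth: float) -> float:
    """Mean squared error of the individual predictors."""
    return mean([(p - truth) ** 2 for p in preds])


def prediction_diversity(preds: List[float]) -> float:
    """Variance of the predictions around the group average."""
    c = mean(preds)
    return mean([(p - c) ** 2 for p in preds])


def step_toward_mean(preds: List[float], fraction: float) -> List[float]:
    """Each predictor moves `fraction` of the way toward the group average."""
    c = mean(preds)
    return [p + fraction * (c - p) for p in preds]


truth = 8.0                       # hypothetical true value
before = [4.0, 7.0, 10.0]         # hypothetical individual predictions
after = step_toward_mean(before, 0.5)

for label, preds in (("before", before), ("after", after)):
    print(label,
          collective_error(preds, truth),
          average_individual_error(preds, truth),
          prediction_diversity(preds))
```

With these numbers, moving toward the mean lowers the average individual error from 7 to 2.5, but diversity falls from 6 to 1.5, so the collective error stays at 1 in both cases: each unit of individual accuracy gained is offset by a unit of diversity lost.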
We thus must wonder about the broader persuasiveness of Landemore’s claim that the “four conditions for this theorem are not utterly demanding.”26 Although this may be plausible if we restrict ourselves to simply the formal conditions (i)–(iv), the worry is about the supposition that all share, or can be brought to share, the full ΣV perspective. For us to be confident that Hong and Page’s model really applies in a case such as this, the different perspectives must employ predictive models that agree about the scores of each city. Of course it does not follow that as soon as any disagreement on the scoring of the elements is introduced the Hong-Page search model becomes irrelevant, but it does mean that the more the perspectives score the options differently, the less applicable the theorem is. When perspective Betty announces that she has moved from Marseille at 7 but is stuck at Paris with a score of 8, Charles might reply “pas question!”: you just moved from 7 to 6!
2.3 The Utopia Is at Hand Theorem
Although the Hong-Page Theorem does not recognize neighborhoods in which confidence about the terrain is much higher than in outlying areas, Page discusses a second theorem that can do so. According to what he calls the “Savant Existence Theorem … for any problem there exist many perspectives that create Mount Fuji landscapes.”27 There are always arrangements of the elements in {X} (social worlds) that create Mount Fuji landscapes. Showing this is trivial in our simple one-dimensional similarity space: take the ordering of scores on the y-axis from high to low, and rearrange the x-axis to correspond to this ordering. This will yield a Mount Fuji landscape. There are many such possible landscapes for any optimization problem. If we can show that our problem is a smooth optimization landscape, the conflict between local and global optimization is entirely obviated (§II.2.3). Note that in principle for any landscape, no matter how rugged, there exist alternative arrays of the x-axis that generate a smooth optimization problem. This is, I think, the motivation behind interpreting the second condition of the Hong-Page theorem in terms of the smartness of the perspective rather than the difficulty of the problem: there is, in principle, some perspective that turns every difficult problem into a simple one.
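The trivial construction just described can be sketched in a few lines of Python. The scores are hypothetical and assumed to be distinct, the similarity space is one-dimensional, and `count_peaks` is a helper introduced only for this illustration. The sketch shows the formal point alone: once every score is already known, sorting the x-axis by score removes all but one local optimum.

```python
from typing import List


def count_peaks(scores: List[int]) -> int:
    """Count local optima: positions that score higher than every adjacent position."""
    peaks = 0
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i < len(scores) - 1 else float("-inf")
        if s > left and s > right:
            peaks += 1
    return peaks


# Hypothetical justice scores, listed in one perspective's similarity ordering.
rugged = [3, 9, 4, 11, 2, 7, 5]

# "Savant" rearrangement: reorder the x-axis to follow the scores from high to low.
savant = sorted(rugged, reverse=True)

print(count_peaks(rugged), count_peaks(savant))
```

The rearranged landscape has a single peak, so simple gradient climbing reaches the global optimum from anywhere on it. Note that the rearrangement presupposes that every world has already been scored, which is exactly the knowledge a search is meant to supply.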
A more modest version of the Savant Existence Theorem might be called:
The Utopia Is at Hand Theorem: There are in principle ΣV perspectives according to which Σ’s ideal is within our current neighborhood.
This is a “more modest” version of the Savant Existence Theorem as it does not require a reordering of the similarity dimension such that for all social worlds in {X} there is a smooth optimization landscape. It requires “only” that a subset of {X}, which includes the current world and the global optimum (and, we assume, some other nearby social worlds), be ordered so that its members form a neighborhood. Neither does it require that within this neighborhood there is a smooth, Mount Fuji landscape. “All” that is required is that Σ’s global optimum is within our current neighborhood. We have a potentially compelling result: there is in principle always some ΣV perspective on the problem of ideal justice that shows that Σ’s utopia is in the neighborhood of our current social world. This would mean that The Choice (§II.3.3) may be avoided: pursuit of the ideal and of local justice can be one and the same.
In the abstract it may seem easy to rearrange the elements of a domain. What is not easy is to arrange them in a way that expresses an intelligible similarity ordering of the features of the relevant social worlds (WF), that is, to show that this new arrangement of the domain {X} exemplifies a meaningful structure that relates social worlds (§II.1.2). If I think that, say, a socialist camping trip, like utopia, is far away from our current world, and you arrive at a perspective in which it is adjacent, this could be immensely enlightening, provided I can agree that the way you have structured the worlds really does capture similarities that mine has missed. And then I may come to the conclusion that I can share your deeper knowledge of the ideal, as it is much more like our current world than I ever contemplated.28 But if your perspective on {X} seems arbitrary or implausible, or misses critical characteristics, I will dismiss your “savant” perspective, and its claims to have brought the ideal into my neighborhood. It is no mean feat to impose a meaningful structure on social worlds in a way that brings very close what looks, on my current perspective, to be very far away.29
In social thought such revolutionary changes in perspective have no doubt occurred. Perhaps liberalism itself was such a reconceptualization. At one point Western societies faced the problem of which false religions to tolerate. Even in his Letter concerning Toleration, Locke still was struggling with this view. While he thought it would promote the good of the commonwealth to tolerate dissenting Protestants, extending toleration to Catholics might decrease justice (an England that tolerated Catholics was far from his own); and extending toleration to atheists would be an even further social world.30 But, no doubt without being fully aware of it, Locke was pushing toward a more Mount Fuji liberal landscape, in which each additional right of conscience and speech advanced justice. Eventually the early modern problem of which false creeds to tolerate was transformed into the problem of increasing freedom of thought and belief, where the options of which creeds to tolerate were arrayed in something much closer to a smooth optimization landscape. Once one arrays the social worlds in terms of their liberty of conscience, the world in which Catholics are tolerated is very close to that in which Protestant dissenters are, and tolerating atheists is just a step beyond that.