Mobility across Multiple Generations
What happens to the intergenerational correlations of status as we consider grandchildren, great-grandchildren, and later generations? To answer this question, a further simplification is helpful: to normalize $y$, the measure of status, in each generation to have zero mean, so that equation A1.1 simplifies to
$$y_{t+1} = b\,y_t + v_t. \tag{A1.1*}$$
If $y$ is income, for example, just defining income as the difference between an individual’s income and average income creates this normalization.
Suppose all the information useful to predict the outcomes for children is provided by the status of the parents, so that the status of grandparents and even earlier generations provides no independent information on the likely outcomes for their descendants. In this case the mobility process is said to be first-order Markov, or AR(1). Then equation A1.1* implies that over $n$ generations, the link in status is characterized by
$$y_{t+n} = b^n y_t + v^*_{t+n},$$
where $v^*_{t+n} = \sum_{i=0}^{n-1} b^i v_{t+n-1-i}$ collects the random components accumulated over the intervening generations. The correlation between grandparents and grandchildren is $b^2$, and between great-grandparents and great-grandchildren $b^3$.
In this case, given the conventional estimates of the intergenerational correlations of parents and children, long-run social mobility is rapid. Even when $b = 0.5$, $b^n$ rapidly approaches zero as $n$ increases. Thus the intergenerational correlation between one generation and its great-grandchildren is only 0.12. This in turn implies that only 2 percent of the variation in outcomes for great-grandchildren is explained by the characteristics of the first generation. The share of variance of status explained by the status of the current generation after $n$ generations goes even more quickly toward zero, since it is $b^{2n}$. Figure A1.3 illustrates how rapidly the expected status of two families, with initial wealth twelve times and one-twelfth of the mean respectively, converges on the mean if $b$ is 0.5.⁴ Within five generations, the descendants of these two families, whose initial wealth differed by a factor of 144, will both have an expected wealth within 10 percent of the social average.
FIGURE A1.3. The rapidity of convergence to the mean for wealth.
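The arithmetic behind these figures can be checked with a short sketch. It assumes, as in note 4, that $b$ applies to the logarithm of wealth, and it treats the antilog of expected log wealth as the expected wealth, which ignores the spread around that expectation. The function name is ours, introduced only for illustration.

```python
import math

B = 0.5  # persistence rate used in the text's example


def expected_relative_wealth(initial_ratio, generations, b):
    """Wealth relative to the mean after some generations, with b applied to log wealth."""
    return math.exp(b ** generations * math.log(initial_ratio))


for n in range(6):
    corr = B ** n            # correlation with the initial generation
    share = B ** (2 * n)     # share of variance explained by the initial generation
    rich = expected_relative_wealth(12.0, n, B)
    poor = expected_relative_wealth(1.0 / 12.0, n, B)
    print(f"n={n}  corr={corr:.3f}  share={share:.3f}  rich={rich:.2f}  poor={poor:.2f}")
```

On these assumptions the family that starts at twelve times the mean is still more than 10 percent above it after four generations, and both families fall inside the 10 percent band at the fifth.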
Recent studies of social mobility that look at outcomes over three or even four generations suggest, however, that grandparents seem to have an independent influence on grandchild outcomes. In this book, the hypothesis of the nature of intergenerational mobility and persistence assumes that the underlying process is actually first-order Markov. Grandparents inherently do not influence grandchild outcomes once we have full information on their parents. Thus, if measured status is $y_t$ and underlying status $x_t$, the social mobility model assumed in this book (see chapter 6) is
$$y_t = x_t + u_t,$$
$$x_t = b x_{t-1} + e_t$$
with $x$ and $y$ both distributed normally with zero mean and constant variance, and $u$ and $e$ random components. Suppose also that the ordinary least squares (OLS) estimate of $\beta$ in the fitted expression
$$y_t = \beta y_{t-1} + \nu_t$$
is $\hat{\beta}$. Then if $\theta = \sigma_x^2 / (\sigma_x^2 + \sigma_u^2)$, where $\sigma_u^2$ is the variance of the random component and $\sigma_x^2$ the variance of underlying social competence, the expected value of $\hat{\beta}$ will be
$$E(\hat{\beta}) = \theta b.$$
Also the expected value of the OLS estimate of $\beta_n$, the observed correlation in $y$ across $n$ generations,
$$y_t = \beta_n y_{t-n} + \nu_{nt},$$
will be $E(\hat{\beta}_n) = \theta b^n$.
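The attenuation factor follows directly from the covariances of the model. A brief derivation, assuming that $u$ is uncorrelated with $x$ and across generations, and writing $\sigma_x^2$ and $\sigma_u^2$ for the variances of underlying status and of the random component (this notation is assumed here), is:

```latex
\begin{aligned}
\operatorname{Cov}(y_t, y_{t-n})
  &= \operatorname{Cov}(x_t + u_t,\; x_{t-n} + u_{t-n}) = b^n \sigma_x^2, \\
\operatorname{Var}(y_t) &= \sigma_x^2 + \sigma_u^2, \\
E(\hat{\beta}_n) &\approx \frac{\operatorname{Cov}(y_t, y_{t-n})}{\operatorname{Var}(y_{t-n})}
  = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}\, b^n = \theta b^n .
\end{aligned}
```

The approximation holds in large samples. The factor $\theta$ enters only once, however many generations apart the two observations are, which is why measured correlations decay at the underlying rate $b$ after the first generation.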
If we estimate by OLS the parameters in
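$$y_t = \beta_{t-1} y_{t-1} + \beta_{t-2} y_{t-2} + \nu_t,$$
which looks at the effect of grandparent status controlling for parent status, then
$$E(\hat{\beta}_{t-1}) = \theta b\,\frac{1 - \theta b^2}{1 - \theta^2 b^2}, \qquad E(\hat{\beta}_{t-2}) = \theta b^2\,\frac{1 - \theta}{1 - \theta^2 b^2}.$$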
Even though grandparents have no independent role in child outcomes, they appear to have such an influence according to these estimates. If we estimate by OLS the parameters in
$$y_t = \beta_{t-1} y_{t-1} + \beta_{t-2} y_{t-2} + \beta_{t-3} y_{t-3} + \nu_t,$$
which looks at the independent effects of both grandparents and great-grandparents, controlling for parent status, then
If $b > 0$, then both $\hat{\beta}_{t-2}$ and $\hat{\beta}_{t-3}$ have positive expected values. Even great-grandparents, generally dead before the great-grandchild is born, will appear to exert some independent influence on great-grandchild outcomes.
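The apparent grandparent and great-grandparent effects can be computed numerically. The sketch below assumes the model above implies a correlation of $\theta b^k$ between measured status $k$ generations apart, and solves the OLS normal equations for the large-sample coefficients. The values of $b$ and $\theta$ are hypothetical, and the function names are ours.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def implied_coefficients(b, theta, lags):
    """Large-sample OLS coefficients of y_t on y_{t-1}, ..., y_{t-lags}."""
    def corr(k):
        # Correlation of measured status k generations apart (assumed: theta * b**k).
        return 1.0 if k == 0 else theta * b ** k

    r_matrix = [[corr(abs(i - j)) for j in range(lags)] for i in range(lags)]
    r_vector = [corr(k) for k in range(1, lags + 1)]
    return solve(r_matrix, r_vector)


b, theta = 0.75, 0.5  # hypothetical values, for illustration only
for lags in (1, 2, 3):
    coefs = implied_coefficients(b, theta, lags)
    print(lags, [round(c, 4) for c in coefs])
```

With these values every coefficient is positive, even though the simulated process gives earlier generations no independent role once underlying parental status is known.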
1 Long and Ferrie (2013b) propose more complex measures to deal with cases in which such a cardinal measure of social status is impossible.
2 Assuming the correlation of the child with each parent individually is the same and is $\rho$, the correlation with the average of both parents is $\rho\sqrt{2/(1 + \rho_{fm})}$, where $\rho_{fm}$ is the correlation of the parent characteristics.
3 On height, see Pearson and Lee 1903; Silventoinen et al. 2003a; Galton 1886. On body mass index: Silventoinen et al. 2003b. On cognitive and social abilities: Grönqvist, Öckert, and Vlachos 2010. On longevity: Beeton and Pearson 1899; Cohen 1964. On earnings: Corak 2013. On wealth: Harbury and Hitchens 1979. On education: Hertz et al. 2007. On occupational status: Francesconi and Nicoletti 2006; Ermisch, Francesconi, and Siedler 2005; Long 2013.
4 As is usual for these estimates, b is calculated for the logarithm of wealth.
APPENDIX 2: DERIVING MOBILITY RATES FROM SURNAME FREQUENCIES
WHERE THERE IS INFORMATION ON wealth or occupations by surname, the procedures for estimating the intergenerational correlation of status are analogous to those used in conventional mobility studies. The social mobility rate is measured just by how much closer to the mean status surnames of each type move with each generation.
The persistence parameter estimated for surname groupings, however, is potentially biased toward zero compared to the underlying persistence parameter for families (if it were observable). This is because in surname cohorts, when we estimate
$$\bar{y}_{k,t+1} = b\,\bar{y}_{k,t} + u_{k,t+1},$$
$\bar{y}_{k,t}$ measures average social status on some measure across a group of people with the surname $k$ in the initial generation. But some of these people have no children and are not included in the within-family estimates. And in any generation, those with one child are weighted as much as those with ten children. This introduces noise into the estimates and biases estimated intergenerational elasticity toward zero.
However, for most of the studies in this book, the measurement of the status of surname groupings in any generation is based on the share of that surname among elites (or underclasses) compared to its share in the general population. These elites can be groups such as wealth holders, university graduates, authors, physicians, attorneys, or members of Parliament.
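To extract implied persistence rates, the procedure is as follows. Define the relative representation of each surname or surname type, $z$, in an elite group such as physicians as
$$\text{relative representation of } z = \frac{\text{share of } z \text{ in the elite group}}{\text{share of } z \text{ in the general population}}.$$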
With social mobility, any surname that initially has a relative representation differing from one should tend toward one, and the rate at which it does so is determined by the rate of social mobility.
However, assuming that all social mobility is governed by
$$x_{t+1} = b x_t + e_t$$
implies that even social elites tend to have the same variance of status as the population as a whole, as long as they have been present in the society for a number of generations. For even if they start with a zero variance of social status, then $n$ generations later, based on the above law of mobility, the variance of that underlying status will be
$$(1 - b^{2n})\,\sigma^2,$$
where $\sigma^2$ is the status variance of the population as a whole. Even at a high underlying persistence rate of 0.75, after just one generation the variance of this elite will be 44 percent of the population variance. After four generations it will be 90 percent. Thus in estimating the persistence rate, $b$, from the shares of surnames observed among elites, it is assumed that the variance of the elite group is the same as for the general population, but the mean is shifted to the right, as in figure A2.1. Similarly, the underclass groups are assumed to have the same variance as the population, but with the mean shifted to the left.
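The 44 and 90 percent figures can be reproduced by iterating the law of mobility. This sketch assumes the population variance is stationary, so that the random component $e$ has variance $(1 - b^2)\sigma^2$, and normalizes $\sigma^2$ to one. The function names are ours.

```python
def elite_variance_closed_form(b, n):
    """Variance after n generations, as a share of population variance, starting from zero."""
    return 1.0 - b ** (2 * n)


def elite_variance_iterated(b, n):
    """The same quantity obtained by applying x_{t+1} = b * x_t + e_t one generation at a time."""
    shock_variance = 1.0 - b * b  # assumption: keeps the population variance constant at 1
    variance = 0.0
    for _ in range(n):
        variance = b * b * variance + shock_variance
    return variance


for n in range(1, 5):
    closed = elite_variance_closed_form(0.75, n)
    stepped = elite_variance_iterated(0.75, n)
    print(f"n={n}  closed form={closed:.4f}  iterated={stepped:.4f}")
```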
This assumption of equal variance for elite and underclass surname groups is validated by measuring the distribution of their outcomes on status measures. We see in chapter 2 that this holds true for the income of people with aristocratic surnames in Sweden. In chapter 3 it holds true for educational status among Jews and blacks in the United States. And it holds true for the distribution of wealth among elite rare-surname groups in England for the period 1858–2011 (see chapter 5). In all cases, there is considerable variance of outcomes within elite and underclass surname groups.
This assumption also fits the data well when initial elites or underclasses are observed over many generations, as in England or Sweden. In case after case, the model fits the evolution of elite and lower-class surname groups well, with the estimated persistence rates falling within a relatively narrow range, 0.7–0.9. Assumptions that the initial elite group has a more compressed distribution of status than the population as a whole lead to predicted paths of relative representation that do not fit with the observed data, unless persistence rates are very different for the initial generations than for later generations.
FIGURE A2.1. Initial position of an elite.
With the assumptions above, when the relative representation of an elite surname group $z$ is observed in some upper part of the distribution of status, such as the top 2 percent, then we can fix the initial mean status of this group, $\bar{x}_{z0}$. That mean status will evolve according to the equation
$$\bar{x}_{zt} = \bar{x}_{z0}\, b^t,$$
where $t$ is the number of generations. For only two generations, this procedure yields an exact estimate of $b$. For multiple generations, we could either estimate a $b$ for each generation or fit one $b$ to the whole series by minimizing the deviations in relative representation implied by each choice of $b$. Studies of long series of relative representation of elites and underclasses in England, Sweden, and China show that often one fitted $b$ fits the observed patterns of relative representation even across five to ten generations.
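The procedure can be sketched in a few lines. The sketch assumes that status is normally distributed with population mean zero and variance one, that the surname group has the same variance with a shifted mean, and that "minimizing the deviations" means least squares over a grid of candidate values of $b$. That last choice is ours, as are the function names and the synthetic series, which is generated from a hypothetical $b$ of 0.75 and is not data from this book.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # population status: mean 0, variance 1 (assumed)


def mean_status(rel_rep, elite_share):
    """Group mean implied by its relative representation above the elite cutoff."""
    cutoff = STD_NORMAL.inv_cdf(1.0 - elite_share)
    group_share = rel_rep * elite_share  # share of the group that is in the elite
    return cutoff - STD_NORMAL.inv_cdf(1.0 - group_share)


def relative_representation(mean, elite_share):
    """Relative representation of a unit-variance group with the given mean."""
    cutoff = STD_NORMAL.inv_cdf(1.0 - elite_share)
    return (1.0 - STD_NORMAL.cdf(cutoff - mean)) / elite_share


def fit_b(observed, elite_share, grid_steps=1000):
    """Grid search for the b whose implied path best matches the observed series."""
    m0 = mean_status(observed[0], elite_share)
    best_b, best_err = 0.0, float("inf")
    for i in range(grid_steps + 1):
        b = i / grid_steps
        err = sum(
            (relative_representation(m0 * b ** t, elite_share) - r) ** 2
            for t, r in enumerate(observed)
        )
        if err < best_err:
            best_b, best_err = b, err
    return best_b


ELITE_SHARE = 0.02  # elite taken as the top 2 percent
m_start = mean_status(4.0, ELITE_SHARE)  # hypothetical: group starts 4x overrepresented
synthetic = [relative_representation(m_start * 0.75 ** t, ELITE_SHARE) for t in range(6)]

two_generation_b = mean_status(synthetic[1], ELITE_SHARE) / mean_status(synthetic[0], ELITE_SHARE)
print("two-generation b:", round(two_generation_b, 4))
print("fitted b over six generations:", fit_b(synthetic, ELITE_SHARE))
```

With only two generations the ratio of implied means gives $b$ directly. With a longer series the grid search plays the role of fitting one $b$ to the whole path.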
The value of b that best fits this data does not change much if the assumed cutoff point in the status distribution for the elite population is altered. Thus in chapter 5 (figure 5.8) we estimate the persistence of educational status from rare-surname groups at Oxford and Cambridge for 1830–2012 as 0.73. In arriving at this estimate, the assumed cutoff for the university elite in each period is changed to correspond to the student share in each cohort, which ranges from 0.5 to 1.2 percent. Suppose instead a uniform cutoff of 0.1 percent, 0.7 percent, 2 percent, or 5 percent was assumed (the extremes here being quite unrealistic). How much would that change the estimated value of b? The first row of table A2.1 shows the results. Adopting one of these fixed cutoffs across generations yields a best-fitting persistence rate of 0.69–0.74, little different from the preferred estimate.
TABLE A2.1. Intergenerational correlations under different assumptions for rare surnames at Oxford and Cambridge, 1830–2012
What would happen if the assumption that the variance of educational outcomes in the elite group was always the same as for the general population was dropped? Suppose the variance was only one-quarter that of the general population initially. The implied values of b for different elite shares are shown in the second row of table A2.1: they range from 0.65 to 0.71.
The last row of the table shows the estimated b under the even more extreme assumption that the variance of the elite surname group was only one-tenth that of the general population when first observed in 1800–1829. Now b is in the range 0.63–0.71. So the conclusion that educational mobility measured using surname groupings is slow relative to conventional estimates is robust to variations in assumptions about the population share of the elite groups observed and the variance of status within the elite population.
Upward Mobility
For elite groups that arise just by the processes of random chance in any economy, such as the rare-surname groups at Oxford and Cambridge in the years 1800–1829, the social law of mobility suggested in chapter 6 also has implications for the way in which they rose to elite status. The major implication is that the path of upward mobility is symmetrical with that of downward mobility. Chapter 12 shows empirical evidence that this prediction is correct in both England and China. The reasoning behind this prediction is as follows.
If underlying mobility is governed by the expression $x_{t+1} = b x_t + e_t$, we would estimate, empirically, the value of $b$, minimizing the sum of squared errors, as
$$\hat{b} = \frac{\sum x_{t+1} x_t}{\sum x_t^2}.$$
Suppose, however, we instead wanted to estimate the connection going backward from $x_{t+1}$ to $x_t$. That is, if $x_{t+1} = b x_t + e_t$ holds, what is the value of $\gamma$ that would be estimated for the expression
$$x_t = \gamma x_{t+1} + \nu_t\,?$$
You might expect that we could just rewrite $x_{t+1} = b x_t + e_t$ with $x_t$ on the left-hand side, and that the estimate of $\gamma$ would be $1/b$. But this is not the case. The minimum squared deviation empirical estimate of $\gamma$ in fact would be
$$\hat{\gamma} = \frac{\sum x_t x_{t+1}}{\sum x_{t+1}^2} \approx \hat{b},$$
since $x_t$ and $x_{t+1}$, by construction, have the same variance. Going back in time, the average status of an elite or underclass again regresses toward the mean. The movement of families at the extremes of the distribution (extremes of wealth or poverty, education or ignorance) toward the center will be symmetrical with their earlier movement from the mean to the extremes. Any group observed at the extreme will not only regress to the mean in future generations, but it will also have diverged from the mean to reach its extreme position at the same rate at which it returns. Notice, however, that this prediction only applies to families that reach the extremes of the distribution through random shocks.
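A small simulation illustrates the symmetry. It assumes a stationary process with unit variance, so the random component has variance $1 - b^2$, and a hypothetical $b$ of 0.75. The function names are ours.

```python
import random


def simulate_pairs(b, n, seed=0):
    """Draw (x_t, x_{t+1}) pairs from a stationary process with unit variance."""
    rng = random.Random(seed)
    shock_sd = (1.0 - b * b) ** 0.5  # assumption: keeps the variance of x constant
    pairs = []
    for _ in range(n):
        x_t = rng.gauss(0.0, 1.0)
        x_next = b * x_t + rng.gauss(0.0, shock_sd)
        pairs.append((x_t, x_next))
    return pairs


def slope_through_origin(xs, ys):
    """Least-squares slope of ys on xs with no intercept (both have zero mean)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


pairs = simulate_pairs(0.75, 100_000)
earlier = [p[0] for p in pairs]
later = [p[1] for p in pairs]

b_hat = slope_through_origin(earlier, later)      # forward: x_{t+1} on x_t
gamma_hat = slope_through_origin(later, earlier)  # backward: x_t on x_{t+1}
print(f"forward estimate:  {b_hat:.3f}")
print(f"backward estimate: {gamma_hat:.3f}")
```

Both estimates come out close to $b$, not $b$ and $1/b$: looking backward from a group observed at an extreme, its ancestors are expected to lie closer to the mean by the same factor as its descendants.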
APPENDIX 3: DISCOVERING THE STATUS OF YOUR SURNAME LINEAGE
FOR THOSE OF US WITH A COMMON SURNAME like Clark, there is a limit to the interesting exploration of the history and geography of the surname (though the geographic distribution of the spelling variants Clark and Clarke is striking). But a variety of sources allow a diverting exploration of the history and geography of rarer surnames. Below we show how to find out how common any surname is, where it is concentrated, and what its average social status has been over time for countries such as England, the United States, Australia, and Sweden.
As we have seen, common surnames may start with high, medium, or low status, but all eventually converge on mean status. Rarer surnames, however, can follow a variety of paths. They may, for example, spend periods at high status, regress to the mean, and then fall to low status before converging on the mean once more; or they may, like the surname Pepys, spend hundreds of years at high status.
Surname Frequencies and Distribution
A useful tool for establishing the frequency of surnames in various countries is the Public Profiler World Family Names database, the result of a project at University College London.¹ This website provides estimates of the frequency of surnames per million of the population in Argentina, Australia, Austria, Belgium, Canada, China, Denmark, France, Germany, Hungary, India (partial), Ireland, Italy, Japan, Luxembourg, the Netherlands, New Zealand, Norway, Poland, Serbia, Slovenia, Spain, Sweden, Switzerland, the United Kingdom, and the United States. For each country the information is also given by subunits, which vary in size: in the United States, they are counties. Figure A3.1 shows, for example, the distribution of the surname Levy, which is Sephardic Jewish in origin, across Europe. The wide distribution of the surname reflects the great geographic mobility of the Jewish population.
In contrast, figure A3.2 shows the distribution within Europe of the rare surname Boscawen. It originated in Cornwall in southwest England and has dispersed little since. Figure A3.3 shows the distribution of the New France surname Bergeron in North America, illustrating its spread by migration to Louisiana and New England. At even closer perspective, we can see the distribution of surnames by counties within states in the United States. Figure A3.4, for example, shows the distribution in New York State of the Jewish surname Teitelbaum, a prominent surname among the leaders of the Satmar Hasidim sects and the surname of the current rebbe of both major factions of the Satmar.
FIGURE A3.1. Distribution of Levy in Western Europe, 2012.
FIGURE A3.2. Distribution of Boscawen in Western Europe, 2012.
FIGURE A3.3. Distribution of Bergeron in North America, 2012.
FIGURE A3.4. Distribution of Teitelbaum in the state of New York, 2012.
A limitation of this data set, however, is that in some countries, such as the United States, the surname counts are based on telephone directory listings, so that it overrepresents the frequency of high-status surnames and undercounts low-status surnames. More accurate current surname counts for individual countries are available from the following sources:
United States: The US Census Bureau’s Demographic Aspects of Surnames from Census 2000. This source lists all surnames in the US census of 2000 with one hundred or more occurrences. It also gives the census-reported racial composition (white, black, Asian–Pacific Islander, Native American, and Hispanic) for each surname. The information is available only as a large Excel file.²
United Kingdom: The Office for National Statistics produced a list of surname frequencies for England and Wales in 2002. This source gives all surnames held by at least five people in England and Wales and their frequency. The stock of surnames here represents all surnames in 1998, including any births occurring between 1998 and 2002 but not subtracting deaths in those years. It thus overestimates the total size of the population in 2002.³
Australia: The Intellectual Property Agency of the Australian Government maintains a searchable database of surname frequencies in Australia, based on the electoral register.⁴ In 2012 there were 14.3 million enrolled electors in Australia, representing 90 percent of all adults. Because of the enrollment requirement, there is a tendency for this site to undercount lower-status surnames. This site can be searched for any string of letters in a surname.