CHAPTER ONE

Evaluating the Fairness and Effectiveness of an Admissions System

The grounds on which [an admissions] decision is based may seem arbitrary and capricious to one observer, while to another they may seem natural reflections of values deeply and sincerely held. In any case there are few guidelines, and the scope for disputation is vast.
—B. Alden Thresher
In 2001 the Regents of the University of California approved a new policy requiring the “comprehensive review” of student applications to undergraduate programs. UC president Richard C. Atkinson applauded the change, claiming it would enhance the university’s ability to select “a class of thoroughly qualified students who demonstrate the promise to make great contributions to the university community and to the larger society beyond.”1 Gone was the requirement that each UC campus admit 50% to 75% of its entering class based solely on academic factors such as grades, test scores, and completion of college preparatory courses. Instead, all student records were to be judged in terms of 14 criteria, which consisted of 10 academic factors and four “supplemental” factors. These supplemental criteria, while not new to the UC admissions process, assumed a much greater role under comprehensive review. Most controversial was the requirement that academic accomplishments be evaluated “in light of an applicant’s experiences and circumstances,” which included “low family income, first generation to attend college, need to work, disadvantaged social or educational environment, [and] difficult personal and family situations.”2
In 2003 John Moores, then the chair of the UC Board of Regents, authored a scathing report charging that this admissions process resulted in the selection of significant numbers of poorly qualified students, “perhaps at the expense of extraordinarily well-qualified applicants.”3 In a subsequent opinion piece in Forbes, he asserted that 359 students were accepted to UC Berkeley in 2002 with total SAT scores below 1000 (roughly the national average at the time) and that about two-thirds of these low scorers were Black, Hispanic, or Native American.4 At the same time, “some 1,421 Californians with SAT scores above 1,400 applying to the same departments … were not admitted. Of those, 662 were Asian-American.” This finding led Moores to ask, rhetorically, “How did the university get away with discriminating so blatantly against Asians?” His answer: the “fuzzy factors” used in comprehensive review.5
Moores’s report ignited a firestorm. The UC Board of Regents voted to censure its own chair; the chancellor of UC Berkeley, Robert Berdahl, accused Moores of undermining confidence in Berkeley’s admissions practices and of subjecting admitted students to “derision”;6 and a group of Berkeley professors joined with civil rights advocates to challenge the report’s heavy emphasis on SAT scores.7 But Moores had his supporters too, ranging in prominence from Ward Connerly, UC Regent and staunch foe of affirmative action, to the unknown head of a test preparation firm, who told the San Francisco Chronicle, “John Moores has received a lot of flak … for one reason: Berdahl and the UC system are scared to death that he will reveal the sloppiness of the new UC admissions system.”8
Overlooked in much of the uproar was the fact that the University of California never claimed that comprehensive review would result in the selection of students with the strongest scholastic credentials, as traditionally defined. After all, maximizing the high school grades or test scores of the entering class could be easily achieved using a computer and would hardly require the detailed consideration of applications mandated by comprehensive review. The comprehensive review process was intended to take into account “accomplishments beyond the classroom that illustrate qualities such as leadership, intellectual curiosity, and initiative.”9 And in fact, the admissions process was very much in line with an explicit policy of the University of California Regents: to “seek out and enroll” a student body that is not only talented but “encompasses the broad diversity of backgrounds characteristic of California.”10
The Moores debacle provides a useful starting point for defining the concepts of fairness and effectiveness of admissions. We need some rough working definitions of these terms, to be explored and refined throughout this book. I use “effectiveness” to refer to the degree to which admissions policies and procedures achieve their intended goals. In the Moores case, for example, one stated goal of UC admissions was to enroll an entering class with a “diversity of backgrounds characteristic of California.” At least in principle, the degree to which diversity goals are attained can be addressed through statistical analysis. Did the ethnic and socioeconomic makeup of the entering class mirror that of the population of California? Was the demographic composition different from what it had been in previous years?
Fairness pertains to whether the goal itself and the means through which it is implemented are ethical and just. For example, should admissions procedures incorporate compensation for past or present social injustice? Is achieving diversity a legitimate goal of university admissions policy? Is it fair to consider a candidate’s contribution to diversity when evaluating his application? If so, how can a candidate’s contribution to diversity be properly evaluated? Is it ever fair to use decision criteria that vary across groups? Obviously, data analyses alone cannot answer these questions. One measure of the depth and complexity of the controversy surrounding fair admissions practices is the fact that, in cases spanning more than 40 years, the Supreme Court has yet to offer a comprehensive, unambiguous ruling on the legitimacy of race-based preferences in admissions.
Collectively, fairness and effectiveness roughly correspond to what is called validity in the educational and psychological testing field. In discussing admissions testing, validity expert Michael Kane recently observed that “the claims made for admissions-testing program[s] go beyond accurate prediction [of success] and involve … assumptions about the overall effectiveness of the program in promoting the goals of the institution and broader social goals. In particular, selection programs generally assume that the attributes evaluated by the testing program involve skills / competencies that are needed for effective performance and are not simply correlated with the [success] measure, that the assessment procedures and the [success] measure are free of any identifiable sources of bias, … and that they have consequences that are, in general, positive for the institution and society.”11 A tall order! In this book, I consider many of the same issues, but with reference to the entire enterprise of college admissions, not only the admissions testing component.
The effectiveness of a selection criterion is a key factor in evaluating its fairness. It is possible to argue that a particular screening criterion is fair if it serves a legitimate goal of the selection process, even if its impact falls disproportionately on certain demographic groups. However, if a criterion has a disproportionate effect and is also demonstrably invalid for its intended purpose, it must be judged unfair.
Consider an example from outside the world of college admissions: the English literacy tests to which some American voters were subjected until they were curtailed by the Voting Rights Act of 1965 and finally banned permanently in 1975.12 These tests had the indisputable effect of preventing disproportionate numbers of minority-group members from voting. However, defenders argued that the tests served a legitimate purpose—to ensure that voters had the skills necessary to understand the voting process—and therefore, the disproportionality of their effects did not invalidate them. In Lassiter v. Northampton Election Board, a 1959 Supreme Court case addressing this issue, the Court held in favor of a North Carolina literacy test, arguing that “the ability to read and write … has some relation to standards designed to promote intelligent use of the ballot.… Literacy and intelligence are obviously not synonymous. Illiterate people may be intelligent voters. Yet, in our society, where newspapers, periodicals, books, and other printed matter canvass and debate campaign issues, a State might conclude that only those who are literate should exercise the franchise.”13
However, the Supreme Court came to an opposite conclusion in Katzenbach v. Morgan, a 1966 case that addressed the legality of literacy tests that prevented large numbers of Puerto Ricans living in New York City from voting. These tests had been prohibited by the Voting Rights Act of 1965, but a lawsuit was filed arguing that the Voting Rights Act itself was unconstitutional. In the Court’s opinion supporting the prohibition of the tests, Justice William Brennan noted that one of several arguments Congress could have considered was that “as a means of furthering the intelligent exercise of the franchise, an ability to read or understand Spanish is as effective as ability to read English for those to whom Spanish language newspapers and Spanish language radio and television programs are available to inform them of election issues and governmental affairs.”14 In other words, not only was the impact of the tests falling disproportionately on certain groups (in this case, native Spanish speakers), but the tests were not functioning effectively as a device for distinguishing poorly informed voters from well-informed ones. Today it is, of course, widely acknowledged that literacy tests were thinly disguised attempts to prevent Blacks and immigrants, among others, from voting. Both the discriminatory nature of literacy tests and their ineffectiveness for their alleged purpose were succinctly summarized in a 1970 Supreme Court case by Justice William O. Douglas, who noted that Congress had “concluded that such tests have been used to discriminate against the voting rights of minority groups and that the tests are not necessary to ensure that voters be well-informed.”15
In recent years some of the SAT’s most vehement critics have made a strikingly parallel argument about the test. “A more reliable way to disguise social selection as academic merit has not been invented,” according to sociologist Joseph A. Soares, who goes on to argue that the SAT doesn’t predict college grades but does correlate with socioeconomic status.16 (We will return to the specifics of these controversial claims later on.) And according to journalist Peter Sacks, “the SAT has proven to be a vicious social sorter of young people by class and race, and even gender—and has served to sustain the very upper-middle-class privilege that many of the exam’s supporters claim to oppose.”17
Similar but less strident language appeared in Facts and Fantasies about UC Berkeley Admissions, the rejoinder to the Moores report prepared by a group of Berkeley faculty members and an array of other entities, including civil rights groups, the testing watchdog organization FairTest, and the Princeton Review Foundation, which is associated with the test-coaching company of the same name. Facts and Fantasies decried Moores’s faith in SAT scores as an index of academic talent, describing the SAT as “an effective tool of social stratification at Berkeley” and noting that the “wealth preference” and racial gaps on the SAT are significantly more extreme than on other admissions criteria.18 Furthermore, according to Facts and Fantasies, the SAT is “a weak predictor of grades” and “has virtually no value in predicting graduation rates” at Berkeley.19 Here again we see the connection between effectiveness and fairness: Heavy reliance on the SAT (the policy attributed to Moores) is claimed to be unfair because of the alleged dual failings of disproportionate impact on certain race and income groups and limited utility for identifying successful students.
This linkage is embedded in the federal legal principles known as disparate impact law. For purposes of evaluating admissions criteria, the particular laws that have historically been invoked are Title VI of the Civil Rights Act of 1964, which prohibits discrimination on the basis of race, ethnicity, and national origin, and Title IX of the Education Amendments of 1972, which prohibits gender discrimination. These statutes apply to programs that receive federal financial assistance.
As described in a briefing report prepared for the Regents of the University of California in 2008, “there is a three-part test for assessing disparate impact complaints. A violation of law may occur if: 1) There is a significant disparity in the provision of a benefit or service that is based on race, national origin or sex; and 2) The practice at issue does not serve a substantial legitimate justification (i.e., is not educationally necessary); or 3) There is an alternative practice that is equally effective in meeting the institution’s goals and results in lower disparities.”20
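Because the quoted test mixes “and” with “or,” its logical structure is easy to misread. The short Python sketch below shows one plausible reading, in which a significant disparity is a threshold condition and either of the remaining two findings is then enough to raise a possible violation. The class and function names are hypothetical, invented here for illustration; they do not come from the briefing report, and the sketch is a reading aid rather than a statement of how any court or agency applies the law.

```python
from dataclasses import dataclass


@dataclass
class PracticeFindings:
    """Hypothetical summary of findings about one admissions practice."""
    significant_disparity: bool               # part 1 of the quoted test
    educationally_necessary: bool             # part 2 asks whether this is False
    equally_effective_alternative_exists: bool  # part 3: alternative with lower disparities


def may_violate(findings: PracticeFindings) -> bool:
    """Assumed reading of the three-part test: part 1 AND (part 2 OR part 3)."""
    if not findings.significant_disparity:
        # Without a significant disparity, the test is not triggered at all.
        return False
    lacks_justification = not findings.educationally_necessary
    return lacks_justification or findings.equally_effective_alternative_exists


# A practice that is educationally necessary can still be open to challenge
# if an equally effective alternative would produce lower disparities.
example = PracticeFindings(
    significant_disparity=True,
    educationally_necessary=True,
    equally_effective_alternative_exists=True,
)
print(may_violate(example))
```

Under this reading, showing that a criterion serves a legitimate educational purpose does not end the inquiry; the existence of a less disparate but equally effective alternative is a separate basis for challenge.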
Although the Civil Rights Act prohibits only intentional discrimination, the federal regulations that have been used to interpret it have, on occasion, allowed selection practices to be challenged in court if they created race-based differences in outcomes, even if no discrimination was intended.21 According to a 2015 legal analysis of the application of Title VI in educational contexts, “a prohibition on disparate impact presumptively in...