Chapter 1
Assessment Reform: Promises and Challenges
Nidhi Khattri
Pelavin Research Institute
David Sweet
U.S. Department of Education
Developing non-multiple-choice methods of assessing student performance has become a major, albeit controversial, part of the education reform movement currently sweeping the nation. Knowledgeable individuals on both sides of the assessment controversy have put forth arguments for and against performance assessment, arguments that are even more salient today than they were only two years ago, as the call for assessment reform has attained what can only be called bandwagon status. In educational circles, the term performance assessment has, in fact, become a buzzword for change.
With the passage of the Goals 2000: Educate America Act, the assessment of student performance is, in many states, coming to the forefront of education reform. It is likely that the country will witness a proliferation of non-multiple-choice, performance-based assessments to be used not only for pedagogical, but also for accountability and certification purposes. The new legislation and the ongoing discussion about the various facets of education reform, including assessment reform, underscore the fact that we are witnessing a period of education ferment. It thus has become increasingly important to address, with intellectual and practical seriousness, questions regarding the purposes of assessments, the contexts in which assessments are implemented, their linkages to systemic reform ventures, and their technical qualities.
This introduction summarizes the history of the performance assessment movement, outlines the relationship of assessment reform to broader reform issues, and highlights the technical questions being raised about the assessments themselves. Much like the proposed assessments, the picture brought into focus is multidimensional, complex, and messy. The remaining chapters clarify and elaborate upon some of the more pressing concerns only touched upon in this chapter.
A BRIEF HISTORY OF THE PERFORMANCE ASSESSMENT MOVEMENT
Performance assessment is not an entirely new assessment strategy in American education. Essays, oral presentations, and other kinds of projects always have been a feature of elite private education; and in many classrooms, private and public, teachers for a long time have assessed student progress through assigned papers, reports, and projects that are used as a basis for course grades. On the national level, the Advanced Placement Program of the College Board from its inception has assessed students by requiring at least one written essay in addition to responses to multiple-choice questions (as well as laboratory experiments in the sciences and demonstrations in music).
What is new in the current reform movement is its emphasis on the use of performance assessments for systematic, school-wide, instructional and curricular purposes and its spread into accountability and certification. In many instances, in fact, proponents of performance assessments view assessments themselves as the lever for systemic curricular and instructional reforms at any level of the educational hierarchy. Theoretical writings, such as articles by Wiggins (1989, 1991), and descriptions of programs, such as Wolf's (1989, 1991) discussions of activities in ARTS PROPEL in the Pittsburgh Public Schools, have had an enormous influence in this regard, especially on practitioners.
As discussed in other sections of this chapter, the controversy centers not on the use of assessments for primarily pedagogical purposes but on their use for accountability and certification, the so-called "high-stakes" purposes. The chapter by Daniel P. Resnick and Lauren B. Resnick also details the functions of educational measurement.
Performance assessment, as the term currently is being used, refers to a range of approaches to assessing student performance. These new approaches are variously labeled as follows:
- Alternative assessment is intended to distinguish this form of assessment from traditional, fact-based, multiple-choice testing;
- Authentic assessment is intended to highlight the real-world nature of tasks and contexts that make up the assessments; and
- Performance assessment refers to a type of assessment that requires students to actually perform, demonstrate, construct, or develop a product or a solution under defined conditions and standards.
Regardless of the term used, according to Mitchell (1995), performance assessments imply ". . . active student production of evidence of learning, not multiple-choice, which is essentially passive selection among preconstructed answers" (p. 2).
Thrust for Reform
The present focus on performance assessments as a systematic strategy of public education reform owes its origins to three related phenomena, all gaining momentum during the late 1980s: (a) the reaction on the part of educators against pressures for accountability based upon multiple-choice, norm-referenced testing; (b) the development in the cognitive sciences of a constructivist model of learning; and (c) the concern on the part of the business community that students entering the workforce were not competent enough to compete in an increasingly global economy.
In 1983, A Nation at Risk was widely interpreted as a clarion call for school systems to tighten their curricula, and such tightening resulted in widespread testing for accountability. Most school systems came to rely upon the use of norm-referenced, multiple-choice tests for school accountability, and this phenomenon came to have a considerable amount of influence on teaching and learning in the classroom. Classroom teachers felt the pressure to prepare their students to do well on such tests and accordingly modified their approach to teaching. "Teaching to the test" thus became an increasingly popular pedagogical strategy.
Multiple-choice tests were based on a behaviorist model of education: on the assumption that learning of almost any kind occurs in small increments, from simple to complex ideas and skills, and that discrete aspects of knowledge could be decontextually tested. The inadequacies (and, from many educators' viewpoint, pernicious effects) of such testing models were subsequently highlighted by research (e.g., Oakes, 1985, 1990; Cannell, 1987, 1989), causing many educators to rethink their accountability strategies.
Concurrent with such trends within the education system, the demands from outside the education system for more sophisticated thinking skills provided the fuel for the rebellion against the widespread use of multiple-choice tests. Many reformers argued, then, that multiple-choice, norm-referenced testing had assumed a disproportionate importance in the classroom, often displacing other, more pedagogically sound, practices in assessing for teaching in favor of teaching for testing.
At the same time, insights from the constructivist model of cognition began to transform educators' thinking about teaching and learning. According to this model, learning takes place when new information or experience is absorbed into or transforms preexisting mental schemata. The mind seeks to make sense of new information by relating it to prior information, thus establishing the meaning of new information within the context of old information. Furthermore, the model postulates, the search for meaning may motivate individuals to acquire further knowledge and skills. Thus, the following corollary related to this view of learning simultaneously gained currency in the reform movement: Because an individual constructs knowledge in his or her own way, a customized rather than a mass approach to education is necessary to enable him or her to achieve high standards.
Educators came to believe that, in order to strengthen all students' educational experiences and to better meet all students' needs, assessments are required that concurrently allow for an understanding of students' learning processes and knowledge base and that support variations in pedagogy. In addition, advocates of performance assessments suggest that the use of performance assessments will have salutary effects on student motivation and learning; because performance assessments stress interdisciplinary skills and use contextualized assignments (i.e., assignments that mimic the kinds of multifaceted problems one encounters outside the classroom), students are more likely to be involved in attempting and completing these assessments.
Add to these trends the voices of business and industry executives demanding that their employees be able to think creatively, solve problems, write well, work flexibly, and possess the social competencies needed to operate in groups. The Secretary's Commission on Achieving Necessary Skills (SCANS), after an extensive survey of the business community, reported, "Employers and employees share the belief that all workplaces must work smarter" (italics added, p. v). SCANS concluded that for a workplace to work smarter, its employees must possess certain competencies, such as interpersonal skills, and foundation skills, such as basic skills in reading, writing, and thinking. Such pressures added up to the widespread consideration of assessment reform as part of a solution to the problem of the incompetent worker.
Given this ammunition, education reformers insisted that, in order to function as a lever of education reform, assessments must: (a) be based on a generative view of knowledge; (b) require active production of student work (not passive selection from prefabricated choices); and (c) consist of meaningful tasks, rather than of what can be easily tested and easily scored. What follows are the different types of assessments that meet one or more of these requirements.
Current Performance Assessments
Performance assessments in use today can roughly be characterized as follows:
- Portfolios that consist of collections of a student's work and developmental products, which may include drafts of assignments;
- On-demand tasks, or events, that require students to construct responses (either writing or experiments) to a prompt or to a problem within a short period of time. These tasks are akin to short demonstration projects;
- Projects that last longer than on-demand tasks, and are usually undertaken by students on a given topic and used to demonstrate their mastery of that topic;
- Demonstrations that take the form of student presentations of project work; and
- Teachers' observations that gauge student classroom performance, usually designed for young children, and primarily used for diagnostic purposes.
All performance assessments require students to structure the assessment task, apply information, construct responses, and, in many cases, explain the process by which they arrive at the answers. (Performance assessments are never multiple-choice, but many states [e.g., Kentucky, Maryland, Vermont] combine multiple-choice tests with performance assessments.) Student answers on performance assessments are rated using agreed-upon rating criteria and standards, usually in the form of scoring rubrics, by groups of scorers or raters or by individual teachers.
In theory, this process generates a wealth of information about the student that can be used for instructional purposes. Such information might shed light on the student's understanding of the problem, his or her involvement with the problem, his or her approach to solving the problem, and his or her ability to express himself or herself. In sum, proponents argue that these assessments will motivate and involve students in the learning process itself; performance assessments will help students establish a meaningful context for learning, develop writing and conceptual skills, and, therefore, achieve higher levels of desired outcomes.
PREVALENCE OF THE PERFORMANCE ASSESSMENT MOVEMENT
A review of the prevalence of performance-based student assessment strategies is perhaps best organized by their level of initiation: national, state, district, or school. Although this taxonomy is, in some ways, artificial, it nonetheless helps us to impose order on and to understand better an otherwise unwieldy situation.
National Level
National nongovernmental and governmental involvement in assessment reform shares the limelight with state-level efforts. Several nongovernmental projects tackling assessment, curricular, and instructional reform have gained national prominence in recent years. For example, the New Standards Project (NSP) and the Coalition of Essential Schools (CES) have exerted considerable influence on education administrators and teachers across the nation and prompted a shift to performance-based assessments.
The NSP began in 1991, with the aim of reinvigorating and revamping American education (Resnick & Simmons, 1993). The crux of NSP's work involves establishing performance standards and designing curricular, instructional, and assessment strategies. The NSP Board, which guides the formulation of performance standards and assessment strategies, is composed of representatives from NSP's partner states and districts and from professional organizations, such as the National Council of Teachers of Mathematics (NCTM), the American Association for the Advancement of Science (AAAS), and the National Council of Teachers of English (NCTE). The NSP program lists 17 state and 6 urban district partners.
The NSP assessment system is being formulated for Grades 4, 8, and 10. The fully articulated system will consist of student portfolios that will contain NSP-recommended matrix-sampled tasks requiring extended responses, exhibitions, projects, and other student work. The NSP piloted a number of its assessment tasks in 1992, 1993, and 1994, in its partner states and districts. Classroom teachers and content area specialists scored these pilot tests, using established scoring rubrics at national scoring conferences. NSP projected that the first valid, reliable, and fair exams would be available for use in mathematics and in English language arts by 1994-1995, in applied learning by 1995-1996, and in science by 1996-1997.
The CES also is a national force in its own right. It was established in 1984, at Brown University, as a school-university partnership to help redesign schools. Coalition members include 150 schools that are actively involved in reform.2 The reform work of the member schools is guided by a set of nine Common Principles, the sixth of which pertains to assessment. The sixth principle states that students should be awarded a diploma only upon a successful demonstration (an exhibition) of having acquired the central skills and knowledge of the school's program. As the diploma is awarded when earned, the school's program proceeds with no strict age grading and with no system of earned credits by time spent in class. The emphasis is on the students' demonstration that they can do important things (The Common Principles of the Coalition of Essential Schools; see Sizer, 1989). Several member schools, like Walden III in Racine, Wisconsin, and Capshaw Middle School, in Santa Fe, New Mexico, have fashioned their graduation requirements on this principle.
Performance assessments on the national level have always been a feature of the College Board's Advanced Placement (AP) Program, especially the Studio Art Portfolio Evaluation, which has no written or multiple-choice portions. The Evaluation, in fact, is an example of a well-established national portfolio examination (Mitchell, 1992).
Now, the College Board has launched another assessment development effort. The College Board's Pacesetter program is being designed as a national, syllabus-driven examination system for all high school students, modeled on the AP examinations, which (with the exception of Studio Art) contain both multiple-choice and partially open-response items. The Pacesetter design incorporated two forms of assessments: classroom assessments, scored by teachers trained to Pacesetter standards, and end-of-course a...