I
Background
1
Assessment: An Overview
John Rosenbaum
Ithaca College
The purpose of this chapter is to provide a broad context for discussions about approaches to outcomes assessment. It begins with a brief history of assessment movements in higher education since the early 1900s. Next, the actors involved in five forums are examined: federal agencies, accrediting bodies, national and regional organizations, the states, and institutions. The ongoing debate between assessment's advocates, detractors, worriers, skeptics, and critics is summarized. Three alternative approaches to pencil-and-paper testing are offered: authentic, research, and quality assessment. The chapter concludes with five lessons learned about successful outcomes assessments: (a) begin by clarifying the goals and values of education, (b) analyze practices as well as performances, (c) use multiple approaches, (d) involve everyone, especially faculty, and (e) foster communication.
INTRODUCTION
What is outcomes assessment? Is it the evaluation of student learning we do every day as part of our teaching? Is it accountability for resources that we receive? Is it program review? Self-study? Accreditation? At different times, in different places, for different reasons, outcomes assessment has been all those things and more (see Banta, 1988, 1993b; Johnson, Prus, Anderson, & El-Khawas, 1991).
When the Presidential Task Force on Student Learning and Development at Kean College of New Jersey began exploring the field in the mid-1980s, it concluded that the term outcomes assessment has had no single, universally accepted definition (Presidential Task Force on Student Learning and Development, 1986, p. 12). The same year, administrators and faculty members at Harvard University began setting up an assessment program and concluded that outcomes assessment asks three questions: "What do students know now? How much do they gain when they're here? And how can we evaluate the effectiveness of what we do now with an eye toward constantly improving it?" (A Conversation, 1991).
The questions answered in assessment depend on several factors (including its purpose, the actors, the timetable, the institutional commitment, and the resources allocated), but the most important is the educational philosophy driving the enterprise. "Differences in meaning and understanding of the word assessment tend to be philosophical ones," according to Terenzini (1993), who explained that, "Some view assessment as a public policy vehicle, in that it provides a public accounting for the expenditure of public funds, and as having both political and educational dimensions. Others see no direct links to funding issues and consider the primary purpose of quality assessment to be the enhancement of student learning" (p. 4). On the one hand, outcomes assessment may take the form of an end product: a summative report to an outside agency with the primary purpose of accounting for allocated resources. On the other hand, outcomes assessment may be approached as a process of diagnostic, formative, evaluative activities infused into all aspects of learning with the primary purpose of continuous improvement.
Whether the purpose is summative, formative, or a little of each, methodology is an issue in planning an assessment. Will it be quantitative, qualitative, or a combination of both? Will it be normative, criterion-referenced, or descriptive? Ultimately, no one approach will fit all assessments. That was the conclusion of 12 experts on outcomes assessment invited to a series of six discussions sponsored by the American Association for Higher Education (AAHE) Assessment Forum between 1989 and 1992. The participants agreed that, "There is no one best way of conducting an assessment ... but effective practices do have features in common" (Hutchings, 1993, p. 6). In 1992 they issued the document Principles of Good Practice for Assessing Student Learning (1993), listing nine principles common to all outcomes assessment efforts:
The assessment of student learning begins with educational values.
Assessment is most effective when it reflects an understanding of learning as multidimensional, integrated, and revealed in performance over time.
Assessment works best when the programs it seeks to improve have clear, explicitly stated purposes.
Assessment requires attention to outcomes but also and equally to the experiences that lead to them.
Assessment works best when it is ongoing, not episodic.
Assessment fosters wider improvement when representatives from across the educational community are involved.
Assessment makes a difference when it begins with issues of use and illuminates questions that people really care about.
Assessment is most likely to lead to improvement when it is part of a larger set of conditions that promote change.
Through assessment, educators meet responsibilities to students and to the public.
Erwin (1991) pointed out that the term assessment was used in the fields of industrial psychology and psychometric testing before it took on a life of its own in higher education: "Out of this background, the term assessment has proliferated in higher education. In a sense, its meaning today depends on one's role as state official, accreditor, faculty member, student affairs staff, or student" (p. 15). In order to include all perspectives in this discussion, the term outcomes assessment is broadly defined. Following Erwin (1991) and Marchese (1987), as used in this chapter, outcomes assessment refers to "the process of defining, selecting, designing, collecting, analyzing, interpreting, and using information to increase students' learning and development" (Erwin, 1991, p. 15). This process includes using the information for program review and improvement, as well as to increase student learning.
The purpose of this chapter is to provide a broad context for the specific issues and various approaches to outcomes assessment that are examined in subsequent chapters of this book. The chapter begins with a brief history of assessment movements in higher education. Then it examines the actors involved, the ongoing debate, and some alternative approaches. The chapter concludes with a list of resources for additional information.
HISTORY
Outcomes assessment, whether used for summative or formative purposes, is not a recent trend in higher education. University of Virginia provost Hugh P. Kelly noted that assessment efforts began at his institution more than 150 years ago (Kelly, 1990, p. 1). Many other provosts can make the same statement, as the practice of assessment was typical in 19th-century U.S. colleges. Degree candidates routinely demonstrated their knowledge and speaking ability in senior declamations, often in public. These performances were intended to display the sum total benefit of the college experience. They were the outcomes assessment instruments of the day.
By the end of the 19th century, a combination of the elective system, the growth in the number of courses, and the larger numbers of students made it difficult to administer individual comprehensive senior exams. A typical response was to make evaluation part of each course that students took. Therefore, when students passed the required number of courses, they received their degrees without the hurdle of senior declamations. However, not everyone agreed with this change. Some educators entered the new century arguing that more than "credit" was needed for the degree.
Assessment movements in higher education since the turn of the century fall roughly into three periods: (a) 1918 to 1952, (b) 1952 to 1975, and (c) 1975 to the present. Each period has been marked by surprisingly similar expansions, complaints, demands, studies, and responses.
1918 to 1952
From 1918 to 1952, the percentage of 18- to 24-year-olds enrolled in higher education almost doubled, from 3.6% to 7.1% (Sims, 1992). This growth was accompanied by complaints about overcrowding, inadequate abilities of students, chaotic curricula, and lack of assessment. There followed the requisite studies, such as the Pennsylvania Study in 1928, the Cooperative Study in General Education from 1939 to 1944, and the Cooperative Study of Evaluation in General Education in 1950. This period saw the birth of national standardized testing by organizations such as Educational Testing Service (ETS) in 1948. Institutional responses included the revival of comprehensive exams and revised curricula, such as the General College of the University of Minnesota in 1932 and the Basic College of Michigan State University in 1944.
1952 to 1975
From 1952 to 1975, there was an even greater increase in the percentage of 18- to 24-year-olds enrolled in higher education, to about 40.5%. The number of accredited colleges and universities increased from about 2,000 to more than 3,000 (Sims, 1992).
In addition to the growth in numbers of students, another factor that made this a period of dramatic change was the new types of students attending college. The student body grew older, more diverse, and more vocal. These students demanded relevant courses and curricula. The widespread response was to reduce common core requirements and replace them with distribution requirements and elective courses. This response to student demands was influenced by a rising tide of consumerism in education and increasing competition among institutions.
This period of rapid expansion and change was accompanied by massive federal and state expenditures under programs such as the National Defense Education Act of 1958, the Higher Education Facilities Act and the Vocational Education Act of 1963, the Higher Education Act of 1965, and the Amendments to the 1965 Higher Education Act in 1972. It was common for all the federal agencies to require some form of evaluation of the projects they funded.
This activity spawned a number of assessment minimovements during the late 1960s and early 1970s. One was the educational accountability movement (Hogan, 1992). Bowen (1974) said the demand for accountability resulted from a lack of confidence that a college education was worth its increasing cost. Educational accountability added economic considerations (such as efficiency, cost-benefit analysis, and consumer protection) to evaluation, which primarily looked at educational effectiveness. "Accountability seems to be concerned more with end results and less with process or means, has more a financial and efficiency focus, is more of a public operation (like an audit by an external agency), and carries a greater implication of finality, of hard judgments about total programs (rather than of trying to improve on existing ones)" (Peterson, 1971, p. 16). From this point of view, outcomes assessment was a reporting instrument used only to ensure accountability.
However, it was argued that educational accountability could be more than simply an end report. Lessinger (1970), who has been called the father of educational accountability, said that, "In its most basic aspect, the concept of educational accountability is a process designed to insure that any individual can determine ... if the schools are producing the results promised. ... Like most processes that involve balancing of inputs and outputs, educational accountability can be implemented successfully only if educational objectives are clearly stated before instruction starts" (p. 4). I return to this discussion of accountability later in the chapter.
Another minimovement during this period was the value-added approach. Like educational accountability, value-added analysis derived from economics and focused on the increased value of raw materials after they go through a production process. "The notion of value-added derived from this model ... involves acceptance of the production process analogy" (Ewell, 1987, p. 7). In this process "assessments needed to account for inputs (students' background, prior achievement, aptitude, and interests) and for college effects (the curriculum and extracurriculum) to determine outcomes (progress, persistence, performance, and learning)" (Ratcliff & Jones, 1993, p. 256). In value-added assessment an institution's quality was based on the degree of change that it made in its students and not on their performance level (Astin, 1987).
A different concept of assessment promoted during the late 1960s and early 1970s was institutional vitality. McGrath and his colleagues at Teachers College, Columbia University, defined institutional vitality as "first, educational effectiveness in terms...