—Arnold J. Toynbee
The extent to which findings and conclusions from evaluations of programs are used to inform decision making has received substantial attention in the literature.1 There is considerable debate over how much utilization has occurred or does occur.2 There are also frequent laments that utilization is at best limited, and considerable advice is offered on how utilization can be increased.
This chapter examines evaluation utilization in organizations, using the extensive experience gained in the Canadian federal government with evaluation units in departments. The chapter has two purposes: to present data and insight gained from evaluation practice in Canadian federal departments and agencies and to propose a conceptual supply/demand model to explain the nature and extent of evaluation utilization in organizations. The model suggests that evaluation utilization in an organization is best enhanced by evaluators proactively balancing the demand for evaluative information with its supply.
Evaluation Utilization
It is useful to distinguish between two basic types of evaluation and their use. Historically, evaluation was a component of social science research and much evaluation practice continues in this tradition. Utilization of much of this type of evaluation is often referred to as enlightenment utilization (Weiss 1977; Bulmer 1982); that is, over time the cumulative effects of scientific findings have an impact on our understanding of interventions in society and thereby affect the kinds of programs introduced to deal with social problems. Without denying the value of such research-type evaluation, it is not the subject of interest here. Rather, we are looking at evaluation that has as its aim a more immediate effect on programs,3 what we will call "program evaluation" as opposed to "evaluation research."4
A program evaluation is commissioned by an organization with the aim of determining how well a program is working, and whether it needs to be improved or abandoned. Evaluation research, in contrast, is usually "commissioned" by the researcher doing the evaluation. Accumulated learning and enlightenment over time can and do occur as a result of program evaluations, as in most fields of investigation and research. But accepting this as adequate utilization in a field that is not basic research is unsatisfactory. Expectations for program evaluation are and should be much more demanding; we should aim for the performance of the program evaluated to actually improve as a result.
We are particularly interested in organizations that have an internal evaluation (Love 1991) capacity. In these cases, the organization has established a unit in the organization to undertake evaluation of its programs. The expectation for these evaluations must be program improvement: the organization expects to learn from evaluation and adapt itself accordingly. The organization has accepted, at least in part, the need to be self-evaluating (Wildavsky 1985), undertaking either single- or double-loop learning, that is, either modifying existing program operations or revising the objectives of programs.5 While not every evaluation study need make a significant impact, over a period of time one should expect that internal evaluation has played a significant role in program improvement. Rist (1990) points out that utilization of evaluation can take place throughout the life of a program, reflecting formative, process, and outcome types of evaluation. In all cases, however, the intention is still the improvement of the program. Most of the data presented later comes from process and outcome evaluations.
It is clear that utilization of evaluations in an organization can occur in a number of ways, and there is no universally agreed typology of use. For the purpose of this chapter, utilization of program evaluations is broken down into two basic types, program use and organizational use. Program use relates to individual evaluation studies while organizational use refers to the cumulative effect on the organization of routinely carrying out evaluations over time. These terms are first defined and then discussed.
Program use (instrumental) can be defined as occurring when there is a documented instance of a specific program change made as a result of the evaluation (modifications to operations, significant reforms, or termination) or the program is explicitly confirmed as it is.
Program use (conceptual) can be defined as occurring when there is cumulative evidence that, while no specific change is made to the program, the evaluation nonetheless yields better information, increased understanding of, and/or better reporting on the program and its performance; in short, intellectual capital has been created.
Organizational use can be defined as occurring when there is longitudinal evidence that a cumulative effect of the evaluation function exists in the organization over time, resulting in a more results-oriented approach to management and planning, including more and better use of program performance information.
Instrumental use, as defined here, requires a specific implemented decision about the program made as a result of the evaluation.6 It would also include changes to other programs that resulted from the specific evaluation study. In addition, it is used here to cover the case where the evaluation demonstrates that the program is working well, no (significant) improvements need be made, and the program is consciously reconfirmed. Conceptual use covers the cases where, for whatever reason, decisions about the program are not forthcoming but the findings have been considered and valuable information and insight have been acquired about the specific program that are likely to be used in the future. It covers a more nebulous type of utilization, and in the absence of instrumental use, most evaluations are likely to claim conceptual use. However, where possible, it is useful to distinguish it from the case where no meaningful utilization occurs from the evaluation: it is just put on the shelf and forgotten. Note that, as used here, conceptual use obviously occurs in all the program use cases.
Organizational use would occur after an organization has been evaluating its programs for some time. The repeated focus on identifying, measuring, and reporting on program objectives, results, and performance, and the involvement of line managers in the evaluations, have resulted in an (increased) acceptance of the usefulness of monitoring and evaluating program performance and of a results-orientation in the management of the organization. This type of organizational impact would imply that the culture of the organization has changed to one more accepting of the need for empirical evidence on the results of programs to manage well. Single- and double-loop learning are becoming part of the organizational culture. For both instrumental and organizational use, it is recognized that both the process of evaluation, that is, the carrying out of evaluation activities per se, and the findings, conclusions, and recommendations from evaluations can lead to change.
Evaluation utilization is difficult to demonstrate. In trying to attribute to the evaluation the changes that subsequently occur, one is faced with a classic evaluation problem (Smith 1988: 16). Evaluations do not take place in a controlled setting and there are numerous other factors at work which also can legitimately lay claim to "causing" the resulting changes. Program managers, in particular, may wish to claim that they were going to make the changes anyway and that the evaluation did not play a significant role. Indeed, it is usually the case that change will occur in an organization only if there are other factors at work in addition to the evaluation, with the evaluation playing at best a catalyst role. Lindblom and Cohen (1979) have aptly discussed this reality. Most decisions in organizations are based on ordinary knowledge. Knowledge from social science research usually only supplements this.7
There are also serious measurement problems. One can undertake a case study to try to determine the impacts of a particular evaluation and the role played by the evaluation. This in itself is a challenging task given the variety of types of utilization that can be sought. But often there is more interest in a global picture of evaluation utilization over time in an organization. Providing detailed case studies on each evaluation usually is not practical. More global assessments of the impacts of evaluation, however, mean the attribution problem is not addressed.
The result is that it is usually not possible to confirm definitively the "true" impact of evaluation in an organization. Of course, this is the situation normally faced by evaluators. We can at best produce reasonable evidence on utilization and thereby increase the understanding of evaluation practice in an organization. If programs are being improved subsequent to evaluation, many will claim the credit. Determining the value of evaluation utilization will help to clarify those claims.