TWO
What is meant by 'evaluation'?
Later in this chapter we shall be considering various definitions or interpretations of the concept 'evaluation' in order to see how it differs from other forms of assessment in the public sector. 'Interpretation' is probably the more accurate term to apply since this allows for nuances of meaning which are attached to the concept, rather than 'definition', which implies that there is only one meaning. However, 'definition' will be used at appropriate points in the discussion, because writers on this topic usually offer their interpretations as if they are indeed the meaning.
As noted in Chapter One, during the 1980s and 1990s in the UK the government needed to ensure that public expenditure was allocated to services on the basis of value for money – in short, where was the evidence that projects, programmes and, in particular, policies had achieved their objectives at the least possible cost? Some of these objectives could be expressed in terms of outcomes but it was also considered essential to concentrate efforts on attaining quantifiable targets which were labelled 'performance indicators' (PIs). Semantically, the term 'indicator' would seem to be well-chosen. An indication of achievement implies that the chosen indicator might not tell the whole story; that there might only be a prima facie connection between some quantifiable index and 'performance'. The temptation has been to use PIs not only as proxy measures of successful policies, but as if they were the only measurements that were necessary. We shall now deal with a number of approaches to and techniques of performance measurement and, later in this chapter, we shall examine whether there is consensus about the nature and purpose of evaluation.
Measurement of performance
How does evaluation differ from other forms of assessment in the realm of public sector performance? Are the criteria for judging merit that are applied in auditing, inspection and accreditation less rigorous or less valid than those used when carrying out formal evaluations? Are the results recorded in audit, inspection and accreditation reports accepted more readily by decision makers than those generated by evaluation research? If they are, is it because the data they set out to gather are more superficial and more easily accessed than the information sought by evaluators? We look first at the most readily quantifiable measures, PIs: those devices that governments assume will provide a clear answer to the question: are government-prescribed targets – as proxy measures of service quality – being achieved by those organisations responsible for delivering services?
Performance indicators
A relatively early attempt at defining evaluation came from Weiss (1972), who referred to it as 'judging merit against some yardstick' (p 1) while, even earlier, Scriven (1967) identified two kinds of evaluation: formative and summative. Before turning to the plethora of evaluation 'types' and 'models' which form the content of numerous academic publications, the succinct definition offered by Weiss prompts the question of how evaluation differs from other attempts to judge merit against some yardstick. It all depends on what is meant by 'merit'. It also depends on the purpose of the formal assessment. Weiss's definition relates only to the process of evaluation. It is fairly easy to define the physical process of 'walking', but the decision to walk rather than use another form of transport might have several different purposes, such as keeping fit, helping to save the environment or saving money. In the context of public sector services, various methods have been implemented in order to assess performance depending on the key purpose of the judgement. The fact that PIs hinge on quantifiable measures of worthiness or value does not detract from their use as a key piece of equipment in the evaluation toolkit.
This is particularly relevant when they feature in sets of comparative data in which performance is displayed alongside similar organisations, departments or in comparison with previous annual statistics. Annual house-building figures, incidence of specific diseases across countries and UK regions, deaths from road accidents, clear-up rates for certain criminal offences – all these statistics can tell policy makers about vital matters relating to efficiency and cost-effectiveness. They rely, of course, on the sources of data being wholly reliable. Yet, to err on the cynical side perhaps, any yardstick can be used in order to judge merit. Exactly what the resulting data shows us could, however, be challenged on economic, ethical and even mathematical grounds.
For example, adapting the methodology applied by NICE, a reduction in instances of football hooliganism over a certain period appears highly commendable, but at what cost was this result achieved? This additional yardstick of cost adds another dimension to the slogan 'what works?' The policy or practice might be worthy, but was it worth it? As the economist would suggest: could the resources needed to achieve very commendable figures be used to greater effect elsewhere – what were the 'opportunity costs' involved? In addition, could the same results have been achieved by adopting a different strategy which involved the use of fewer resources? Taking this broader view, it is obvious that PIs are very useful as 'tin openers' (Carter et al, 1992) because they can, and arguably should, be used to open up important questions about what they are actually telling us about 'performance' as construed from different perspectives.
This is the value of PIs at their best in the context of service evaluation. They can act as the starting point for closer and deeper analysis that is intended to provide answers to what might be called 'the so-what? question'. Here are some examples of statements that might be encountered:
- Independent schools produce more Oxbridge entrants than state schools.
- The number of students obtaining A grades in exams has been increasing year on year.
- Recently, girls have been out-performing boys at pre-university level examinations.
We can leave readers to draw their own collection of prima facie inferences from these bald statements. At worst, those with a particular point to make can interpret such statements according to their own version of 'tub-thumping', thus fulfilling the old adage that 'statistics can be used like a drunken man uses a lamp-post: for support rather than for illumination' (Lang, quoted in the Wordsworth Dictionary of Quotations, 1997, p 277). Unfortunately, PIs are frequently used to support a particular standpoint or ideological premise. This abuse of data is at its most malign when it purports to make comparisons between public sector organisations such as hospitals, schools and social communities. The manipulation of unscrutinised numbers can lead to unethical judgements being made about professional competence and human behaviour.
The mathematical fallacy arises when quantitative data are averaged in order to create and promote official government targets that carry sanctions if they are not met. Witness the case of a hospital in Stafford which, in 2010, in slavishly seeking to reach externally imposed 'performance' targets, failed to provide anything like good quality care to very vulnerable patients (Triggle, 2010). While the culture of target-driven public services is still very much alive, at least in England, there has within the NHS been a move towards emphasising quality above quantity (Darzi, 2008). In summary, performance indicators can be a legitimate starting point for service evaluation but cannot and should not be the finishing point.
Before offering further definitions of evaluation, it will be helpful to consider whether other approaches to measuring performance in the public sector can be characterised as different forms of evaluation.
Audits
In the public sector all organisations wholly or partly funded by central government are subject to regular auditing. Auditors appointed by the government are tasked with inspecting the organisations' accounts in order to ensure that the proper procedures have been carried out in accounting for income and expenditure on behalf of the public, who in essence foot the bill. In 1983, following the Local Government Act of 1982, and at the behest of a Conservative administration, the Audit Commission was established. The Commission's role was to monitor public sector performance by applying the three 'E's as the main criteria: economy, efficiency and effectiveness. Thus, the remit of the Commission was much wider than the auditing of accounts and entered the territory of value for money (VFM) (Simey, 1985; Whynes, 1987).
However, in August 2010 the newly elected Coalition government under a Conservative prime minister announced the dissolution of the Audit Commission, as those in power considered it to be wasting taxpayers' money on what were deemed to be unnecessary projects. This decision was a truly classic example of Juvenal's enquiry: 'Quis custodiet ipsos custodes?' ('Who will guard the guards themselves?'). The Audit Commission was, one might say, informally audited and found not to be providing value for money. Scotland, Wales and Northern Ireland have retained their Audit Offices with the primary aim of ensuring value for money in the public sector but also having the responsibility to monitor the accounts of private companies if they have received public money.
The increase in auditing by central agencies during the 1980s and 1990s has been described by Power (1997) as an 'audit explosion'. In Power's view, traditional auditing focused on financial regularity and the legality of transactions, but recently there has been a greater emphasis than previously on efficiency, effectiveness and performance. However, this explosion, Power argues, has had some perverse unintended effects including:
- a decline in organisational trust
- defensive strategies and 'blamism'
- a reduction in organisational innovation
- lower employee morale
- the hidden costs of audit compliance as organisations enga...