Validity Generalization

A Critical Review

Kevin R. Murphy

eBook - ePub

  1. 464 pages
  2. English
  3. ePUB (mobile-friendly)
  4. Available on iOS and Android

About This Book

This volume presents the first wide-ranging critical review of validity generalization (VG)--a method that has dominated the field since the publication of Schmidt and Hunter's (1977) paper "Development of a General Solution to the Problem of Validity Generalization." This paper and the work that followed had a profound impact on the science and practice of applied psychology. The research suggests that fundamental relationships among tests, criteria, and the constructs they represent are simpler and more regular than they appear. Looking at the history of the VG model and its impact on personnel psychology, top scholars and leading researchers of the field review the accomplishments of the model, as well as the continuing controversies. Several chapters significantly extend maximum likelihood estimation within existing models for meta-analysis and VG. Reviewing 25 years of progress in the field, this volume shows how the model can be extended and applied to new problems and domains. This book will be important to researchers and graduate students in the areas of industrial and organizational psychology and statistics.


Information

Year
2013
ISBN
9781135638344
1
The Logic of Validity Generalization
Kevin R. Murphy
Pennsylvania State University
A few minutes in any college library is enough to illustrate one of the important characteristics of research in the behavioral and social sciences (i.e., that the number of books, papers, chapters, and reports published in these areas is simply enormous). For example, the various journals of the American Psychological Association publish tens of thousands of pages of peer-reviewed studies each year. The sheer volume of published work often makes the task of summarizing, integrating, and making sense of this research daunting. For example, a recent keyword search of the PsychInfo database using the term attitude change returned 1,800 citations. A search using the term psychotherapy returned more than 54,000 citations. In industrial and organizational psychology, a similar phenomenon has been noted, especially in the area of selection test validity. There have been thousands of studies examining the validity and utility of tests, interview methods, work samples, systems for scoring biodata, assessment centers, etcetera (e.g., a PsychInfo search using the term personnel selection yielded more than 2,300 citations), and the task of interpreting this body of research is a challenging one.
For much of the history of personnel psychology, the task of interpreting this literature fell on the authors of textbooks and narrative reviews (notably Ghiselli, 1966, 1970). Throughout the 1960s and 1970s, reviews of literature on the validity and utility of tests and other selection methods highlighted two recurrent problems: relatively low levels of validity for tests that seemed highly relevant to the job, and substantial inconsistencies in validity estimates from studies that seemed to involve similar tests, jobs, and settings. This pattern of findings led personnel psychologists to conclude that it would be difficult to predict or determine what sorts of tests might or might not be valid as predictors of performance in a particular job, that the validity of particular tests varied extensively across settings, organizations, etcetera, even when the essential nature of the job was held constant, and that the only way to determine whether a test was likely to be valid in a particular setting was to do a local validity study. The application of meta-analytic methods, and in particular the validity generalization model, to these same validation studies has led to some very different conclusions about the meaning of this research. Applications of meta-analysis, and particularly validity generalization analyses, to studies of the validity of tests, interviews, assessment centers, and the like have led to the conclusions that (1) professionally developed ability tests, structured interviews, work samples, assessment centers, and other structured assessment techniques are likely to provide valid predictions of future performance across a wide range of jobs, settings, etcetera; (2) the level of validity for a particular test can vary as a function of characteristics of the job (e.g., complexity) or the organization, but validities are often reasonably consistent across settings; and (3) it is possible to identify abilities and broad dimensions of personality that are related to performance in virtually all jobs (for reviews of research supporting these points, see Hartigan & Wigdor, 1989; Hunter & Hunter, 1984; McHenry, Hough, Toquam, Hanson, & Ashworth, 1990; Nathan & Alexander, 1988; Ree & Earles, 1994; Reilly & Chao, 1982; Schmidt & Hunter, 1999; Schmidt, Hunter, & Outerbridge, 1986; Schmitt, Gooding, Noe, & Kirsch, 1984; Wigdor & Garner, 1982; for illustrative applications of VG methods, see Callender & Osburn, 1981; Schmidt, Hunter, Pearlman, & Shane, 1979). Schmidt and Hunter (1999) reviewed 85 years of research on the validity and utility of selection methods and concluded that cognitive ability tests, work samples, measures of conscientiousness and integrity, structured interviews, job knowledge tests, biographical data measures, and assessment centers all showed consistent evidence of validity as predictors of job performance.
The purpose of this chapter is to discuss the methods used by researchers to study the cumulative literature in areas such as test validity, and in particular, to lay out the logic behind the methods used in research on validity generalization (VG). Research on validity generalization is based on an integration of meta-analysis and psychometric theory, and in order to understand the methods and results of VG research, it is important to examine the method and its logic in some detail.
Methods of Meta-Analysis
The problem of making sense of the outcomes of hundreds or thousands of studies is in many ways similar to the problem of making sense of the data collected in any particular study. For example, if you conduct a study in which 200 subjects each complete some task or measure, the first step in making sense of the data you have collected is often to compute a variety of statistics that both describe what you found (e.g., means, standard deviations) and lend support to inferences you might make about what those data mean (e.g., confidence intervals, significance tests). One of the key insights of methodologists in the 1970s and 1980s was that the same approach could also be applied to the problem of making sense of a body of research. That is, if you wanted to make sense of the results of 125 different validation studies, each of which reported the correlation between some test and some measure of performance, one thing you would probably do would be to compute the mean and the standard deviation of the validities across studies. Many of the current methods of meta-analysis take a more sophisticated approach to the problem than simply computing the average across all studies (e.g., they might weight for sample size), but the starting point for virtually all methods of meta-analysis is essentially to compute some descriptive statistics that summarize key facets of the research literature you hope to summarize and understand. Differences in approaches to meta-analysis start to emerge as we move from descriptive statistics (i.e., what happened) to inferential ones (i.e., what does this mean).
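To make this starting point concrete, here is a minimal sketch (in Python, with hypothetical study results) of the descriptive first step just described: a sample-size-weighted mean and standard deviation of observed validity coefficients.

```python
# Minimal sketch of the descriptive first step of a meta-analysis:
# an N-weighted mean and SD of observed validity coefficients.
# The study results below are hypothetical.

validities = [0.28, 0.41, 0.19, 0.35, 0.22]   # observed r from each study
sample_sizes = [45, 120, 38, 80, 60]           # N for each study

total_n = sum(sample_sizes)
mean_r = sum(n * r for n, r in zip(sample_sizes, validities)) / total_n
var_r = sum(n * (r - mean_r) ** 2
            for n, r in zip(sample_sizes, validities)) / total_n

print(f"weighted mean validity: {mean_r:.3f}")
print(f"weighted SD of validities: {var_r ** 0.5:.3f}")
```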
The term meta-analysis refers to a wide array of statistical methods that are applied to the outcomes of multiple studies to describe in some sensible fashion what these studies have typically found, and draw inferences about what those findings might mean. Validity generalization represents a specialized application of meta-analysis that attempts to integrate both psychometric and statistical principles to draw inferences about the meaning of the cumulative body of research in a particular area (this method is sometimes also referred to as psychometric meta-analysis). In particular, validity generalization analyses attempt to draw inferences about the meaning of a set of studies, each of which has attempted to draw conclusions about fundamental relationships among the constructs being studied on the basis of imperfect measures, finite samples, and studies that vary on a number of dimensions (e.g., the level of reliability of the measures used).
There are a number of methods of quantitatively summarizing the outcomes of multiple studies, any or all of which might be referred to as meta-analysis. For example, Rosenthal (1984) developed methods of combining the p values (i.e., the probability that experimental results represent chance alone) from several independent studies to obtain an estimate of the likelihood that a particular intervention, treatment, etcetera has some effect. Glass, McGaw, and Smith (1981) developed methods of combining effect size estimates (e.g., the difference between the experimental and control group means, expressed in standard deviation units) from multiple studies to give an overall picture of how much impact treatments or interventions have on key dependent variables. Schmidt and Hunter (1977) developed methods of combining validity coefficients (i.e., correlations between test scores and criterion measures) from multiple studies to estimate the overall validity of tests and other selection methods. Several variations on the basic VG model proposed by Schmidt and Hunter have been reviewed by Burke (1984) and Hedges (1988). Hedges and Olkin (1985) elaborated a general statistical model for meta-analysis that includes as a special case a variety of procedures similar to those developed by Schmidt and Hunter. Brannick (2001) discussed applications of Bayesian models in meta-analysis (see also Raudenbush & Bryk, 1985). Finally, Thomas (1990) developed a mixture model that attempts to describe systematic differences in validity among specific subgroups of validity studies.
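As an illustration of the first of these approaches, the sketch below implements one well-known combination rule in the family Rosenthal discussed, the Stouffer Z method; the p values are hypothetical, and the use of scipy is simply an implementation choice.

```python
# Minimal sketch of Stouffer's Z method for combining p values from
# k independent studies (one of the combination rules in the family
# Rosenthal, 1984, discusses). The p values below are hypothetical.
from math import sqrt
from scipy.stats import norm

p_values = [0.04, 0.20, 0.11, 0.03]          # one-tailed p from each study

z_scores = [norm.isf(p) for p in p_values]   # convert each p to a z score
z_combined = sum(z_scores) / sqrt(len(z_scores))
p_combined = norm.sf(z_combined)             # combined one-tailed p

print(f"combined Z = {z_combined:.2f}, combined p = {p_combined:.4f}")
```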
The methods developed by Schmidt and Hunter have been widely applied, particularly within the field of personnel selection. For example, Schmidt (1992) noted that “meta-analysis has been applied to over 500 research literatures in employment selection, each one representing a predictor-job performance pair” (p. 1177). The most frequent application of these methods has been in research on the relationship between scores on cognitive ability tests and measures of overall job performance; representative examples of this type of validity generalization analysis include Pearlman, Schmidt, and Hunter (1980), Schmidt, Gast-Rosenberg, and Hunter (1980) and Schmidt, Hunter, and Caplan (1981). However, applications of meta-analysis and validity generalization analysis have not been restricted to traditional test validity research. Hunter and Hirsh (1987) reviewed meta-analyses spanning a wide range of areas in applied psychology (e.g., absenteeism, job satisfaction). Other recent applications of meta-analytic methods have included assessments of the relationship between personality traits and job performance (Barrick & Mount, 1990), assessments of race effects in performance ratings (Kraiger & Ford, 1985) and assessments of the validity of assessment center ratings (Gaugler, Rosenthal, Thornton, & Bentson, 1987). Finally, Hom, Caranikas-Walker, Prussia, and Griffeth (1992) combined meta-analysis with structural modeling to assess the appropriateness of several competing theories of turnover in organizations.
Validity Generalization: The Basic Rationale
The basic model developed by Schmidt and Hunter (1977) has gone through several developments and elaborations (Burke, 1984; James, Demaree, Mulaik, & Ladd, 1992; Raju & Burke, 1983; Schmidt et al., 1993), and the accuracy and usefulness of the model have been widely debated (e.g., Hartigan & Wigdor, 1989; James, Demaree, & Mulaik, 1986; Kemery, Mossholder, & Roth, 1987; Thomas, 1990). Although there is still considerable discussion and controversy over specific aspects of or conclusions drawn from validity generalization analyses, the core set of ideas in this method is simple and straightforward.
As noted earlier, the problem the validity generalization model was designed to address is that of making sense of a research literature in which many, if not most, of the relevant studies are of dubious quality. For example, many studies of the validity and utility of selection tests feature small sample sizes or unreliable criteria. Because sampling error leads to random variations in study outcomes and measurement error artificially lowers (i.e., attenuates) validities, it is reasonable to expect that validity coefficients from different studies will seem to vary randomly from study to study and will generally seem small. However, the effects of sampling error and unreliability are both relatively easy to estimate, and once the effects of these statistical artifacts are taken into account, you are likely to conclude that the actual validity of the test or assessment procedure studied is probably both larger and more consistent than a simple examination of the observed validity coefficients would suggest.
For example, suppose that there are 100 studies of the validity of structured interviews as predictors of job performance, and in each study the reliability of the performance measure is .70 and N (i.e., the sample size) is 40. If the average of the observed validity coefficients is .45, the formula for the correction for attenuation suggests that the best estimate of the validity of these interviews is probably closer to .54 (i.e., .45 divided by the square root of .70) than .45. Thus, a simple correction for measurement error suggests that the interviews are probably more valid than they seem on the basis of a quick examination of the validity studies themselves.
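A minimal sketch of this correction in Python, using the chapter's numbers; the function name is illustrative, not part of any standard library.

```python
# Minimal sketch of the correction for attenuation due to criterion
# unreliability, using the chapter's numbers (observed r = .45,
# criterion reliability = .70).
from math import sqrt

def correct_for_attenuation(r_xy: float, r_yy: float) -> float:
    """Estimate operational validity given criterion reliability r_yy."""
    return r_xy / sqrt(r_yy)

corrected = correct_for_attenuation(r_xy=0.45, r_yy=0.70)
print(f"corrected validity: {corrected:.2f}")   # ~0.54, as in the text
```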
Because each validity coefficient comes from a fairly small sample, it is natural to expect some variability in study results; this variability can be estimated using a well-known formula for the sampling error of correlation coefficients (Hunter & Schmidt, 1990). For example, suppose that the standard deviation of the validity coefficients coming from these 100 studies was .18. On the basis of sampling error alone, you would expect a standard deviation of .12, given an N of 40 and a mean observed validity of .45 (see Hunter & Schmidt, 1990, for a detailed discussion of the formulas used to make such estimates). One conclusion that is likely to be reached in a validity generalization study is that much of the observed variation in test validities is likely to be due to the effects of sampling error rather than to the effects of real variation in test validity (here, 66% of the observed variability in validities might be due to sampling error).
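The sketch below reproduces this comparison, assuming the standard Hunter-Schmidt large-sample formula for the sampling error variance of a correlation, (1 - mean_r^2)^2 / (N - 1); the exact output differs slightly from the chapter's rounded figures.

```python
# Minimal sketch of the sampling-error comparison in the text, assuming
# the standard Hunter-Schmidt large-sample formula for the sampling
# error variance of a correlation: var_e = (1 - mean_r**2)**2 / (N - 1).
from math import sqrt

mean_r = 0.45          # mean observed validity across the 100 studies
n_per_study = 40       # N in each study
observed_sd = 0.18     # SD of the observed validity coefficients

var_e = (1 - mean_r ** 2) ** 2 / (n_per_study - 1)
expected_sd = sqrt(var_e)            # ~0.128; the text rounds this to .12

print(f"expected SD from sampling error alone: {expected_sd:.3f}")
print(f"share of observed SD: {expected_sd / observed_sd:.0%}")
# Using the rounded value of .12, .12 / .18 gives the ~66% figure
# reported in the text.
```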
Although the results of various meta-analytic approaches do not always agree (Johnson, Mullen, & Salas, 1995), these methods lead to similar general conclusions about the validity of selection tests, interviews, assessment centers, etcetera. In particular, it seems highly likely that test validities are generally both larger and more consistent across situations than the results of many individual validity studies would suggest (Hartigan & Wigdor, 1989; Schmidt, 1992; see, however, Murphy, 1993). Indeed, given the nature of much of the available validation research (i.e., small N, unreliable measures, range restriction), this general finding is virtually a foregone conclusion, although it directly contradicts one of the most widely held sets of assumptions in personnel psychology (i.e., that validities are generally small and inherently unstable). Similarly, applications of the VG model to quantitative reviews of research on the validity of personality inventories as predictors of performance (e.g., Barrick & Mount, 1991; Hough, Eaton, Dunnette, Kamp, & McCloy, 1990; Tett, Jackson, & Rothstein, 1991) have overturned long-held assumptions about the relevance of such tests for personnel selection. Personnel researchers now generally accept the conclusion that scores on personality inventories are related to performance in a wide range of jobs.
The VG model suggests that there are a variety of statistical artifacts that artificially depress the mean and inflate the variability of validity coefficients, and further that the effects of these artifacts can be easily estimated and corrected for. It is useful to discuss two broad classes of corrections separately: corrections to the mean, and corrections to the variability, of the distribution of validity coefficients that would be found in a descriptive meta-analysis.
Corrections to the Mean
There are several reasons why validity coefficients might be small. The most obvious possibility is that validities are small because the test in question is not a good predictor of performance. However, there are several statistical artifacts that would lead you to find relatively small correlations between test scores and measures of job performance, even if the test is in fact a very sensitive indicator of someone’s job-related abilities. Two specific statistical artifacts that are known to artificially depress validities have received extensive attention in the literature dealing with validity generalization: the limited reliability of measures of job performance and the frequent presence of range restriction in test scores, performance measures, or both.
There is a substantial literature dealing with the reliability of performance ratings (Viswesvaran, Ones, & Schmidt, 1996; Schmidt & Hunter, 1996) and other measures of job performance (Murphy & Cleveland, 1995). This literature suggests that these measures are often unreliable, which can seriously attenuate (i.e., depress) validity coefficients. For example, Viswesvaran et al.’s (1996) review showed that the average inter-rater reliability estimate for supervisory ratings of overall job performance was .52. To correct the correlation between a test score (X) and a measure of performance (Y) for the effects of measurement error in Y, you divide the observed correlation by the square root of the reliability of Y.
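In symbols, with a hypothetical observed validity of .25 and the .52 mean inter-rater reliability reported by Viswesvaran et al., the correction is:

```latex
r_c = \frac{r_{xy}}{\sqrt{r_{yy}}} = \frac{.25}{\sqrt{.52}} \approx .35
```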

Table of Contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright
  5. Dedication
  6. Contents
  7. Series Foreword
  8. Preface
  9. 1. The Logic of Validity Generalization
  10. 2. History, Development, Evolution, and Impact of Validity Generalization and Meta-Analysis Methods, 1975–2001
  11. 3. Meta-Analysis and Validity Generalization as Research Tools: Issues of Sample Bias and Degrees of Mis-Specification
  12. 4. The Status of Validity Generalization Research: Key Issues in Drawing Inferences From Cumulative Research Findings
  13. 5. Progress Is Our Most Important Product: Contributions of Validity Generalization and Meta-Analysis to the Development and Communication of Knowledge in I/O Psychology
  14. 6. Validity Generalization: Then and Now
  15. 7. Impact of Meta-Analysis Methods on Understanding Personality-Performance Relations
  16. 8. The Challenge of Aggregating Studies of Personality
  17. 9. Maximum Likelihood Estimation in Validity Generalization
  18. 10. Methodological and Conceptual Challenges in Conducting and Interpreting Meta-Analyses
  19. 11. Meta-Analysis and the Art of the Average
  20. 12. Validity Generalization From a Bayesian Perspective
  21. 13. A Generalizability Theory Perspective on Measurement Error Corrections in Validity Generalization
  22. 14. The Past, Present, and Future of Validity Generalization
  23. Author Index
  24. Subject Index