- 122 pages
- English
About This Book
• Your students will get valuable practice in interpreting actual excerpts from published test manuals.
• Each of the 39 exercises begins with a guideline that helps students review the measurement concepts they will need in order to complete the exercise.
• Background notes on each exercise describe the purpose of the test from which the excerpt was drawn.
• Students answer questions that require them to locate and interpret important points in the excerpt.
• The excerpts are largely unabridged so that students practice interpreting material as it is actually presented by test makers.
• The skills students learn with this book transfer easily to other test manuals they may use in the future.
• Students have an ethical responsibility to be thoroughly familiar with the technical characteristics of the tests they will use. This book prepares them for this responsibility.
• All major topics are covered, including:
· validity
· reliability
· standard error of measurement
· norm group composition
· derived scores
· scales to detect faking
· item analysis
· cultural bias
• The excerpts are drawn from tests such as:
· Wechsler Intelligence Scale for Children
· Peabody Picture Vocabulary Test
· 16PF
· Stanford-Binet Intelligence Scale
· MMPI
· Beck Depression Inventory
· Stanford Achievement Test Series
· KeyMath
· and many others!
Exercise 1
Test-Retest Reliability
Guideline
Background Notes
Excerpt from the Manual
| BRP-2 Scale | r |
| --- | --- |
| Parent Rating Scale | 84 |
| Teacher Rating Scale | 91 |
| Student Rating Scales: Home | 78 |
| Student Rating Scales: School | 83 |
| Student Rating Scales: Peer | 86 |
- Which one of the scales is the most reliable? Explain.
- Which one of the scales is the least reliable? Explain.
- In your opinion, are all the scales adequately reliable? Explain.
- The excerpt presents the results of only one of a number of reliability studies described in the manual for the BRP-2. In your opinion, is this one study sufficient or are others needed? Explain.
- In Table 4.3, decimals have been omitted. If they were not omitted, what would the reliability coefficient be for the Parent Rating Scale?
- The test-retest reliability coefficients are based on a two-week interval. Do you think the coefficients would be higher or lower if a two-month interval had been used? Explain.
- Speculate on why test makers usually allow an interval of a week or two between the two administrations of the test instead of giving the same test twice in a row at one sitting.
- If you were considering using this instrument, what other types of reliability coefficients, if any, would you like to see in the manual? Explain.
- In general, how important is test-retest reliability information for selecting a scale or test? Would you consider it a serious flaw if a manual did not contain information on this topic? Explain.
- If you have a measurement textbook, do the authors suggest a minimum acceptable value for a test-retest reliability coefficient? If so, what is it, and do all of the coefficients in the excerpt exceed it?
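As a rough illustration of the ideas behind these questions, the sketch below restores the decimal points omitted from the table and computes a test-retest coefficient as a Pearson correlation between two administrations. The table values come from the excerpt; the `time_1` and `time_2` scores and the `pearson_r` helper are hypothetical and are not taken from the BRP-2 manual, which may have used a different procedure.

```python
from math import sqrt

# Coefficients as printed in the excerpt, with decimals omitted.
printed = {
    "Parent Rating Scale": 84,
    "Teacher Rating Scale": 91,
    "Student Rating Scales: Home": 78,
    "Student Rating Scales: School": 83,
    "Student Rating Scales: Peer": 86,
}

# Restore the decimal point: a printed 84 stands for r = .84.
coefficients = {scale: value / 100 for scale, value in printed.items()}


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    cross = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    return cross / sqrt(ss_1 * ss_2)


# Hypothetical scores (an assumption for illustration only):
# six examinees rated twice, two weeks apart.
time_1 = [12, 15, 9, 20, 17, 11]
time_2 = [13, 14, 10, 19, 18, 10]

retest_r = pearson_r(time_1, time_2)
print(f"Parent Rating Scale: r = {coefficients['Parent Rating Scale']:.2f}")
print(f"Hypothetical test-retest r = {retest_r:.2f}")
```

A coefficient near 1.00 means examinees kept nearly the same relative standing across the two administrations; lower values indicate less stable scores.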
Exercise 2
Interscorer Reliability
Guideline
Background Notes
Excerpt from the Manual
- Why was scorer agreement examined for only some of the WPPSI-R subtests?
- Cases were selected at random. What is random selection?
- Cases were selected from all cases collected for the standardization. What do you think the "standardization" is?
- Is it important to know that the research scorers were trained and given practice in scoring the subtests? Explain.
- How many scorers scored the cases in each age group? In your opinion, is this an adequate number?
- The responses had been previously scored. Is it important to know that the research scorers were not allowed to see the previous scoring notations? Why or why not?
- Is it important to know that the research scorers did not see each other's scores? Why or why not?
- On which subtest was the interscorer reliability the lowest? Explain.
- Overall, do you think that the interscorer reliability is adequate? Explain.
Table of contents
- Cover
- Title
- Copyright
- Table of Contents
- Introduction
- 1. Test-Retest Reliability Behavior Rating Profile
- 2. Interscorer Reliability Wechsler Preschool and Primary Scale of Intelligence
- 3. Internal Consistency and Test-Retest Reliability Occupational Aptitude Survey and Interest Schedule
- 4. Internal Consistency Reliability (Cronbach's Alpha) The Sixteen Personality Factor Questionnaire
- 5. Concurrent Validity and Test-Retest Reliability Reading and Arithmetic Indexes (12)
- 6. Concurrent Validity Thurstone Test of Mental Alertness
- 7. Predictive Validity Wechsler Intelligence Scale for Children
- 8. Content Validity: I Test of Written Language
- 9. Content Validity: II Boehm Test of Basic Concepts
- 10. Construct Validity: I Comprehensive Receptive and Expressive Vocabulary Test
- 11. Construct Validity: II Gray Oral Reading Tests
- 12. Construct Validity: III Beck Depression Inventory
- 13. Percentile Ranks Test of Pragmatic Language
- 14. Stanines Flanagan Aptitude Classification Test
- 15. IQ Scores Wechsler Intelligence Scale for Children
- 16. Derived Scores and the Normal Curve Peabody Picture Vocabulary Test
- 17. Grade Equivalents KeyMath
- 18. Age Equivalents Vineland Adaptive Behavior Scales
- 19. Norm Group Composition: I The Adaptive Behavior Evaluation Scale
- 20. Norm Group Composition: II The Sixteen Personality Factor Questionnaire
- 21. Standard Error of Measurement: I Peabody Picture Vocabulary Test
- 22. Standard Error of Measurement: II Behavior Dimensions Scale-School Version
- 23. Standard Error of Measurement and Alternate-Forms Reliability Stanford Achievement Test Series
- 24. Significance of Intra-Ability Difference Scores Gray Oral Reading Tests
- 25. Use of a Bias Review Panel Personality Assessment Inventory
- 26. Pretesting Items to Reduce Bias SRA Pictorial Reasoning Test
- 27. Procedures to Eliminate Bias Stanford Achievement Test Series
- 28. Scales for Detecting Faking Tennessee Self-Concept Scale
- 29. Experiment on Faking Survey of Interpersonal Values
- 30. Social Desirability Scale The Sixteen Personality Factor Questionnaire
- 31. Item Omissions and Validity Minnesota Multiphasic Personality Inventory
- 32. Lie Scale Minnesota Multiphasic Personality Inventory
- 33. Item Analysis: I The Attention Deficit Disorders Evaluation Scale-Home Version
- 34. Item Analysis: II Comprehensive Receptive and Expressive Vocabulary Test
- 35. Equivalence of Editions Self-Directed Search
- 36. Presenting Intelligence Test Items Stanford-Binet Intelligence Scale
- 37. Testing Conditions Minnesota Multiphasic Personality Inventory
- 38. Establishing Rapport During Test Administration Woodcock-Johnson Tests of Cognitive Ability
- 39. Responsibility for Test Security Woodcock Language Proficiency Battery
- Appendix A Review of Basic Statistics