Chapter 1
Background for Evaluating Research Reports
The vast majority of research reports are initially published in academic journals. In these reports, or empirical journal articles,¹ researchers describe how they have identified a research problem, made relevant observations or measurements to gather data, and analyzed the data they collected. The articles usually conclude with a discussion of the results in view of the study limitations, as well as the implications of these results. This chapter provides an overview of some general characteristics of such research. Subsequent chapters present specific questions that should be applied in the evaluation of empirical research articles.
Guideline 1: Researchers Often Examine Narrowly Defined Problems
Comment: While researchers usually are interested in broad problem areas, they very often examine only narrow aspects of the problems because of limited resources and the desire to keep the research manageable by limiting its focus. Furthermore, they often examine problems in such a way that the results can be easily reduced to statistics, further limiting the breadth of their research.²
Example 1.1.1 briefly describes a study on two correlates of prosocial behavior (i.e., helping behavior). To make the study of this issue manageable, the researchers greatly limited its scope. Specifically, they examined only one very narrow type of prosocial behavior (making donations to homeless men who were begging in public).
Example 1.1.1
A STUDY ON PROSOCIAL BEHAVIOR, NARROWLY DEFINED
In order to study the relationship between prosocial behavior and gender as well as age, researchers located five men who appeared to be homeless and were soliciting money on street corners using cardboard signs. Without approaching the men, the researchers observed them from a short distance for two hours each. For each pedestrian who walked within ten feet of the men, the researchers recorded whether the pedestrian made a donation. The researchers also recorded the gender and approximate age of each pedestrian.
Because researchers often conduct their research on narrowly defined problems, an important task in the evaluation of research is to judge whether a researcher has defined the problem so narrowly that it fails to make an important contribution to the advancement of knowledge.
Guideline 2: Researchers Often Conduct Studies in Artificial Settings
Comment: Laboratories on university campuses are often the settings for research. To study the effects of alcohol consumption on driving behavior, a group of participants might be asked to drink carefully measured amounts of alcohol in a laboratory and then "drive" using virtual-reality simulators. Example 1.2.1 describes the preparation of the cocktails in a study of this type.
Example 1.2.1³
ALCOHOLIC BEVERAGES PREPARED FOR CONSUMPTION IN A LABORATORY SETTING
The preparation of the cocktail was done in a separate area out of view of the participant. All cocktails were a 16-oz mixture of orange juice, cranberry juice, and grapefruit juice (ratio 4:2:1, respectively). For the cocktails containing alcohol, we added 2 oz of 190-proof grain alcohol mixed thoroughly. For the placebo cocktail, we lightly sprayed the surface of the juice cocktail with alcohol using an atomizer placed slightly above the juice surface to impart an aroma of alcohol to the glass and beverage surface. This placebo cocktail was then immediately given to the participant to consume. This procedure results in the same alcohol aroma being imparted to the placebo cocktail as the alcohol cocktail…
Such a study might have limited generalizability to drinking in out-of-laboratory settings, such as nightclubs, the home, picnics, and other places where those who are consuming alcohol may be drinking different amounts at different rates while consuming (or not consuming) various foods. Nevertheless, conducting such research in a laboratory allows researchers to simplify, isolate, and control variables such as the amount of alcohol consumed, the types of food being consumed, the type of distractions during the "car ride", and so on. In short, researchers very often opt against studying variables in complex, real-life settings for the more interpretable research results typically obtained in a laboratory.
Guideline 3: Researchers Use Less-than-perfect Methods of Measurement
Comment: In research, measurement can take many forms, from online multiple-choice achievement tests to essay examinations, and from administering a paper-and-pencil attitude scale with choices from "strongly agree" to "strongly disagree" to conducting unstructured interviews to identify interviewees' attitudes.⁴ Observation is a type of measurement that includes direct observation of individuals interacting in either their natural environments or laboratory settings.
It is safe to assume that all methods of observation or measurement are flawed to some extent. To see why this is so, consider a professor/researcher who is interested in studying racial relations in society in general. Because of limited resources, the researcher decides to make direct observations of White and African American students interacting (and/or not interacting) in the college cafeteria. The observations will necessarily be limited to the types of behaviors typically exhibited in cafeteria settings, which is a weakness in the researcher's method of observation. In addition, observations will be limited to certain overt behaviors because, for instance, it will be difficult for the researcher to hear most of what is being said without intruding on the privacy of the students.
On the other hand, suppose that another researcher decides to measure racial attitudes by having students respond anonymously to racial statements by circling "agree" or "disagree" for each one. This researcher has an entirely different set of weaknesses in the method of measurement. First is the matter of whether students will reveal their real attitudes on such a scale, even if the response is anonymous, because most college students are aware that negative racial attitudes are severely frowned on in academic communities. Thus, some students might indicate what they believe to be socially desirable (i.e., socially or politically "correct") rather than reveal their true attitudes. Moreover, people may often be unaware of their own implicit racial biases.⁵
In short, there is no perfect way to measure complex variables. Instead of expecting perfection, a consumer of research should consider this question: Is the method sufficiently valid and reliable to provide potentially useful information?
Examples 1.3.1 and 1.3.2 show statements from research articles in which the researchers acknowledge limitations in their methods of measurement.
Example 1.3.1⁶
RESEARCHERS' ACKNOWLEDGMENT OF A LIMITATION OF THEIR MEASURES
In addition, the assessment of marital religious discord was limited to one item. Future research should include a multiple-items scale of marital religious discord and additional types of measures, such as interviews or observational coding, as well as multiple informants.
Example 1.3.2⁷
RESEARCHERS' ACKNOWLEDGMENT OF LIMITATIONS OF SELF-REPORTS
Despite these strengths, this study is not without limitations. First, the small sample size decreases the likelihood of finding statistically significant interaction effects. […] Fourth, neighborhood danger was measured from mothers' self-reports of the events which had occurred in the neighborhood during the past year. Adding other family member reports of the dangerous events and official police reports would clearly strengthen our measure of neighborhood danger.
Chapter 8 provides more information on evaluating observational methods and measures typically used in empirical studies. Generally, it is important to check whether the researchers themselves properly acknowledge in the article some key limitations of their measurement strategies.
Guideline 4: Researchers Use Less-than-perfect Samples
Comment: Arguably, the most common sampling flaw in research reported in academic journals is the use of convenience samples (i.e., samples that are readily accessible to the researchers). Most researchers are professors, and professors often use samples of college students, obviously as a matter of convenience. Another common flaw is relying on voluntary responses to mailed surveys, for which response rates are often quite low, with some researchers arguing that a response rate of about 40–60% or more is acceptable. For online surveys, it may be even more difficult to evaluate the response rate unless we know how many people saw the survey solicitation. (Problems related to the use of online versus mailed surveys are discussed in Chapter 6.)
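To make the arithmetic behind a response rate concrete, the short Python sketch below computes it as completed responses divided by the number of people solicited. The function name and all of the figures are hypothetical and chosen only for illustration; they do not come from any study discussed in this chapter. The sketch shows why the rate for an online survey is hard to evaluate: the result changes substantially depending on whether the denominator is everyone who was sent the solicitation or only those assumed to have seen it.

```python
def response_rate(completed: int, solicited: int) -> float:
    """Return the proportion of solicited people who completed the survey."""
    if solicited <= 0:
        raise ValueError("solicited must be a positive number")
    if completed < 0 or completed > solicited:
        raise ValueError("completed must be between 0 and solicited")
    return completed / solicited


# Hypothetical mailed survey: the denominator (surveys mailed) is known.
mailed_rate = response_rate(completed=150, solicited=500)

# Hypothetical online survey: 2,000 solicitations were sent, but suppose
# (as an assumption) that only 800 people actually saw the solicitation.
online_rate_sent = response_rate(completed=150, solicited=2000)
online_rate_seen = response_rate(completed=150, solicited=800)

print(f"Mailed survey:             {mailed_rate:.1%}")
print(f"Online, all sent:          {online_rate_sent:.1%}")
print(f"Online, assumed seen only: {online_rate_seen:.1%}")
```

With the same 150 completed surveys, the online rate is either 7.5% or 18.75% depending on the denominator, which is why a reported rate is difficult to interpret when the number of people who saw the solicitation is unknown.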
Other samples are flawed because researchers cannot identify and locate all members of a population (e.g., injection drug users). Without being able to do this, it is impossible to draw a sample that a researcher can reasonably defend as being representative of the population.⁸ In addition, researchers often have limited resources, which forces them to use small samples and which in turn might produce unreliable results.
Researchers sometimes explicitly acknowledge the limitations of their samples. Examples 1.4.1 through 1.4.3 show portions of such statements from research articles.
Example 1.4.1⁹
RESEARCHERS' ACKNOWLEDGMENT OF LIMITATION OF SAMPLING (CONVENIENCE SAMPLE)
The present study suffered from several limitations. First of all, the samples were confined to university undergraduate students and only Chinese and American students. For broader generalizations, further studies could recruit people of various ages and educational and occupational characteristics.
Example 1.4.2¹⁰
RESEARCHERS' ACKNOWLEDGMENT OF LIMITATION OF SAMPLING (LOW RATE OF PARTICIPATION)
Data were collected using a random sample of e-mail addresses obtained from the university's registrar's office. The response rate (23%) was lower than desired; however, it is unknown what percentage of the e-mail addresses were valid or were being monitored by the targeted student.
Example 1.4.3¹¹
RESEARCHER'S ACKNOWLEDGMENT OF LIMITATION OF SAMPLING (LIMITED DIVERSITY)
There are a number of limitations to this study. The most significant of them relates to the fact that the study was located within one school and the children studied were primarily from a White, working-class community. There is a need to identify how socially and ethnically diverse groups of children use online virtual worlds.
In Chapters 6 and 7, specific criteria for evaluating samples are explored in detail. Again, it is important to look for statements in which researchers honestly acknowledge limitations of sampling in their study. Such an acknowledgment does not mitigate the resulting problems but can help researchers properly recognize some likely biases and problems with the generalizability of their results.
Guideline 5: Even a Straightforward Analysis of Data Can Produce Misleading Results
Comment: Obviously, data-input errors and computational errors are possible sources of errors in results. Some commercial research firms have the data they collect entered independently by two or more data-entry clerks. A computer program checks to see whether the two sets of entries match perfectly; if they do not, the errors must be identified before the analysis can proceed. Unfortunately, taking such care in checking for mechanical errors in entering data is hardly ever mentioned in research reports published in academic journals.
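The double-entry check described above can be illustrated with a minimal Python sketch. It assumes that each clerk's work is available as a list of records in the same order, with each record stored as a dictionary of field names and keyed-in values. The function name, the field names, and the sample records are all hypothetical; actual research firms may use quite different software and procedures.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
Mismatch = Tuple[int, str, str, str]  # (record index, field, first value, second value)


def find_mismatches(entry_a: List[Record], entry_b: List[Record]) -> List[Mismatch]:
    """Compare two independent data-entry passes and list every disagreement."""
    if len(entry_a) != len(entry_b):
        raise ValueError("The two entry passes contain different numbers of records.")
    mismatches: List[Mismatch] = []
    for index, (rec_a, rec_b) in enumerate(zip(entry_a, entry_b)):
        # Check every field that appears in either clerk's version of the record.
        for field in sorted(set(rec_a) | set(rec_b)):
            value_a = rec_a.get(field, "")
            value_b = rec_b.get(field, "")
            if value_a != value_b:
                mismatches.append((index, field, value_a, value_b))
    return mismatches


# Hypothetical example: two clerks key in the same three questionnaires.
clerk_1 = [
    {"id": "001", "age": "34", "donated": "yes"},
    {"id": "002", "age": "27", "donated": "no"},
    {"id": "003", "age": "51", "donated": "no"},
]
clerk_2 = [
    {"id": "001", "age": "34", "donated": "yes"},
    {"id": "002", "age": "72", "donated": "no"},   # digits transposed
    {"id": "003", "age": "51", "donated": "yes"},  # wrong option keyed
]

mismatches = find_mismatches(clerk_1, clerk_2)
for index, field, value_a, value_b in mismatches:
    print(f"Record {index}, field '{field}': '{value_a}' vs. '{value_b}'")
```

A check of this kind only flags where the two passes disagree. Someone must still consult the original questionnaire to decide which value is correct, and an error made identically by both clerks would go undetected.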
In addition, there are alternative statistical methods for most problems, and different methods can yield different r...