1 Longitudinal Data and Longitudinal Designs
This chapter deals with some of the issues and complexities involved in the collection of longitudinal data. It aims to provide guidance, ideas, and perhaps some sense of confidence to investigators who expect a longitudinal design to help them obtain valid answers to their research questions, but who are as yet uncertain about the best design for such a study. In this chapter I first distinguish between longitudinal research designs and longitudinal data, showing that the latter do not necessarily imply the former, and vice versa. After discussing some of the advantages of longitudinal data, I address seven basic designs for collecting such data. Finally, I provide a short checklist of the issues to be considered before undertaking a longitudinal study.
Longitudinal data versus longitudinal designs
Basically, longitudinal data present information about what happened to a set of research units (such as people, business firms, nations, or cars) during a series of time points (for simplicity, I will refer to human subjects throughout the remainder of this text). In contrast, cross-sectional data refer to the situation at one particular point in time. Longitudinal data are usually (but not exclusively) collected using a longitudinal research design. The participants in a typical longitudinal study are asked to provide information about their behavior and attitudes regarding the issues of interest on a number of separate occasions (also called the "phases" or "waves" of the study). The number of occasions is often quite small: longitudinal studies in the behavioral and social sciences usually involve just two or three waves. The amount of time between the waves can be anything from several weeks (or even days, minutes, or seconds, depending on the aim of a study) to several decades or more. Finally, the number of participants in the study is usually fairly large (say, 200 participants or over; sometimes even tens of thousands).
Although longitudinal research designs can take on very different shapes, they share the feature that the data describe what happened to the research units during a series of time points. That is, data are collected for the same set of research units for (but not necessarily at) two or more occasions, in principle allowing for intra-individual comparison across time. Note that the research units may or may not correspond with the sampling units. For example, in a two-wave longitudinal study on the quality of the care provided by a children's day care center (the research unit), a different sample of parents (the sampling units) may be interviewed at each occasion. The aggregate of the parents' judgements at each time point will allow for conclusions about changes in the quality of the care provided by the center, even if no single parent has been interviewed twice.
As another example, take the consumer panel that is frequently used in marketing research. The participants in such panels regularly provide the researchers with information about their level of consumption of particular brands or products. These levels are then monitored over time. However, the consecutive measurements are usually not matched at the micro-level of households (Van de Pol, 1989). Although this example presents a longitudinal study at the level of the research units (the brands under examination, whose levels of consumption are followed across time), a series of cross-sectional studies would have given us the same information.
Thus, there is not necessarily a one-to-one correspondence between the design of a study and the type of data collected. The data obtained using a longitudinal research design (involving multiple interviews with the same participants) may be analyzed in such a way that no intra-individual comparisons are made; it may even be pointless to attempt to do so (as in the consumer panel). Conversely, longitudinal data may be collected in a single-wave study, by asking questions about what happened in the past (so-called retrospective questions; see below for a discussion). Although such data are collected at the same occasion, they may cover an extended period of time. As Campbell (1988: 47) argued, "To define 'longitudinal' and 'repeated measures' synonymously is to confuse the design of a particular study with the form of the data one wishes to obtain".
Covariation and causation
A distinction can be made between studies that are mainly of a descriptive nature, and studies that more or less explicitly aim to explain the occurrence of a particular phenomenon (Baltes and Nesselroade, 1979). In descriptive studies, the association (or covariation) between particular characteristics of the persons under study is described. Thus, researchers are satisfied with describing how the values of one variable are associated with the values of other variables. Conclusions in this type of research typically take the form of "if X is the case, Y is usually the case as well", and "members of group A have on average more of property X than members of group B". Such statements simply describe what is the case; in a longitudinal context they would tell you what has happened to whom. The strength of the association between variables X and Y can be expressed through association measures such as the correlation coefficient (if both variables are measured at least at the ordinal level) or the chi-square value (if both variables are measured qualitatively).
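The two association measures just mentioned can be illustrated with a minimal sketch. The data below are invented for illustration only, and the function names `pearson_r` and `chi_square` are my own. The Pearson coefficient is just one possible correlation measure and assumes interval-level scores; for strictly ordinal data a rank-based coefficient would be the more usual choice.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical scores on two numeric variables X and Y
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]
r = pearson_r(x_scores, y_scores)

# Hypothetical 2 x 2 cross-tabulation of two qualitative variables
# (rows: group A / group B; columns: property present / absent)
crosstab = [[30, 10], [10, 30]]
chi2 = chi_square(crosstab)

print(f"r = {r:.3f}, chi-square = {chi2:.1f}")
```

Both numbers only express how strongly the variables go together; neither says anything about why they do.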
It is often unsatisfactory to observe a particular association without being able to say why this particular association exists. Further, from a practical point of view it is much more helpful to know that phenomenon Y is affected by X, rather than to know that X and Y tend to coincide. Therefore, it is not surprising that much research aims to explain the occurrence of events, to understand why particular events happen, and to make predictions when the situation changes (Marini and Singer, 1988). Stated differently, much research describes the association between pairs of variables in causal terms. It is generally accepted that at least the following three criteria must have been satisfied before a particular association between two variables can be interpreted in causal terms (Blalock, 1964; Menard, 1991).
- Covariation. There must be a statistically significant association between the two variables of interest. It makes little sense to speak of a "causal" relationship if there is no relationship at all.
- Non-spuriousness. The association between the two variables must not be due to the effects of other variables. In experimental contexts this is ascertained by random allocation of participants to conditions. If successful, this results in a situation in which there are no pre-treatment differences between the experimental group and the control group, thus ruling out alternative explanations for a post-treatment difference. In non-experimental contexts, the association between two phenomena must hold up, even when other (sets of) variables are controlled. For example, a statistically significant relationship between the number of rooms in one's house and the price of the car that one drives will probably be fully accounted for by one's income. A statistical association between two variables that disappears after controlling for a third variable is called "spurious".
- Temporal order of events. The "causal" variable must precede the "effect" variable in time. That is, a change in the causal variable must not occur after a corresponding change in the effect variable (but see below).
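The non-spuriousness criterion can be made concrete with a small simulation of the rooms, car price, and income example. This is a sketch under stated assumptions rather than an analysis of real data: the variables are simulated standard scores, income is assumed to be the only common cause, and the first-order partial correlation is used as one simple way of "controlling" a third variable. The names `pearson_r` and `partial_r` are hypothetical helpers.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 5000

# Assumption: income drives both outcomes; rooms and car price
# have no direct effect on each other.
income = [rng.gauss(0, 1) for _ in range(n)]
rooms = [inc + rng.gauss(0, 1) for inc in income]
car_price = [inc + rng.gauss(0, 1) for inc in income]

raw = pearson_r(rooms, car_price)
controlled = partial_r(rooms, car_price, income)
print(f"raw r = {raw:.2f}; partial r given income = {controlled:.2f}")
```

Under these assumptions the raw correlation is clearly positive, whereas the partial correlation is close to zero, which is what "spurious" means in the sense used above.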
A fourth criterion is not usually mentioned, perhaps because it is so obvious. Causal inferences cannot directly be made from empirical data, irrespective of the research design that has been used to collect the data or the statistical techniques used to analyze the data. In non-experimental research, causal statements are based primarily on substantive hypotheses which the researcher develops about the world. Causal inference is theoretically driven; causal statements need a theoretical argument specifying how the variables affect each other in a particular setting across time (Blossfeld and Röhwer, 1997; Freeman, 1991). Thus, causal processes cannot be demonstrated directly from the data; the data can only present relevant empirical evidence serving as a link in a chain of reasoning about causal mechanisms.
The first two criteria (there is a statistically significant association between two variables that is not accounted for by other variables) can in principle be tested using data from cross-sectional studies. Evidence relevant to the third criterion (cause precedes effect) can usually only be obtained using longitudinal data. Thus, one great advantage of longitudinal data over cross-sectional data would seem to be that the former provide information relevant to the temporal order of the designated "causal" and "effect" variables. Indeed, some authors (e.g., Baumrind, 1983) maintain that causal sequences cannot usually be established unambiguously without incorporating across-time measurement. However, there has been some debate about whether the causal order of events is accurately reflected in their temporal order (Griffin, 1992): Is it really informative to know the order in which events occurred?
According to Marini and Singer (1988), causal priority may be established in the mind in a way that is not reflected in the temporal sequence of behavior. Willekens (1991) argued that present behavior may be determined by future events (or the anticipation of such events), rather than by these events themselves. For example, one common finding is that women tend to quit their job after the birth of their first child. These two events (leaving the labor market and having a baby) tend to coincide, with empirically occurring patterns in which childbirth both precedes and follows leaving the job. The first sequence would suggest that having a baby "produces" a change of labor market status, whereas the second would imply that leaving the labor market leads to childbirth. However, it would seem that both events are the result of anticipations and decisions taken long before the occurrence of either. If this is correct, the temporal order of these events may not say much about their causal relation (Campbell, 1988).
The take-home message is that, although longitudinal data do provide information on the temporal order of events, it still may or may not be the case that there is a causal connection between these events. We still need to develop a more or less explicit theory that spells out the causal processes that produce empirically occurring patterns of events. A cautious investigator will consider these processes before the study is actually carried out, that is, in the design phase. A priori consideration of the possible relations among the study variables may lead to the conclusion that other variables must be measured as well.
Designs for collecting longitudinal data
Any study can only be as good as its design. This obvious (albeit often neglected) point applies strongly to longitudinal research, as the design of a longitudinal study must usually be fixed long before the last wave of this study has been conducted. Errors in the design phase may be costly and difficult (if not impossible) to correct: it is awkward to find out afterwards that it would have been very convenient had variable X been measured at the first wave of the study, rather than only at its final wave.
At a more basic level, investigators must decide in advance about the number of waves of their study; whether it is really necessary to measure the variables of interest at different times for the same set of sampling units; and about the number of sampling units for which data should be collected (taking into account that sampling units have the sad tendency to drop out of the study; see Chapter 2). Below I describe seven basic design strategies, all of which are frequently employed in practice (Kessler and Greenberg, 1981; Menard, 1991). Some of these are truly longitudinal, in that they involve multiple measurements from the same set of sampling units; others are not usually thought of as "longitudinal" designs.
The simultaneous cross-sectional study
In this type of research, a cross-sectional study involving several distinct age groups is conducted. Each age sample is observed regarding the variables of interest. Although this design does not result in data describing change across time (it is therefore not a truly longitudinal design), it does yield data relevant to describing change across age groups. As such, it may be used to obtain understanding of development or growth across time. Any cross-sectional study in which participant age is measured might be considered an example of this design. However, in a simultaneous cross-sectional study, respondent age is the key variable, whereas in most "standard" cross-sectional designs age is just another variable to be controlled.
There are many threats to the validity of inferences based on this type of study. For example, different age groups have usually experienced different historical circumstances, and these may also result in differences among the age groups (this point is elaborated below, in the discussion of the cohort study). Further, in this design, age effects are confounded with developmental effects, because the two concepts are measured with the same variable.
The trend study
In a trend study (which is sometimes also referred to as a "repeated cross-sectional study"), two or more cross-sectional studies are conducted on two or more occasions. The participants in the cross-sectional studies are comparable in terms of their age. Usually a different sample is drawn from the population of interest for each cross-sectional study. In order to ensure the comparability of the measurements of the concept of interest across time, the same questionnaires must be used in all cross-sectional surveys (see also Chapter 3). This type of design is suitable to provide answers to questions like "are adolescents becoming more sexually permissive?", or "how does voters' support for right-wing parties vary across time?".
In a typical trend study, researchers are not interested in examining change at the individual level (it is impossible to know what happened to whom, assuming that the study did not include retrospective questions). The trend study design is therefore not suited to resolve issues of causal order or to study developmental patterns. Its principal advantage over a single cross-sectional design is that it allows for the detection of change at the aggregate level. Thus, the trend study is a typical instance of a design that is cross-sectional at the level of the sampling units, but longitudinal at the level of the research units.
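The difference between change at the aggregate level and change at the individual level can be sketched with a simulated population. Everything in this example is hypothetical: the attitude scores, the assumption that about 70 per cent of the population move up one point and the rest move down one point, and the sample sizes. A trend study is mimicked by drawing a fresh sample at each occasion; for contrast, a design that follows the same people is mimicked by drawing one sample and observing it twice.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is reproducible
N = 20000               # hypothetical population size

# Hypothetical attitude scores at two occasions for the whole population.
time1 = [rng.choice([3, 4, 5, 6, 7]) for _ in range(N)]
time2 = [score + (1 if rng.random() < 0.7 else -1) for score in time1]

# Trend study: a different sample is drawn at each occasion.
sample_t1 = rng.sample(range(N), 2000)
sample_t2 = rng.sample(range(N), 2000)
trend_change = mean(time2[i] for i in sample_t2) - mean(time1[i] for i in sample_t1)

# Same participants observed twice: individual change scores are available.
followed = rng.sample(range(N), 2000)
individual_changes = [time2[i] - time1[i] for i in followed]
followed_change = mean(individual_changes)
share_decreased = mean(1 if c < 0 else 0 for c in individual_changes)

print(f"aggregate change, trend study: {trend_change:.2f}")
print(f"aggregate change, same people: {followed_change:.2f}")
print(f"share of individuals who moved down: {share_decreased:.2f}")
```

Both approaches recover roughly the same aggregate shift, but only the second can show that a sizeable minority moved in the opposite direction. The trend study cannot say what happened to whom.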
Time series analysis
In time series analysis, repeated measurements are taken from the same set of participants. The measurements are not necessarily equally spaced in time. In comparison to the two preceding designs, the time series design allows for the assessment of intra-individual change, because the same participants are observed across time. If different age groups are involved in the study, differences between groups with respect to intra-individual development may be examined. The time series design is very general and flexible. The intervention study and the panel study (see below) may be considered as variations on the time series design, involving many participants, many variables, and a limited number of measurements. In contrast, the term "time series analysis" is usually reserved for studies in which a very limited number of subjects is followed through time at a large number of occasions and for a small number of variables.
The intervention study
The classic example of an intervention study is the pretest-posttest control group design (Campbell and Stanley, 1963). In this design, there is an experimental and a control group. The effects of a particular intervention (also termed treatment or manipulation) are studied by comparing the pretest and posttest scores of the experimental and the control group. In experimental (laboratory) studies, random assignment of participants to the control and experimental groups ensures that there are no important differences between the groups as regards possible confounding variables. This means that this design is a powerful means of assessing causal relations; if the experimental and the control group were comparable in terms of their pretest scores and participants were randomly assigned to these groups, a difference between the groups on the posttest measurement must be attributed to the experimental manipulation.
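The logic of the pretest-posttest control group design can be sketched with simulated data. The scores, the sample size, and the treatment effect of five points are assumptions made purely for illustration, and comparing mean gains between the groups is only one of several ways in which such data might be analyzed.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the sketch is reproducible
TRUE_EFFECT = 5.0       # hypothetical effect of the intervention
n = 2000

# Each record: (treated, pretest, posttest)
participants = []
for _ in range(n):
    pretest = rng.gauss(50, 10)
    treated = rng.random() < 0.5  # random assignment to the two groups
    posttest = pretest + rng.gauss(0, 5) + (TRUE_EFFECT if treated else 0.0)
    participants.append((treated, pretest, posttest))

def group_mean(treated_flag, column):
    """Mean of one column (1 = pretest, 2 = posttest) within one group."""
    return mean(p[column] for p in participants if p[0] == treated_flag)

# With random assignment the groups should be comparable at the pretest.
pretest_gap = group_mean(True, 1) - group_mean(False, 1)

# Compare the mean gain of the experimental group with that of the control group.
gain_experimental = group_mean(True, 2) - group_mean(True, 1)
gain_control = group_mean(False, 2) - group_mean(False, 1)
effect_estimate = gain_experimental - gain_control

print(f"pretest difference between groups: {pretest_gap:.2f}")
print(f"estimated effect of the intervention: {effect_estimate:.2f}")
```

Because assignment is random, the pretest difference is small and the difference in gains lies close to the effect that was built into the simulation.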
In survey research, however, random assignment of participants to experimental and control groups is usually unethical, impractical, or impossible, and the occurrence of the manipulation is often beyond the investigator's control (compare Chapter 5). Conscience will not let experimenters randomly assign children to experimental and control groups in order to examine the effects of growing up in a one-parent family on, say, substance abuse. In practice, some of the participants experience a particular event during the observed interval (such as the death of their spouse, the separation of their parents, etc.), whereas others do not. It is likely that the "experimental" group (comprising the participants who experienced the event of interest) differed initially from the "control" group. For example, if the event of interest is the death of a spouse, it would seem likely that the experimental group is on average quite a bit older than the control group. Insofar as such differences are relevant to the research question, they must be statistically controlled in order to ensure valid inferences. This "non-equivalent control group design" (Cook and Campbell, 1979) is currently very popular in quasi-experimentation and survey research.
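What "statistically controlled" can mean in the spouse example is sketched below with simulated data. All numbers are hypothetical: the outcome score, the assumed effect of the event (minus four points), the assumed effect of age, and the way the probability of the event rises with age. Control is implemented here as a linear adjustment for age through residualization, which is only one of several possible techniques and assumes that age is the only relevant initial difference between the groups.

```python
import random
from statistics import mean

def slope(x, y):
    """Ordinary least-squares slope of y regressed on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Part of y that is not linearly predictable from x."""
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]

rng = random.Random(3)  # fixed seed so the sketch is reproducible
TRUE_EFFECT = -4.0      # hypothetical effect of the event on the outcome
n = 4000

age, event, outcome = [], [], []
for _ in range(n):
    a = rng.uniform(30, 80)
    # Assumption: older participants are more likely to experience the event.
    e = 1 if rng.random() < 0.6 * (a - 30) / 50 else 0
    y = 100 - 0.5 * a + TRUE_EFFECT * e + rng.gauss(0, 3)
    age.append(a)
    event.append(e)
    outcome.append(y)

# Naive comparison of the "experimental" and "control" groups.
naive = (mean(y for y, e in zip(outcome, event) if e == 1)
         - mean(y for y, e in zip(outcome, event) if e == 0))

# Adjusted comparison: remove the linear effect of age from both the
# group indicator and the outcome, then relate what is left over.
adjusted = slope(residuals(age, event), residuals(age, outcome))

print(f"naive group difference: {naive:.2f}")
print(f"age-adjusted difference: {adjusted:.2f}")
```

In this simulation the naive difference overstates the effect because the group that experienced the event is older, whereas the adjusted estimate lies close to the value that was built in. With real data the adjustment is only as good as the set of variables that was measured.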
The panel study
In the panel study, a particular set of participants is repeatedly interviewed using the same questionnaires. The term "panel study" was coined by the famous sociologist Paul H. Lazarsfeld when he reflected on the presumed effect of radio advertising on product sales. Traditionally, hearing the radio advertisement was assumed to increase the likelihood that the listeners would buy the corresponding product. Lazarsfeld considered the reverse relationship (people who have purchased the product might notice the advertisement, whereas others would not) plausible as well, casting doubts on the causal direction of this relationship. Lazarsfeld proposed that repeatedly interviewing the same set of people (the "panel") might clarify this issue (Lazarsfeld and Fiske, 1938). However, long before Lazarsfeld, researchers routinely conducted studies involving repeated measurements (for example, in studies on childhood development: Nesselroade and Baltes, 1979; Sontag, 1971). Menard (1991) notes that national censuses have been taken at periodic intervals for more than three hundred ye...