1
Introduction
Marcel Das
CentERdata and Tilburg School of Economics and Management
Tilburg University
Tilburg, the Netherlands
Peter Ester
Rotterdam University
Rotterdam, the Netherlands
Lars Kaczmirek
GESIS–Leibniz Institute for the Social Sciences
Mannheim, Germany
Technological developments are opening up unique new possibilities for empirical research in the social sciences. Among these developments is the fast diffusion of Internet research. The Internet is rapidly developing into a major instrument of data collection for the social sciences.
The fast development of Internet research can be explained by three factors: its increasing market share compared to other survey modes, the public's increasing access to the Internet, and the costs associated with this method of data collection. The market share has steadily increased over the years. For example, the share of online interviewing in the turnover generated by member institutes of the German market research association grew from 5% in 2002 to 31% in 2008 (ADM, 2008). With respect to Internet use, about two thirds of surveyed European Union citizens use the Internet for personal purposes. "The proportion of citizens who had used the Internet for personal purposes in the past three months ranged from 41% in Romania to 91% in Denmark. … Other countries at the higher end of the ranking were Sweden, the Netherlands, Luxembourg, Finland, and the UK" (European Commission, 2008, p. 15). And finally, in terms of costs, according to the Global Prices Study by ESOMAR (2007), online research is the least expensive mode of data collection in most of Western Europe, the United States, Japan, and Australia. The results of a comparison of 592 participating research agencies show that in most cases online research costs about three quarters of the price of telephone interviewing, which in turn is often about three quarters of the cost of face-to-face interviewing.
Internet interviewing bears several similarities to other kinds of interviewing such as paper-and-pencil interviewing (PAPI), computer-assisted personal interviewing (CAPI), and computer-assisted telephone interviewing (CATI). In many ways it can be seen as a combination (or an extension) of these conventional interview modes. However, data collection via the Internet offers a number of advantages over traditional methods. Besides being less expensive, the Internet offers the possibility of graphical or animated presentation, such as a display of probabilities through pie charts or exploding scales. Furthermore, for some subpopulations, the response rate to an Internet survey is higher than to a traditional survey: Very busy or active people are more willing to do an interview at the time and place of their choosing. Moreover, the possibility of dividing interviews into short sections and spreading these over longer periods reduces respondent burden and actually allows for the collection of much more information from a respondent than would otherwise be possible. In addition, due to the absence of an interviewer, Internet surveys are less subject to social desirability bias. And finally, data collected via the Internet can be made available very quickly, even in real time.
This book brings together leading social scientists from both Europe and the United States to share experiences on Internet survey research and to report and discuss data collection through the Internet. The objectives of this book are to exchange views on the representativeness of Internet-based methods, to discuss how to reach difficult target groups, and to explore innovative applications of and promising experiments with Internet surveys (e.g., visual displays, interactive features, biomarkers, eye tracking, and noninterview data). Attention is also paid to mixed-mode designs, context effects, usability, the setup of Internet panels, and ethical considerations in Internet surveys.
The book project was facilitated through GESIS–Leibniz Institute for the Social Sciences (Mannheim, Germany), a leading European research and service center on innovative survey methods, together with CentERdata (Tilburg, the Netherlands), one of the main academic Internet survey institutes in Europe. GESIS facilitated a number of expert meetings in Mannheim, inviting scholars from both sides of the Atlantic. During two workshops participating researchers presented papers on their experiences with Internet surveys, shared research results, and exchanged views on the future of the Internet as a prime source for data collection in the social sciences. CentERdata conducted a number of small-scale experiments using its innovative Internet panel. Several chapters in this book are based on these experiments.
The main target groups of the book are academic and professional survey researchers, graduate students, and market researchers who use the Internet for data collection. Moreover, the book can be used in graduate courses on data collection methods and Internet use. What distinguishes this book from other sources on Internet surveys is the collaboration of leading Internet survey experts from around the world. The book addresses the most pressing and current issues in Internet survey methodology. It contains comprehensive reviews and assessments of traditional issues such as the strengths and weaknesses of Internet surveys, mixed-mode approaches, representativeness, and questionnaire design. Besides grounding the topics theoretically, these chapters ensure high applicability by including specific guidance and best-practice examples for conducting Internet surveys. Other chapters focus on current issues (ethics, interactive probing, open-ended versus closed questions, paradata, hard-to-reach groups, context effects, usability) and promising new developments (probability-based Internet panels, eye tracking, biomarkers).
The book consists of three parts. Part I provides an overview of Internet survey research methodology, its strengths and challenges, and best practices. Part II focuses on Internet survey design, describing advanced methods and applications, and Part III discusses problems and solutions with respect to data quality and provides research insights into new promising applications.
Part I starts with a chapter by Jolene D. Smyth and Jennie E. Pearson that reviews the state of the art in Internet surveys and provides a snapshot of the current climate of Internet survey methodology. It begins with a brief history of Internet surveys and an assessment of their current uses, including why the Internet is such an appealing tool for conducting surveys. The strengths and weaknesses of Internet surveys are outlined with respect to the four main sources of survey error: coverage, sampling, measurement, and nonresponse error. The chapter also presents innovative ways researchers have begun to minimize each of these sources of error in Internet surveys, and discusses potential directions for future research and development.
The chapter by Edith D. De Leeuw and Joop J. Hox provides a theoretical background on mixed-mode designs for online research with an empirical knowledge base on the implications of mixed mode for data integrity, questionnaire design, and analysis. Mixed-mode designs promise better coverage and higher response rates but may lead to problems of data integrity, as the goal in mixed-mode studies is to combine data from different sources. The chapter discusses central issues for equivalence of measurement, gives an empirical review of existing studies comparing Internet surveys with traditional data collection methods, and focuses on data integrity issues. Furthermore, survey researchers can benefit from the guidance it provides about incorporating a mixed-mode approach in their own surveys.
Annette C. Scherpenzeel and Marcel Das describe the methodology for setting up a panel that combines the technology of Internet surveys with a "true" longitudinal and probability-based sampling design. Not many long-running scientific panels in Europe and the United States are online panels; most use face-to-face or telephone interviews to collect data. Online interviewing has, however, become a widespread method for access panels and volunteer panels, which often lack what is scientifically required of true longitudinal and probability-based panels. Nevertheless, the authors show that it is possible to combine the scientific standards for a true longitudinal and probability-based panel with the advantages of online interviewing as a method of data collection. The chapter explains how such a panel can be built and maintained, taking the Dutch LISS panel as an example.
Annette C. Scherpenzeel and Jelke G. Bethlehem's chapter shows how the sampling and coverage problems from which many Internet panels suffer can be avoided or corrected. Many online panels rely on self-selection by respondents in constructing and maintaining their sampling frame. Therefore estimates can be seriously biased. Advanced adjustment weighting procedures to improve the quality of survey estimates are discussed. The LISS panel is an illustration of how a probability sample and traditional recruitment procedures can be used to build and maintain an Internet panel. The authors analyze how closely the LISS panel resembles the population using statistical information available at Statistics Netherlands.
In the final chapter of Part I, by Eleanor Singer and Mick P. Couper, the focus is on ethical requirements relating to informed consent in Internet surveys, specifically relating to data security and the collection of paradata. Internet surveys routinely capture user metrics such as browser characteristics, response latencies, changes in answers, and so on. Partial data from break-offs are also often used in analyses. Respondents' right to make informed decisions requires disclosure about the collection of paradata, especially where respondents are identifiable. The authors argue that it is possible to frame this information so as not to adversely affect the research process, and they tested their assumptions in an experiment. Advice is presented together with a review of the existing literature on ethical standards in Internet surveys.
Part II, on advanced methods and applications of Internet surveys, opens with a chapter by Vera Toepoel and Don A. Dillman on the visual design of answer formats and the impact of visual design on the interpretability of survey questions. The chapter begins with a review of the accumulated knowledge of the last decade on how the visual layout of questions in surveys influences respondent answers. The authors then systematically develop the theoretical background of processing visual information before continuing with a literature review covering challenges and solutions in everyday questionnaire design. The chapter also identifies additional issues that need to be addressed and ends with thirteen dos and don'ts with regard to visual design in survey questions.
Lars Kaczmirek's chapter deals with attention and usability in Internet surveys and their connection to how respondents answer survey questions. The author explains how concepts drawn from usability theory can be applied and tested in survey research. More specifically, an experiment presented in this chapter applied the concept of feedback to the question-answering process and tested the impact of different types of feedback on data quality. The results show that tailoring feedback to the answering process in a set of rating scales can improve data quality in Internet surveys.
Marije Oudejans and Leah Melani Christian, in their chapter, show how Internet surveys can utilize various types of design features to increase the interaction with respondents. They examine how the interactive nature of Internet surveys can help improve the quality of responses to open-ended questions. In an experiment the authors show how motivational statements and interactive probing influence whether people provide a response to open-ended questions, and explore how this affects the quality of responses. Overall, survey design can benefit greatly from the interactive design possibilities in Internet surveys.
Peter Ester and Henk Vinken analyze the role of Internet surveys in studying highly sensitive social issues. Here, the absence of an interviewer is one of the greatest advantages of self-administered Internet surveys. The anonymous interview setting diminishes social desirability effects and, compared to traditional survey methods, is better suited to revealing underlying motivations and feelings. Attitudes toward controversial issues are related to question ordering (e.g., consistency and contrast effects) and to open-ended versus closed question measurements. The chapter discusses the main outcomes of an Internet survey study on controversial issues with respect to different orderings of open-ended and closed questions.
Part III, on data quality and new research strategies, focuses on sample selection, measurement and nonresponse error, and new approaches for collecting and understanding online survey data.
Corrie M. Vis and Miquelle A. G. Marchand address the vital issue of underrepresentation of certain groups in Internet panel research. People with a lower income, people living in urbanized areas, and single people are examples of groups who are hard to reach in survey research. Internet panel surveys are no exception and create even higher thresholds for participation. The authors discuss why some groups are underrepresented in panels and which measures may be effective to reduce underrepresentation. The chapter includes various recommendations, based on concrete experience with the LISS panel, on how to (re)balance survey compliance and participation.
The chapter by Arthur van Soest and Arie Kapteyn is on mode and context effects in measuring household assets. Differences in answers between Internet and traditional surveys can be due to selection, mode, or context effects. The chapter exploits unique experimental data from the U.S. Health and Retirement Study (HRS) to analyze mode effects, controlling for arbitrary selection. Moreover, exploiting the panel nature of the data, the quality of core and Internet answers is compared. The chapter focuses on economic variables such as household assets, for which mode effects in Internet surveys have rarely been studied.
Dirk Heerwegh's chapter is on a highly topical issue: paradata. Paradata, also termed "process data," contain information about the primary data collection process (e.g., survey duration, interim status of a case, navigational errors in a survey questionnaire, etc.). Paradata can provide a means of additional control over, or understanding of, the quality of the primary data (the responses to the survey questions). The chapter describes how paradata can be collected in Internet surveys, some of their uses, and data preparation issues that may need to be addressed before paradata can be analyzed. The main focus is on paradata in the process of completing an Internet survey.
Mirta Galesic and Ting Yan discuss the use of eye-tracking techniques in survey research. Eye-tracking data are highly important in studying survey response processes. The chapter outlines a number of research questions that eye tracking can help to answer and presents an overview and history of eye-tracking techniques. The authors examine the comparative methodological, technical, and analytical advantages and disadvantages of eye tracking over more traditional, indirect methods for tracking respondents' behavior (such as measuring response times, recording changes in answers, and tracking mouse clicks and movements). The chapter furthermore provides a summary of a series of recent experiments in which respondents' eye movements were tracked while they were completing a CASI survey.
The chapter by Mauricio Avendano, Annette C. Scherpenzeel, and Johan P. Mackenbach examines the advantages and disadvantages of collecting biomarker data in population surveys, discusses the available methods for doing so, and explores the feasibility of collecting these data in an Internet panel. The authors report the results of a pilot that tested the feasibility of collecting data on blood cholesterol, saliva cortisol, and waist circumference in a subsample of the LISS panel. Collecting biomarker data in an Internet panel is feasible, but specific conditions need to be met for collecting, storing, and analyzing these data.
The final chapter, by Marcel Das, Peter Ester, and Lars Kaczmirek, draws the main conclusions of the book, puts these into perspective, and discusses the consequences for the research agenda on the use of Internet social surveys.
REFERENCES
ADM. (2008). Jahresbericht 2008. Frankfurt: Arbeitskreis Deutscher Markt- und Sozialforschungsinstitute e. V. Retrieved from http://www.adm-ev.de/pdf/Jahresbericht_04.pdf
ESOMAR. (2007). Global Prices Study 2007. Amsterdam: ESOMAR.
European Commission. (2008). Flash Eurobarometer 241: Information Society as Seen by EU Citizens. Retrieved from http://ec.europa.eu/public_opinion/flash/fl_241_en.pdf
Part I
Methodology in Internet Survey Research
2
Internet Survey Methods: A Review of Strengths, Weaknesses, and Innovations
Jolene D. Smyth
Survey Research and Methodology ...