Statistical Inference and Probability
eBook - ePub

John MacInnes

  1. 160 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android

About this book

An experienced author in the field of data analytics and statistics, John MacInnes has produced a straightforward text that breaks down the complex topic of inferential statistics with accessible language and detailed examples. It covers a range of topics, including:

· Probability and Sampling distributions

· Inference and regression

· Power, effect size and inverse probability

Part of The SAGE Quantitative Research Kit, this book will give you the know-how and confidence needed to succeed on your quantitative research journey.

Information

Year
2022
ISBN
9781529711028

1 The Challenge and Promise of Inference

Chapter Overview

  • What is inference?
  • Informal inference: the tyranny of causal narratives
  • Cognitive illusions
  • Scientific inference
  • Statistical inference
  • Exploration and inference: detectives and lawyers
  • The NHST wars
  • Inference, reproducibility and replication
  • Inference in action: the Salk Vaccine trial
  • Inference in action: fertility and development
  • The world before statistics
  • What knowledge I assume
  • The structure of this book
  • Further Reading

What is inference?

Inference is the process of drawing reasoned but risky conclusions from empirical evidence.
I enter a room and see a person with a smoking gun in their hand beside a body with gunshot wounds. I could infer that the person holding the gun had shot the other one. Other evidence might be relevant to my conclusion. I might know the two people were mortal enemies. I might just have heard a gunshot. However, my inference and the conclusion I draw from it would be a risky one because there is only some probability that it is correct. Perhaps the body is a suicide, and the person now holding the gun had been desperately trying to wrestle it from the victim. I did not witness the shot directly. Even if I had, I would still need to be sure that it was not some visual trick or illusion and that everything was indeed as it seemed on the surface.
This is the situation we face with most evidence. Many of the processes we try to understand are invisible. We cannot ‘see’ class, ethnic discrimination, economic growth or the rise of populism directly – if we could, there would be little need for social science – rather, we can collect evidence about the results of these processes and build models of what we think may be happening to produce that evidence. That is what scientific inference comprises.
It is helpful to think of three categories of inference: (1) informal inference, which is something intuitive we do all the time; (2) scientific inference, which is a set of rules laid down to minimise the risk of drawing unsound conclusions; and (3) statistical inference, which is the part of scientific inference that deals with generalising evidence taken from samples to the whole populations from which these samples have been drawn. Almost all the evidence we ever work with is a sample of some kind, so that statistical inference is a fundamental part of the scientific method.

Informal inference: the tyranny of causal narratives

We are inference machines who understand the world around us by constantly drawing barely conscious conclusions, and effortlessly constructing plausible causal narratives based upon them, both to justify ourselves to others and to reassure ourselves that we understand the world and our place in it. Much of our experience of the world proceeds by induction, whereby finding repeated examples of the same thing or process leads us to expect that under similar conditions we will nearly always find them again. Without the existential reassurance provided by induction, the world would appear as a rather terrifying and unpredictable chaos. Our everyday behaviour in the world is rooted in continually updating our awareness of what is happening around us, drawing conclusions from it and telling stories to ourselves. I was happy today because it was sunny. I got better because I took that medicine. The wood in the stove burned because I set fire to it. I was late for work because the train was delayed. The object fell because of gravity. My accident happened because the cyclist didn’t see me coming. We usually process these causal stories below the level of conscious awareness or calculation. I recognise that person because I have seen them before. I know that person is angry because of their facial expression and so on. I stepped out onto the road to cross it because I didn’t hear any traffic (and consequently collided with the cyclist I didn’t look out for).
Most of these inferences will be at least accurate enough to ensure our well-being. We learn to judge the speed and likely behaviour of traffic, the amount of effort we need to put into writing an essay that will pass or the kind of clothes that make us warm, comfortable or feel sexy. We are all extremely good at making causal associations: links between one phenomenon or pattern and another. When we look at someone’s face, without even having to think consciously about it, we can infer how they are feeling, whether they are angry, happy, sad or bored. We make an association between their facial expression, the configuration of their eyes and eyebrows, mouth, nose and forehead and how we imagine them to be feeling. We will also, usually without consciously thinking about it, produce an explanation of the feeling we suppose them to be experiencing.
However, many of our informal inferences may be wildly wrong and our causal narratives usually flatter to deceive. I do not know that the medicine cured my ailment: perhaps I would have recovered without it. Perhaps if I had taken the bus I would have arrived on time, or any number of other events might have occurred. I could have left the house earlier and caught a different train. Maybe it was the sunshine that made me happy, but how could I know? Perhaps my mood would have been the same had the day been cloudy and overcast. Until Newton’s work on motion, I might have said that the object fell because it was heavy. In the early 18th century, I might have said the wood burned because it contained phlogiston. For me to have that accident a whole set of circumstances were required – from the invention of the bicycle to my decision to cross the road, the existence of the road, all the reasons for the cyclist to be there too at that precise time, their inability to take evasive action, the weather conditions at the time and so on and on and on.
Thus, while we might imagine that we understand the world around us through causal narratives, our perceptions have at best a tenuous link to empirical reality, and any but the most superficial understanding would in principle require virtual omniscience about all manner of diverse causal chains and their history. It would be impossible. Worse, it would leave no room for any knowledge of the world to be of any use! In his Philosophical Essay on Probabilities, published in 1814, the French mathematician Pierre-Simon Laplace imagined an omniscient ‘intelligence’ that later philosophers came to refer to as his ‘demon’:
We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes. (Laplace, 1951, p. 4)
This vision of a purely causal world without randomness later came to be known as a ‘block universe’: a world in which determinism squeezed out any possibility of change or evolution. In such a world, knowing everything would, paradoxically, be utterly valueless, since the fact of that knowledge could not change anything in the present, past or future. To have room to breathe, and space for change, we need randomness and probability.

Cognitive illusions

Cognitive psychologists have a fairly good picture of the many kinds of systematic biases that drive our everyday inferences. Kahneman (2011) calls them ‘cognitive illusions’, which are similar to the visual illusions or tricks of perspective that you may be familiar with, in which our vision tricks us into seeing things that are not really there. We usually pay too much attention to easily available evidence, regardless of its relevance or quality, a phenomenon he refers to as ‘WYSIATI: What You See Is All There Is’. We may also replace questions we find hard to answer with similar ones that we find easier, but which may have little to do with the issue at hand. ‘Is that person fair and reasonable?’ may morph into ‘do I like them?’ We avoid numerical calculations (they demand effort, are slow and prone to mistakes) and substitute rough estimates which we then treat with too much certainty. We see patterns where there are none. We rarely evaluate or check up on our inferences or predictions and may even persuade ourselves that we never actually held a belief that later turned out to be faulty. Above all, we practise confirmation bias remorselessly. We avoid or disregard evidence that goes against our existing beliefs, while seizing upon information that can be interpreted to support them. We can persist in this even in the face of overwhelming odds, until something truly traumatic or catastrophic forces a change in our views. Kahneman has therefore described our informal inference-making as a ‘machine for jumping to conclusions’. It is plausible (but not more than that) to imagine that such a machine had evolutionary survival value. Imagining a potential threat that turns out to be a false alarm carries little penalty. Ignoring one that turns out to be real might be deadly.
These processes are one reason why the history of civilisation is adorned with all manner of fanciful tales and characters such as werewolves, demons, spirits, miracles and so on. The range of bizarre beliefs that credulous societies have solemnly subscribed to is long. Until a few hundred years ago, inferring from the pattern of sunrise and sunset that the sun circled the earth, which was flat, and that matter was solid, would simply have appeared to be obvious. Do not, for a moment, believe that the society we inhabit today is an exception. No society has ever believed itself to be systematically deluded. Because of our ability to effortlessly and systematically deceive ourselves, we need to turn off our semi-automatic cognitive machinery in order to think scientifically. It is probability that allows us to do this, by imagining the world not in terms of causal chains (although these may well exist) but in terms of collections of events with different probabilities of occurring. We may observe patterns of association within these events that are themselves probabilistic in character, so that we can explore what events may increase or decrease the probability of other events taking place. However, most important of all, we grant this external world of events the authority to test our theories about the world, leaving nothing to the reputation or skill of the scientific investigator, no matter how renowned. Scientific inference puts empirical evidence in charge. As we shall see later, this has the paradoxical effect of changing the nature of discovery, knowledge and insight. Rather than an ever-growing accumulation of proven ‘facts’, discovery mostly comprises the revision or destruction of parts of the provisional knowledge we currently hold, as we become more aware of just how boundless is our ignorance. It becomes, to borrow a phrase, the discovery of hitherto unknown unknowns.

Scientific inference

Scientific inference attempts to circumvent these cognitive biases by insisting that inference is based on evidence marshalled in such a way that the individual scientist has as little control over it as possible, in order to maximise the probability that the conclusions reached are sound by tackling confirmation bias, tunnel vision, wobbly logic or loaded arguments. Paradoxically, taking the scientist out of the science often requires complex research designs that require a good deal of scientific expertise and experience to construct. This does not mean that ‘the facts speak for themselves’. If facts could indeed speak, there would be no need for any science. We could just listen to what the world told us. On the contrary, science requires the patience to establish just what the facts actually are: something that is usually far more difficult than one might expect.
Scientific inference is the process whereby any description of the world and how it functions – usually referred to as a theory – must be tested in some way against empirical evidence. Inference takes place in our heads but uses empirical data from the world outside them. We accept or reject theories not on the basis of the authority, prestige or intelligence of their author, but on their ability to account for empirical evidence that we observe. By definition, we can never ‘prove’ a theory completely because, even if it were totally consistent with all the evidence anyone had ever discovered, we cannot know what evidence might arrive in the future. Because the evidence we have is incomplete, the conclusions we can draw are usually provisional or risky, in the sense that they may be wrong or that new evidence may persuade us to revise or improve them. Indeed, it makes sense to make such falsification one component of what counts as knowledge. Claims or statements that cannot be falsified count only as articles of faith. Scientific inference not only places a premium on the critique of existing knowledge but also requires the assimilation of new findings to some cumulative body of knowledge. What might at first appear to be a recipe for disaster – we know nothing with certainty – is actually a means to ensure that knowledge can accumulate and improve over time: it is open to growth. It does place stringent demands on the way in which evidence is produced and interpreted, but the pay-off from this effort is astounding. The development of knowledge, material progress and standards of living was extremely slow between the Neolithic revolution and the 17th century. The scientific revolution changed all that.

Statistical inference

In the social sciences, we use evidence that is systematically incomplete because it comprises data drawn from samples. Almost all of our knowledge of the world comes from them, because the social and natural worlds are just too vast, too complex and too dynamic to measure directly. Not only is it slow and expensive to collect information on millions of individuals, but such a mammoth study would also waste resources, and almost certainly generate data of poor quality. It would be very difficult to ensure that the measurements obtained were consistent and the data quality was good when collection requires a veritable army of enumerators. There would not only need to be checks, but checks on the checks, and checks on . . . That is why population censuses are usually only carried out once a decade and restricted to a very short list of questions. The last census in the UK cost only 87p per person per census year: a remarkably small amount considering the volume of information obtained, the logistical difficulties of attempting to track down everyone on census night and the marathon effort of checking the data collected and correcting the errors. However, it works out at a total cost of £480,000,000: equal to the capital cost of a large hospital. The estimated cost of the 2020 US Census was $16 billion. In the natural world, measuring populations, whether of animals, rocks, atoms or molecules, would simply be impossible. Some measurements or tests destroy or damage what is measured. Components may be quality tested to destruction, to assess their resilience. I’m happy for doctors to take a sample of my blood. I’d protest if they insisted on examining all of it. Finally, our interest is often in future populations, which cannot be measured for the simple reason that they do not yet exist. We have questions such as ‘What would be the effect of . . .?’ or ‘What would happen if . . .?’ However, we may be able to generalise to such future populations from samples drawn from an existing one.
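The two UK census figures quoted above are linked by simple arithmetic, which the following minimal sketch spells out. It assumes that ‘per census year’ means the cost is spread over the ten years between censuses (the text says they are carried out ‘once a decade’). The implied head count is a derived illustration, not a figure given in the text, and the variable names are hypothetical.

```python
# Figures quoted in the text.
COST_PER_PERSON_PER_YEAR_GBP = 0.87   # "87p per person per census year"
TOTAL_COST_GBP = 480_000_000          # "a total cost of £480,000,000"

# Assumption: one census per decade, so the cost is spread over ten years.
YEARS_BETWEEN_CENSUSES = 10

# Cost per person for one whole census cycle.
cost_per_person_per_census = COST_PER_PERSON_PER_YEAR_GBP * YEARS_BETWEEN_CENSUSES

# Number of people the two quoted figures jointly imply (derived, not sourced).
implied_people_counted = TOTAL_COST_GBP / cost_per_person_per_census

print(f"Cost per person per census: £{cost_per_person_per_census:.2f}")
print(f"Implied number of people counted: {implied_people_counted:,.0f}")
```

Under that assumption, a small cost per head multiplied across a whole population and a whole decade yields the large total, which is the point the paragraph makes about measuring populations directly.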
In all these situations, the sample data is of little interest in itself. Who cares what a small random sample of subjects might do or think? It is the population that is our real interest. But if we can generalise from our sample to the much larger population that it represents, we have a tremendously powerful analytical tool. Perhaps the greatest discovery of 19th-century science was the mathematical logic of how to infer the characteristics of target populations from random samples drawn from them. Because our data is limited, we need a good theory of how much we can generalise from the data we have observed in our research, to the wider world from which that data was taken, or to worlds which do not yet exist but lie either in the future or only ever in our imagination. The logic of how to do this is statistical inference. In its simplest form, it asks one of two questions:
  1. Is what I have observed in my sample also what I would find in the target population if I could measure it?
  2. If I assume the target population has feature X, how likely is it that I would see the data I observe in my sample?
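The second question can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that ‘feature X’ is ‘half of the population are women’ and that a hypothetical random sample of 100 people contains 60 women. It then computes the exact binomial probability of a sample at least that far from the expected 50. The numbers, the function names and the choice of a two-sided binomial calculation are assumptions made here, not taken from the text, and the binomial model treats the draws as independent, which is approximately true for a small sample from a large population.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k 'successes' in n independent draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least_as_extreme(k_observed: int, n: int, p: float) -> float:
    """Probability, if the population proportion really is p, of a sample
    count at least as far from the expected count n*p as the one observed
    (counting departures in both directions)."""
    expected = n * p
    observed_gap = abs(k_observed - expected)
    return sum(
        binomial_pmf(k, n, p)
        for k in range(n + 1)
        if abs(k - expected) >= observed_gap - 1e-9  # tolerance for float rounding
    )


# Hypothetical illustration: assumed population proportion and observed sample.
n, p_assumed, k_observed = 100, 0.5, 60
prob = prob_at_least_as_extreme(k_observed, n, p_assumed)
print(f"P(result at least this extreme | p = {p_assumed}) = {prob:.4f}")
```

A small probability suggests that the observed sample would be surprising if the assumption about the population were true. A large one means the data give little reason to doubt it.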
Statistical inference is used to decide whether a pattern discovered in one batch of data that we have analysed is one that we would also be likely to find in other data that we have not, and often could not have, collected. The batch we examine is usually some kind of sample drawn from a target population. Often, we use these questions to do one of two things. We might want to produce an estimate of the size of something. What proportion of this population are women? What are the average earnings of social science graduates? At what age do women tend to have their first child? Or we might want to test whether or not some proposition about the population is true. We call such a proposition a hypothesis. Do science graduates earn more than social science graduates? Is average age at first birth older for women now or 20 years ago? Are those who identify as white more likely to vote Republican than others?
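Estimation can be sketched in the same spirit. The example below builds a hypothetical population in which 52% are women, draws a simple random sample of 1,000 and estimates the population proportion from the sample alone. The population, the seed and the use of a normal-approximation 95% interval are assumptions chosen for illustration. The interval formula is one common textbook approximation and is not necessarily the method this book goes on to develop.

```python
import random
from math import sqrt


def estimate_proportion(sample: list[int]) -> tuple[float, float, float]:
    """Return the sample proportion and an approximate 95% interval,
    using the normal approximation p_hat +/- 1.96 * standard error."""
    n = len(sample)
    p_hat = sum(sample) / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - 1.96 * standard_error, p_hat + 1.96 * standard_error


# Hypothetical population of 100,000 people: 1 = woman, 0 = not a woman.
population = [1] * 52_000 + [0] * 48_000

# Simple random sample without replacement; the seed only makes the run repeatable.
rng = random.Random(42)
sample = rng.sample(population, 1_000)

p_hat, low, high = estimate_proportion(sample)
print(f"Estimated proportion: {p_hat:.3f} (approx. 95% interval {low:.3f} to {high:.3f})")
```

In real research the population list is not available, which is the whole point. The sketch constructs one only so that the estimate can be compared with a known value.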
Inference can be used to travel in opposite logical directions. We can use the observed characteristics of individuals who comprise a random sample to infer information about the populations from which those individuals were drawn. We cannot observe or measure the whole population, but we can observe and measure the individuals. Conversely, we can use information we have about populations to infer otherwise unobservable or contested characteristics of individuals. For example, if we know the overall rate in a population for which a medical test returns a correct pos...

Table of contents

  1. Cover
  2. Half Title
  3. Series
  4. Title Page
  5. Copyright Page
  6. Acknowledgements
  7. Contents
  8. Illustration List
  9. About the Author
  10. 1 The Challenge and Promise of Inference
  11. 2 Probability, Randomness, Probability Distributions and Sampling Distributions
  12. 3 Bernoulli, Coke and Pepsi
  13. 4 Samples and Populations
  14. 5 Inference and Regression
  15. 6 Power, Effect Size and Inverse Probability
  16. 7 What Does Sound Inference Comprise?
  17. Glossary
  18. References
  19. Index

Citation styles for Statistical Inference and Probability

APA 6 Citation

MacInnes, J. (2022). Statistical Inference and Probability (1st ed.). SAGE Publications. Retrieved from https://www.perlego.com/book/3277489/statistical-inference-and-probability-pdf (Original work published 2022)

Chicago Citation

MacInnes, John. (2022) 2022. Statistical Inference and Probability. 1st ed. SAGE Publications. https://www.perlego.com/book/3277489/statistical-inference-and-probability-pdf.

Harvard Citation

MacInnes, J. (2022) Statistical Inference and Probability. 1st edn. SAGE Publications. Available at: https://www.perlego.com/book/3277489/statistical-inference-and-probability-pdf (Accessed: 15 October 2022).

MLA 7 Citation

MacInnes, John. Statistical Inference and Probability. 1st ed. SAGE Publications, 2022. Web. 15 Oct. 2022.