Basic Statistics and Epidemiology
eBook - ePub

Basic Statistics and Epidemiology

A Practical Guide

Antony Stewart

  1. 204 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

This straightforward primer in basic statistics and epidemiology emphasises their practical use in healthcare and public health, providing understanding of essential topics such as study design, data analysis and statistical methods used in the execution of medical research. Assuming no prior knowledge, the clarity of the text and care of presentation ensure those new to, or challenged by, these topics are given a thorough introduction without being overwhelmed by unnecessary detail.

Key features:

  • Provides an excellent grounding in the basics of both statistics and epidemiology
  • Full step-by-step guidance on performing statistical calculations
  • Numerous examples and exercises with detailed answers to help readers navigate these complex subjects with ease and confidence
  • Enables students and practitioners to make sense of the many research studies that underpin evidence-based practice
  • Fully revised and updated for this fifth edition, now with additional exercises and questions and answers online for self-testing

An understanding and appreciation of statistics is central to ensuring that professional practice is based on the best available evidence, in order to best treat and help the wider community. Reading this book will help students, researchers, doctors, nurses, and health managers to understand and apply the tools of statistics and epidemiology to their own practice.

Information

Publisher: CRC Press
Year: 2022
ISBN: 9781000506372

1 What are statistics?

DOI: 10.1201/9781003148111-1
We use statistics every day, often without realising it. Statistics as an academic study has been defined as follows:
“The science of assembling and interpreting numerical data” (Bland, 2000).
“The discipline concerned with the treatment of numerical data derived from groups of individuals” (Armitage et al., 2002).
A statistic can be defined as a summary value calculated from data, for example an average or proportion. The term data refers to ‘items of information’ and is plural.
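As a small illustration of a statistic as a summary value, the sketch below computes an average and a proportion. The ages and smoking statuses are invented for this example and do not come from any study.

```python
from statistics import mean

# Hypothetical data (invented for illustration): ages of eight patients
# and whether each one currently smokes.
ages = [34, 51, 47, 29, 62, 45, 38, 54]
smokes = [True, False, False, True, False, True, False, False]

# An average (mean) summarises all the ages in a single value.
mean_age = mean(ages)

# A proportion summarises what share of the group are smokers.
# True counts as 1 and False as 0 when summed.
proportion_smokers = sum(smokes) / len(smokes)

print(f"Mean age: {mean_age}")
print(f"Proportion of smokers: {proportion_smokers}")
```

Each printed value is a statistic: one number that summarises the eight underlying data items.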
Let’s have a look at some real examples of healthcare statistics:
  1. Recorded deaths in the UK from COVID-19 (confirmed with a positive test) rose by 351 to a total of 36,393 on 22nd May 2020 (GOV.UK, 2020).
  2. Antibiotics shorten the duration of sore throat pain symptoms by an average of about one day (Spinks et al., 2013).
  3. Smokers lose at least 10 years of life expectancy, compared with those who have never smoked (Jha et al., 2013).
(after Rowntree, 1981)
When we use statistics to describe data, they are called descriptive statistics. All of the above three statements are descriptive.
However, as well as just describing data, statistics can be used to draw conclusions or to make predictions about what may happen in other subjects. This can apply to small groups of people or objects, or to whole populations. A population is a complete set of people or other subjects which can be studied. A sample is a smaller part of that population.
For example, “all the smokers in the US” (or any other specific country) can be regarded as a population. In a study on smoking in this population, it would be impossible to study every single smoker. We might therefore choose to study a smaller group of, say, 1,000 smokers. These 1,000 smokers would be our sample. (Note: of course, it would be important to agree on a definition of “smoker”. For example, a “smoker” could be someone who currently smokes, or perhaps someone who has smoked at some point in their life, or who only smokes occasionally. We may also want to restrict our sample to smokers who smoke only cigarettes or a particular tobacco product.)
Using statistics to draw conclusions about a whole population from the results of our samples, or to make predictions of what will happen, is called statistical inference. It is important to recognise that when we use statistics in this way, we never know with absolute certainty what the true results in the population will be.
Of course, it is important that data are sampled correctly (so they are representative of the relevant population), recorded accurately, analysed properly using appropriate techniques, interpreted correctly, and then reported honestly.
The true quantities of the population (which are rarely known for certain) are called parameters.
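The relationship between a population parameter and a sample statistic can be sketched with a simulation. Everything here is an assumption made for illustration: the population of 100,000 smokers, the characteristic being measured ("heavy smoker") and its 30% frequency are all invented. In a real study the parameter would not be known, which is exactly why a sample is used to estimate it.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100,000 smokers. True marks a "heavy smoker",
# an invented characteristic given to roughly 30% of the population.
population = [rng.random() < 0.30 for _ in range(100_000)]

# The parameter: the true proportion in the whole population.
# (Known here only because the population is simulated.)
parameter = sum(population) / len(population)

# The sample: 1,000 smokers drawn at random, without replacement.
sample = rng.sample(population, 1_000)

# The statistic: the proportion observed in the sample, used as an
# estimate of the parameter.
statistic = sum(sample) / len(sample)

print(f"Population parameter: {parameter:.3f}")
print(f"Sample statistic:     {statistic:.3f}")
```

The two printed values will usually be close but not identical, which reflects the uncertainty involved in statistical inference.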
Different types of data and information call for different types of statistics. Some of the commonest situations are described on the following pages.
Before we go any further, a word about the use of computers and formulae in statistics. There are several excellent computer software packages and online resources that can perform statistical calculations more or less automatically. Some of these packages are available free of charge, while some cost well over £1,000. Each package has its own merits, and careful consideration is required before deciding which one to use. These packages can avoid the need to work laboriously through formulae and are especially useful when one is dealing with large samples. However, care must be taken when interpreting computer outputs, as will be demonstrated later by the example in Chapter 6. Also, computers can sometimes allow one to perform statistical tests that are inappropriate. For this reason, it is vital to understand factors such as the following:
  • which statistical technique should be performed
  • why it is being performed
  • what data are appropriate
  • how to interpret the results.
Several formulae appear on the following pages, some of which look fairly horrendous. Don’t worry too much about these: you may never actually need to work them out by hand. However, you may wish to work through a few examples in order to get a ‘feel’ for how they work in practice. Working through the exercises in Appendix 2 and on the website will also help you. Remember, though, that the application of statistics and the interpretation of the results obtained are what really matter.

2 Populations and samples

DOI: 10.1201/9781003148111-2
It is important to understand the difference between populations and samples. You will remember from the previous chapter that a population can be defined as every subject in a country, a town, a district or other group being studied. Imagine that you are conducting a study of post-operative infection rates in a hospital during 2019. The population for your study (called the target population) is everyone in that hospital who underwent surgery during 2019. Using this population, a sampling frame can be constructed. This is a list of every person in the population from whom your sample will be taken. Each individual in the sampling frame is usually assigned a number, which can be used in the actual sampling process.
If thousands of operations were performed during 2019, there may not be time or resources to look at every case history. It may therefore only be possible to look at a smaller group (e.g., 100) of these patients. This smaller group is a sample.
Remember that a statistic is a value calculated from a sample, which describes a particular feature. This means it is always an estimate of the true value.
If we take a sample of 100 patients who underwent surgery during 2019, we might find that seven of them developed a post-operative infection. However, a different sample of 100 patients might have identified five post-operative infections, and yet another might find eight. We shall almost always find such differences between samples, and these are called sampling variations.
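Sampling variation can be seen in a short simulation. The figures are assumptions for illustration only: a hypothetical hospital with 5,000 operations in the year, 350 of which (7%) were followed by an infection.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population (invented figures): 5,000 operations,
# 350 of which were followed by a post-operative infection.
population = [True] * 350 + [False] * 4650

# Draw five separate random samples of 100 patients each and count
# the infections found in each sample.
counts = []
for _ in range(5):
    sample = rng.sample(population, 100)
    counts.append(sum(sample))

print("Infections found in each sample of 100:", counts)
```

Although every sample comes from the same population, the counts will generally differ from one sample to the next.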
When undertaking a scientific study, the aim is usually to be able to generalise the results to the population as a whole. Therefore, we need a sample that is representative of the population. Going back to our example of post-operative infections, it is rarely possible to collect data on everyone in a population. Methods have therefore been developed for collecting sufficient data to be reasonably certain that the results will be accurate and applicable to the whole population. The random sampling methods that are described in the next chapter are among those used to achieve this.
Thus, we usually have to rely on a sample for a study, because it may not be practicable to collect data from everyone in the population. A sample can be used to estimate quantities in the population as a whole, and to calculate the likely accuracy of the estimate.
Many sampling techniques exist, and these can be divided into non-random and random techniques. In random sampling (also called probability sampling), everyone in the sampling frame has an equal probability of being chosen (unless stratified sampling is being used, as described in Chapter 3). Random sampling aims to make the sample more representative of the population from which it is drawn. It also helps to avoid bias and to ensure that statistical methods of inference or estimation will be valid. There are several methods of random sampling, some of which are discussed in the next chapter. Non-random sampling (also called non-probability sampling) does not have these aims but is usually easier and more convenient to perform, though conclusions will always be less reliable.
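A minimal sketch of drawing a random sample from a numbered sampling frame is shown below. It assumes a hypothetical frame of 4,000 surgical patients and uses simple random sampling without replacement, which is only one of several random methods and not necessarily the one a given study would choose.

```python
import random

rng = random.Random(2019)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: every patient who underwent surgery,
# each assigned a number from 1 to 4,000 (an invented frame size).
sampling_frame = list(range(1, 4001))

# Simple random sample of 100 patient numbers, drawn without replacement,
# so each patient in the frame has the same chance of being chosen.
sample_numbers = sorted(rng.sample(sampling_frame, 100))

print("First ten selected patient numbers:", sample_numbers[:10])
```

The selected numbers would then be matched back to the corresponding patients in the frame.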
Convenience or opportunistic sampling is the crudest type of non-random sampling. This involves selecting the most convenient group available (e.g., using the first 20 colleagues you see at work). It is simple to perform but is unlikely to result in a sample that is either representative of the population or replicable.
A commonly used non-random method of sampling is quota sampling, in which a predefined number (or quota) of people who meet certain criteria are surveyed. For example, an interviewer may be given the task of interviewing 25 women with toddlers in a town centre on a weekday morning, and the instructions may specify that 7 of these women should be aged under 30 years, 10 should be aged between 30 and 45 years, and 8 should be aged over 45 years. While this is a convenient sampling method, it may not produce results that are representative of all women with children of toddler age. For instance, the described example will systematically exclude women who are in full-time employment.
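The quota example above can be sketched as follows. The stream of passers-by is invented for illustration, and the function and variable names are hypothetical. Only the quotas (7, 10 and 8) come from the text. The age bands are read as under 30, 30 to 45 inclusive, and over 45.

```python
def age_group(age):
    """Assign an age in years to one of the three quota bands."""
    if age < 30:
        return "under 30"
    if age <= 45:
        return "30 to 45"
    return "over 45"


def quota_sample(passers_by, quotas):
    """Interview people in the order they arrive until every quota is full."""
    remaining = dict(quotas)  # copy, so the caller's quotas are not modified
    selected = []
    for person in passers_by:
        group = age_group(person["age"])
        if remaining[group] > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return selected


# Quotas from the example: 25 women in total.
quotas = {"under 30": 7, "30 to 45": 10, "over 45": 8}

# Hypothetical stream of passers-by with invented ages between 20 and 59.
passers_by = [{"id": i, "age": 20 + (i * 7) % 40} for i in range(200)]

selected = quota_sample(passers_by, quotas)
print(f"Interviewed {len(selected)} women")
```

Selection depends entirely on who happens to pass by first, which is why quota sampling is non-random.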
As well as using the correct method of sampling, there are also ways of calculating a sample size that is appropriate. This is important, since increasing the sample size will tend to increase the accuracy of your estimate, while a smaller sample size will usually decrease the accuracy. Furthermore, the right sample size is essential to enable you to detect a real effect, if one exists. The appropriate sample size can be calculated using one of several formulae, according to the type of study and the type of data being collected. The basic elements of sample size calculation are discussed in Chapter 21. Sample size calculation should generally be left to a statistician or someone with a good knowledge of the requirements and procedures involved. If statistical significance is not essential, a sample size of between 50 and 100 may suffice for many purposes.
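One way to see why a larger sample tends to give a more accurate estimate is the standard error of a sample proportion under simple random sampling, sqrt(p(1 - p) / n). The sketch below uses an assumed proportion of 0.07 purely for illustration. It is not a sample size calculation of the kind discussed in Chapter 21.

```python
import math


def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion p from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)


# Assumed proportion (for illustration only): 7% of patients infected.
p = 0.07

for n in (50, 100, 400, 1600):
    se = standard_error_of_proportion(p, n)
    print(f"n = {n:>4}: standard error = {se:.4f}")
```

The standard error shrinks as n grows, but only in proportion to the square root of n, so halving it requires a sample four times as large.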

3 Random sampling

DOI: 10.1201/9781003148111-3
Random selection of samples is another important issue. For a sample to be truly representative of...

Table of contents

  1. Cover
  2. Half Title
  3. Title
  4. Copyright
  5. Dedication
  6. Contents
  7. Preface
  8. Acknowledgements
  9. 1 What are statistics?
  10. 2 Populations and samples
  11. 3 Random sampling
  12. 4 Presenting data
  13. 5 Frequencies, percentages, proportions and rates
  14. 6 Types of data
  15. 7 Mean, median and mode
  16. 8 Centiles
  17. 9 Standard deviation
  18. 10 Standard error
  19. 11 Normal distribution
  20. 12 Confidence intervals
  21. 13 Probability
  22. 14 Hypothesis tests and P-values
  23. 15 The t-tests
  24. 16 Data checking
  25. 17 Parametric and non-parametric tests
  26. 18 Correlation and linear regression
  27. 19 Analysis of variance and some other types of regression
  28. 20 Chi-squared test
  29. 21 Statistical power and sample size
  30. 22 Effect size
  31. 23 What is epidemiology?
  32. 24 Bias and confounding
  33. 25 Measuring disease frequency
  34. 26 Measuring association in epidemiology
  35. 27 Cross-sectional studies
  36. 28 Questionnaires
  37. 29 Cohort studies
  38. 30 Case-control studies
  39. 31 Randomised controlled trials
  40. 32 Screening
  41. 33 Evidence-based healthcare
  42. Glossary of terms
  43. Appendix 1 Statistical tables
  44. Appendix 2 Exercises
  45. Appendix 3 Answers to exercises
  46. References
  47. Further reading: a selection
  48. Index