Practical Statistics for Field Biology
eBook - ePub

Jim Fowler, Lou Cohen, Philip Jarvis

About This Book

Provides an excellent introductory text for students on the principles and methods of statistical analysis in the life sciences, helping them choose and analyse statistical tests for their own problems and present their findings. An understanding of statistical principles and methods is essential for any scientist but is particularly important for those in the life sciences. The field biologist faces very particular problems and challenges with statistics as "real-life" situations such as collecting insects with a sweep net or counting seagulls on a cliff face can hardly be expected to be as reliable or controllable as a laboratory-based experiment. Acknowledging the peculiarities of field-based data and its interpretation, this book provides a superb introduction to statistical analysis helping students relate to their particular and often diverse data with confidence and ease. To enhance the usefulness of this book, the new edition incorporates the more advanced method of multivariate analysis, introducing the nature of multivariate problems and describing the techniques of principal components analysis, cluster analysis and discriminant analysis which are all applied to biological examples. An appendix detailing the statistical computing packages available has also been included. It will be extremely useful to undergraduates studying ecology, biology, and earth and environmental sciences and of interest to postgraduates who are not familiar with the application of multivariate techniques and practising field biologists working in these areas.

Information

Publisher
Wiley
Year
2013
ISBN
9781118685648
Edition
2

1

INTRODUCTION

1.1 What do we mean by statistics?

Statistics are a familiar and accepted part of the modern world, and already intrude into the life of every serious biologist. We have statistics in the form of annual reports, various censuses, distribution surveys, museum records – to name just a few. It is impossible to imagine life without some form of statistical information being readily at hand.
The word statistics is used in two senses. It refers to collections of quantitative information, and methods of handling that sort of data. A society’s annual report, listing the number or whereabouts of interesting animal or plant sightings, is an example of the first sense in which the word is used. Statistics also refers to the drawing of inferences about large groups on the basis of observations made on smaller ones. Estimating the size of a population from a capture–recapture experiment illustrates the second sense in which the word is used.
Statistics, then, is to do with ways of organizing, summarizing and describing quantifiable data, and methods of drawing inferences and generalizing upon them.

1.2 Why is statistics necessary?

There are two reasons why some knowledge of statistics is an important part of the competence of every biologist. First, statistical literacy is necessary if biologists are to read and evaluate their journals critically and intelligently. Statements like, ‘the probability that a first-year bird will be found in the North Sea is significantly greater than for an older one, χ² = 4.2, df = 1, P < 0.05’, enable the reader to decide the justification of the claims made by the particular author.
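To make the quoted statement concrete, the following minimal sketch checks that a chi-square value of 4.2 with one degree of freedom does correspond to a probability below 0.05. It relies on the standard identity that, for one degree of freedom, the upper-tail probability equals erfc(√(χ²/2)). The function name `chi2_p_value_df1` is an illustrative assumption, not something taken from the text.

```python
import math


def chi2_p_value_df1(chi2_stat):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    With df = 1 the statistic is the square of a standard normal deviate,
    so P(X >= x) = erfc(sqrt(x / 2)). This shortcut is valid for df = 1 only.
    """
    return math.erfc(math.sqrt(chi2_stat / 2.0))


# Values quoted in the example statement: chi-square = 4.2, df = 1.
p = chi2_p_value_df1(4.2)
significant = p < 0.05  # the conventional 5% threshold used in the quotation

print(f"P = {p:.3f}, significant at 0.05: {significant}")
```

For other degrees of freedom a statistical table or package would be needed; the shortcut above does not generalize.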
A second reason why statistical literacy is important to biologists is that if they are going to undertake an investigation on their own account and present their results in a form that will be authoritative, then a grasp of statistical principles and methods is essential. Indeed, a programme of work should be planned anticipating the statistical methods that are appropriate to the eventual analysis of the data. Attaching some statistical treatment as an afterthought to make the study seem more ‘respectable’ is unlikely to be convincing.

1.3 Statistics in field biology

‘Laboratory’ biologists may have high levels of confidence in the precision and accuracy of the measurements they make. To them, collecting meadow dwelling insects with a sweep net might appear a hilarious exercise with a ludicrously low level of reliability. Field biologists therefore require special sampling procedures and analytical methods if their assertions are to be regarded with credibility. Often data accumulated do not conform to the sort of symmetrical patterns taken for granted in the common statistical techniques; data may be ‘messy’, irregular or asymmetrical. Special treatments may be necessary before they can be properly evaluated.

1.4 The limitations of statistics

Statistics can help an investigator describe data, design experiments, and test hunches about relationships among things or events of personal interest. Statistics is a tool which helps in accepting or rejecting those hunches within recognized degrees of confidence. It helps to answer questions like, ‘If my assertion is challenged, can I offer a reasonable defence?’, ‘Am I justified in spending more time or resources in pursuing my hunch?’, or ‘Can my observations be attributed to chance variation?’.
It should be noted that statistics never prove anything. Rather they will indicate the likelihood of the results of an investigation being the product of chance.

1.5 The purpose of this text

The objectives of this text stem from the points made in Sections 1.2 and 1.3 above. First, the text aims to provide field biologists with sufficient grounding in statistical principles and methods to enable them to read and understand research reports in the journals they read. Second, the text aims to present biologists with a variety of the most appropriate statistical tests for their problems. Third, guidance is offered on ways of presenting the statistical analyses, once completed.

2

MEASUREMENT AND SAMPLING CONCEPTS

2.1 Populations, samples and observations

Biologists are familiar with the term population as meaning all the individuals of a species that interact with one another to maintain a homogeneous gene pool. In statistics, the term population is extended to mean any collection of individual items or units which are the subject of investigation. Characteristics of a population which differ from individual to individual are called variables. Length, mass, age, temperature, proximity to a neighbour, number of parasites, number of petals, to name but a few, are examples of biological variables to which numbers or values can be assigned. Once numbers or values have been assigned to the variables they can be measured.
Because it is rarely practicable to obtain measures of a particular variable from all the units in a population, the investigator has to collect information from a smaller group or sub-set which represents the group as a whole. This sub-set is called a sample. Each unit in the sample provides a record, such as a measurement, which is called an observation. The relationship between the terms we have introduced in this section is summarized below:
Observation: 132 mm
Variable: wing length
Sample unit: a starling from a communal roost
Sample: those starlings which are captured in the roost and are measured
Statistical population: all starlings in the roost which are available for capture and measurement
Biological population: the biological population may well include birds that are not available for capture (e.g. mates that are roosting elsewhere) and are therefore not part of the statistical population. Alternatively, if the roost comprises a mixture of resident birds and winter immigrants, the statistical population might include components of more than one biological population.

2.2 Counting things – the sampling unit

Field biologists often count the number of objects in a group or collection. If the number is to be meaningful, the dimensions of the collection have to be specified. A collection with specified dimensions is called a sampling unit; a set of sampling units comprises a sample. An observation is, of course, the number of objects in a particular sampling unit. Examples of sampling units are:
Observation | Sampling unit
Number of orchids | A quadrat of stated area
Number of crickets in a sweep net | Volume of vegetation swept (diameter of net × distance moved)
Number of nematodes in a soil core | Soil core of stated dimensions
Number of visits by bees to a flower | A specified interval of time
Number of wading birds on a shore | A specified length of coastline
Number of ectoparasites | A single host
Number of beetles in a pitfall trap | A trap of stated size
When observations are counts, the statistical population has nothing to do with the objects we are counting, even when they are organisms. The following example illustrates the point.
Observation: 23
Variable: number of cockles
Sampling unit: a square quadrat of stated area from which sand is sieved and cockles counted
Sample: the number of quadrats (sampling units) examined
Statistical population: the total number of quadrats it is possible to mark out in the whole of the study area. The potential number of units in the population depends on the chosen dimensions of the sampling unit.
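As a rough illustration of how the size of the statistical population depends on the chosen sampling unit, the sketch below counts the non-overlapping square quadrats that fit into a rectangular study area. The 100 m × 50 m area, the quadrat sizes and the function name `number_of_quadrats` are all hypothetical values chosen for illustration.

```python
def number_of_quadrats(area_length_m, area_width_m, quadrat_side_m):
    """Count the non-overlapping square quadrats that fit in a rectangular area.

    Partial quadrats at the edges are ignored (floor division).
    """
    per_row = int(area_length_m // quadrat_side_m)
    per_column = int(area_width_m // quadrat_side_m)
    return per_row * per_column


# Hypothetical study area of 100 m x 50 m; quadrat sides are assumptions too.
for side in (0.5, 1.0, 2.0):
    units = number_of_quadrats(100, 50, side)
    print(f"quadrat side {side} m -> {units} units in the statistical population")
```

Halving the side of the quadrat quadruples the number of units in the population, although the study area and the cockles in it are unchanged.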
The main difference between ‘measuring’ and ‘counting’ is that we have no control over the dimensions of a unit in a sample when we are measuring; when counting, we are able to choose the dimensions of the sampling unit. Remember that the content of a trap, net or quadrat is a sample if we are measuring the objects in it, but only a unit in a sample if we are counting them.
It is always worthwhile to ask the question, ‘from which population are my sampling units drawn?’. The answer may not always be as obvious as in the example of the cockles. The contents of 10 pit-fall traps set into the ground overnight constitute a sample – but from which population are these sampling units drawn? It is regarded as being the total number of traps that could have been set out, covering the whole of the study area. Because it is axiomatic that field biologists try not to destroy the habitat they are studying, a statistical population is sometimes notional, or hypothetical.

2.3 Random sampling

We said in Section 2.1 that a sample represents the population from which it is drawn. If the sample is to be truly representative, the units in the sample must be drawn randomly from the population; that is to say, in a manner that is free from bias. In other words, each unit in a population must have an equal chance of being drawn.
As an example of a possible source of bias, consider a biologist who wishes to measure the average mass of bank voles Clethrionomys glareolus inhabiting a study site. Attempts are made to catch them by setting Longworth mammal traps baited with grain. Before capture, an animal has to overcome trap shyness. It is plausible that the threshold of shyness is lower in hungry animals than in well-fed ones and that the former may have a greater chance of being drawn from the population. If hungry voles are lighter than well-fed ones, our biologist’s sample may not be a fair representation of the whole population.
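The effect described in the vole example can be simulated. The sketch below is illustrative only: the masses, the proportion of hungry animals and the capture probabilities are invented assumptions, not field data. It shows how unequal chances of capture pull the sample mean away from the population mean.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: equal numbers of hungry (lighter) and
# well-fed (heavier) voles. All masses in grams are invented.
population = []
for _ in range(5000):
    population.append(("hungry", rng.gauss(18.0, 1.5)))
    population.append(("fed", rng.gauss(22.0, 1.5)))

# Assumed capture probabilities: hungry animals overcome trap shyness more readily.
capture_prob = {"hungry": 0.30, "fed": 0.10}

# Each animal enters the sample with a probability set by its state,
# so the units do NOT have an equal chance of being drawn.
trapped = [mass for state, mass in population if rng.random() < capture_prob[state]]

true_mean = sum(mass for _, mass in population) / len(population)
sample_mean = sum(trapped) / len(trapped)

print(f"population mean mass: {true_mean:.2f} g")
print(f"trapped sample mean:  {sample_mean:.2f} g ({len(trapped)} animals)")
```

Under these assumptions the trapped sample is large yet still underestimates the population mean, which illustrates that a bigger sample does not remove bias.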
Statistical analysis is frequently conducted on the assumption that samples are random. If, for any reason, that assumption is false and bias is present in the sampling procedure, then the information gained from the sample may not be properly extrapolated to the population. Unfortunately, it is rarely possible to do more than guess how great bias may be. This severely reduces the confidence which can be placed in estimations based on sampling data. Since most sources of bias arise from the methodology adopted, procedures should always be fully described. When a source of bias is suspected, it should be acknowledged and taken into account in the interpretation of results. Obtaining random samples in practice is a large area in itself, partly because the field techniques used by biologists are so diverse. We suggest you consult Southwood (1978) as a standard work on this subject (see Bibliography).

2.4 Random numbers

One way to avoid bias is to assign a unique number to each individual unit in a population and select units to be measured by reference to random numbers. Often this is impossible because we cannot always choose our units – we measure what we can catch, as in the example of the voles. However, it is sometimes possible – indeed essential – to obtain truly random sampling units. In the case of our cockle example in Section 2.2, the quadrats comprising the sample could be located at the intersection of grid coordinates prescribed by pairs of random numbers. Whenever there is opportunity to select ‘which plots?’, ‘which pools?’, or ‘which positions?’, then selection must be based on random numbers.
There are two usual ways of obtaining random numbers. First, many calculators and pocket computers have a facility for generating random numbers. These are often in the form of a fraction, e.g. 0.2771459. You may use this to provide a set of integers, 2, 7, 7, 1,…, or 27, 71, 45,…; or 277, 145,…; or 2.7, 7.1, …; and so on, keying a new number when more digits are required.
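The way a single random fraction is broken into integers can be sketched as follows, using the fraction 0.2771459 quoted above. The helper name `digit_groups` is an illustrative assumption; leftover digits that do not fill a complete group are simply discarded.

```python
def digit_groups(fraction_text, width):
    """Split the digits after the decimal point into integers of a given width.

    The fraction is passed as text so that its digits are used exactly as
    displayed. Trailing digits that do not fill a whole group are dropped.
    """
    digits = fraction_text.split(".")[1]
    usable = len(digits) - len(digits) % width
    return [int(digits[i:i + width]) for i in range(0, usable, width)]


random_fraction = "0.2771459"  # the value quoted in the text

singles = digit_groups(random_fraction, 1)   # 2, 7, 7, 1, ...
pairs = digit_groups(random_fraction, 2)     # 27, 71, 45
triples = digit_groups(random_fraction, 3)   # 277, 145
tenths = [value / 10 for value in pairs]     # 2.7, 7.1, ...

print(singles, pairs, triples, tenths)
```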
Second, use may be made of random number tables. Appendix 1 is such a table. The numbers are arranged in groups of five in rows and columns, but this arrangement is arbitrary. Starting in the top left corner you may read, 2, 3, 1, 5, 7, 5, 4,…; or 23, 15, 75, 48,…; or 231, 575, 485,…; or 23.1, 57.5, 48.5, 90.1,…; and so on, according to your needs. When you have obtained the numbers you need for the investigation in hand, mark the place with a pencil. Next time, carry on where you left off.
It is possible, by chance, that a random number will prescribe a unit that has already been drawn. In this event, ignore the number and take the next random number. The purpose is to eliminate your prejudice as to which units should be picked for measurement or counting. Unfortunately, observer bias, conscious or subconscious, is notoriously difficult to avoid when gathering data in support of a particular hunch!
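The procedure of locating quadrats at grid intersections given by pairs of random numbers, and ignoring any pair that has already been drawn, might be sketched as below. The grid size, the number of quadrats and the function name `random_quadrat_positions` are hypothetical, and Python's pseudo-random generator stands in for a calculator or a random number table.

```python
import random


def random_quadrat_positions(n_units, grid_columns, grid_rows, seed=None):
    """Pick distinct grid intersections using pairs of random numbers."""
    if n_units > grid_columns * grid_rows:
        raise ValueError("more units requested than grid intersections exist")

    rng = random.Random(seed)
    chosen = []
    seen = set()
    while len(chosen) < n_units:
        # One pair of random numbers prescribes one grid intersection.
        position = (rng.randrange(grid_columns), rng.randrange(grid_rows))
        if position in seen:
            continue  # already drawn: ignore it and take the next random pair
        seen.add(position)
        chosen.append(position)
    return chosen


# Hypothetical study area marked out as a 20 x 10 grid; 10 quadrats wanted.
positions = random_quadrat_positions(10, grid_columns=20, grid_rows=10, seed=42)
print(positions)
```

The seed is fixed here only so that the sketch gives the same positions each time it is run; in practice a fresh sequence would be used for each investigation, just as one carries on from the pencil mark in a table.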

2.5 Independence

Many statistical methods assume that observations in a sample are independent. That is to say, the value of any one observation in a sample is not inherently linked to that of another. An example should make this clear. A biologist wishes to compare the average spikelet length of rough meadow grass growing in one field with that growing in another. One hundred flowering heads are obtained randomly from the first field, a spikelet is removed from each and measured. In the second field, the plant is harder to find and only 80 flower heads are collected, a spikelet being removed from each and measured. If the biologist now tries to ‘make up the number’ by removing a further 20 spikelets from one plant, these observations are not independent of each other even if the plant itself is randomly selected. A genetic peculiarity in the plant that affects the size of one spikelet is likely to affect them all. This may distort the sample (see also Section 13.4).

2.6 Statistics and parameters

The measures which describe a variable of a sample are called statistics. It is from the sample statistics that the parameters of a population are estimated. Thus, the average mass of a random sample of voles is the statistic which is used to estimate the average mass parameter of the population. The average number of cockles in a random sample of quadrats estimates the average number of cockles per quadrat in the whole population of quadrats.
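The distinction between a statistic and a parameter can be shown with a small simulation. The sketch below assumes an entirely invented population of 2000 quadrats with made-up cockle counts. In real fieldwork the parameter is unknown, and it is computed here only because the population is simulated.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical statistical population: cockle counts for every quadrat
# that could be marked out in the study area (values are invented).
population = [rng.randint(0, 40) for _ in range(2000)]

# The parameter: mean number of cockles per quadrat over the whole population.
parameter = statistics.mean(population)

# The statistic: the mean of a random sample of 25 quadrats, drawn without
# replacement so that no quadrat is examined twice.
sample = rng.sample(population, 25)
statistic = statistics.mean(sample)

print(f"population parameter: {parameter:.2f} cockles per quadrat")
print(f"sample statistic:     {statistic:.2f} cockles per quadrat")
```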
Hypothetical populations have hypothetical parameters. The average number of beetles in 10 randomly placed pit-fall traps estimates the average number of beetles per trap if the whole habitat had been covered by traps, in which case there are no beetles left to count! Samples from hypothetical populations are generally used for comparative purposes, for example to compare one woodland type with another.
In estimating a population parameter from a sample statistic, the number of units in a sample can be critical. Some statistical methods depend on a minimum number of sampling units and, where this is the case, it should be borne in mind before commencing fieldwork. Whilst it is true that larger samples will invariably result in greater statistical confidence, there is nevertheless a ‘diminishing returns’ effect. In many cases the time, effort and expense involved in collecting very large samples might be better spent in extending the study in other directions. We offer guidance as to what constitutes a suitable sample size for each statistical test as it is described.
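One way to see the ‘diminishing returns’ effect is through the standard error of a sample mean, which falls with the square root of the number of units. This section does not itself name the standard error, so its use here is an assumption made for illustration, as is the standard deviation of 4.0.

```python
import math


def standard_error(sample_sd, n):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return sample_sd / math.sqrt(n)


sample_sd = 4.0  # hypothetical standard deviation, held constant for comparison

for n in (10, 40, 160, 640):
    print(f"n = {n:3d}  standard error = {standard_error(sample_sd, n):.3f}")
```

Each fourfold increase in sample size only halves the standard error, so the gain in precision per extra unit collected shrinks steadily.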

2.7 Descriptive and inferential statistics

Descriptive statistics are used to organize, summarize and describe measures of a sample. No predictions or inferences are made regarding population parameters. Inferential (or deductive) statistics, on the other hand, are used to infer or predict population parameters from sample measures. This is done by a process of inductive reasoning based on the mathematical theory of probability. Fortunately, only a very minimal knowledge of mathematical theory of probability is needed in order to apply the rules of the statistical methods, and the little that is needed will be explained. However, no one can predict exactly a population parameter from a sample statistic, but only indicate with a stated degree of confidence within what range it lies. The degree of confidence depends on the sample selection procedures and the statistical techniques used.

2.8 Parametric and non-parametric statistics

Statistical methods commonly used by biologists fall into one of two classes – parametric and non-parametric. Parametric methods are the oldest, and although m...
