Technology & Engineering

Mean Value and Standard Deviation

The mean value is the average of a set of numbers, calculated by adding all the values together and then dividing by the total count. It represents the central tendency of the data. The standard deviation measures the amount of variation or dispersion in a set of values. It indicates how much the values differ from the mean.
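These two definitions translate directly into code. The following is an illustrative Python sketch (the data values are invented for demonstration): it computes the mean by summing and dividing by the count, and the sample standard deviation from the squared deviations about the mean.

```python
import math

def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)

def sample_std(values):
    """Sample standard deviation: square root of the summed squared
    deviations from the mean, divided by n - 1."""
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(data))        # 5.0
print(sample_std(data))  # ≈ 2.1381
```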

Written by Perlego with AI-assistance

10 Key excerpts on "Mean Value and Standard Deviation"

  • The ASQ Certified Quality Auditor Handbook
    The range is the simplest measure of dispersion. It is the difference between the maximum and minimum values in an observed data set. Since it is based on only two values from a data set, the measurement of range is most useful when the number of observations or values is small (ten or fewer).
    Standard Deviation
    Standard deviation, the most important measure of variation, measures the extent of dispersion around the zone of central tendency. For samples from a normal distribution, it is defined as the square root of the sum of the squared differences between the observed values and the arithmetic mean (numerator), divided by the total number of observations minus one (denominator). The standard deviation of a sample of data is given as:
    s = √[ Σ(X − X̄)² / (n − 1) ]
    where
    s = standard deviation
    n = number of samples (observations or data points)
    X = value measured
    X̄ = average value measured
    Coefficient of Variation
    The final measure of dispersion, the coefficient of variation, is the standard deviation divided by the mean. Variation is the guaranteed existence of a difference between any two items or observations: the concept of variation states that no two observed items will ever be identical.
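The sample standard deviation and coefficient of variation described above can be sketched in Python. This is an illustrative example; the observation values are hypothetical.

```python
import math

def sample_std(xs):
    # s = sqrt( sum of squared deviations from the mean / (n - 1) )
    xbar = sum(xs) / len(xs)
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1))

def coefficient_of_variation(xs):
    # Coefficient of variation: standard deviation divided by the mean.
    return sample_std(xs) / (sum(xs) / len(xs))

readings = [18, 17, 19, 18, 20, 16, 18]   # hypothetical observations
print(round(sample_std(readings), 4))                # 1.291
print(round(coefficient_of_variation(readings), 4))  # 0.0717
```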
    Frequency Distributions
    A frequency distribution is a tool for presenting data in a form that clearly demonstrates the relative frequency of the occurrence of values as well as the central tendency and dispersion of the data. Raw data are divided into classes to determine the number of values in a class or class frequency. The data are arranged by classes, with the corresponding frequencies in a table called a frequency distribution. When organized in this manner, the data are referred to as grouped data, as in Table 20.1 .
    The data in this table appear to be normally distributed. Even without constructing a histogram or calculating the average, the values appear to be centered around the value 18. In fact, the arithmetic average of these values is 18.02.
    The histogram in Figure 20.1
  • Uncertainty Analysis of Experimental Data with R
    For example, how many possible values of temperature are there in the temperature range of 0–100 K? Of course, there are an infinite number of values in this range (or any finite temperature range). Because we cannot measure all of these values, we take a sample composed of a finite number of measurements and use this sample to infer characteristics about the population. There is also the possibility that the population is composed of discrete elements, though this is not common in physical science and engineering, and we will not consider it here. When we are analyzing data (i.e., samples), we are usually interested, at least in the beginning, in the following: calculating the mean, median, standard deviation, and variance of a sample; evaluating covariances and correlations between variables; visualizing the data to get a sense of its distribution and trends; using the sample to estimate statistics of the population; and identifying potential outliers. We consider these topics, and others, in the following.
    3.2 Mean, Median, Standard Deviation, and Variance of a Sample
    The sample mean (arithmetic average) x̄ is defined as
    (3.1) x̄ = (1/N) Σᵢ₌₁ᴺ xᵢ
    where N is the number of data points and xᵢ is data point i. The median xₘ is found by ordering the data from largest to smallest and then defining xₘ to be the middle data point if N is an odd number, or the arithmetic average of the two middle data points if N is even, i.e.,
    (3.2) xₘ = x_((N+1)/2), N odd,
    (3.3) xₘ = (x_(N/2) + x_(N/2+1)) / 2, N even.
    The variance of a sample, sₓ², is defined as
    (3.4) sₓ² = (1/(N − 1)) Σᵢ₌₁ᴺ (xᵢ − x̄)²
    and the sample standard deviation sₓ is simply the square root of the variance:
    (3.5) sₓ = (sₓ²)^(1/2) = [ (1/(N − 1)) Σᵢ₌₁ᴺ (xᵢ − x̄)² ]^(1/2)
    The standard deviation is a measure of the average spread of the data about the mean. It is to be noted that in some texts, the N − 1 terms in these equations are replaced with N.
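The definitions in equations (3.1) through (3.5) can be checked with a short Python sketch (the sample values here are invented for illustration); the final assertion confirms that the n − 1 form agrees with the standard library's sample standard deviation.

```python
import statistics

x = [3.1, 2.9, 3.0, 3.3, 2.8, 3.2]   # invented sample
N = len(x)

xbar = sum(x) / N                                   # eq. (3.1), sample mean
xm = statistics.median(x)                           # eqs. (3.2)/(3.3), median
sx2 = sum((xi - xbar) ** 2 for xi in x) / (N - 1)   # eq. (3.4), sample variance
sx = sx2 ** 0.5                                     # eq. (3.5), standard deviation

# The n - 1 form matches the standard library's stdev.
assert abs(sx - statistics.stdev(x)) < 1e-12
```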
  • Maths in Chemistry

    Numerical Methods for Physical and Analytical Chemistry

    • Prerna Bansal(Author)
    • 2020(Publication Date)
    • De Gruyter
      (Publisher)
    A measure of central tendency in statistics indicates an attempt to find the central position of a data set or distribution. Such measures are sometimes also called measures of central location or summary statistics. They depict where most values of a data set or distribution lie, which is called the central location of the distribution. This central value is representative of the whole distribution. By definition, the mean or average should be the representative measure of central tendency, but there are other representatives as well, namely the median and the mode, since the mean is often not sufficient or is even sometimes misleading. Hence, under different conditions and for different types of data, different measures of central tendency are used. The central value is a single value which gives an idea of the approximate center of the distribution. The dispersion of the data is quantified by the standard deviation and variance, while the shape of the distribution is described using skewness and kurtosis.
    These descriptors are especially useful in research, where they can meaningfully interpret any experiment. They also reduce the bulkiness of the data and organize it better. One usually prefers to analyse the data and then reduce the error, but reducing the data and then analysing the error is the more intelligent approach. One cannot always eliminate error, but one can reduce it by improving the sources of error. Sometimes the error is also reduced by interpreting the data in a different way.
    Instead of reporting all 10 observed values for an experiment, the mean of those values is preferred (carrying more significant figures), thereby reducing the data and hence also minimizing the error. Hence, these statistical parameters are a way of data reduction.
    Some of the frequently used descriptors of statistics are described here.

    4.2  Mean

    The mean or the average is the most common measure of central tendency. It is equal to sum of all the values in the data set divided by the number of observations. So, the mean can be written as
    (4.1) x̄ = (x₁ + x₂ + x₃ + ⋯ + xₙ)/n
    or
    (4.2) x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ
    where x̄ is the mean, x₁, x₂, x₃, …, xₙ are the observations or data points, n is the total number of data points, and Σ (the Greek letter sigma) refers to the summation. The mean can be calculated for both continuous and discrete data. The above mean is called the sample mean (x̄), while the population mean is represented by μ as
    (4.3) μ = (1/N) Σᵢ₌₁ᴺ Xᵢ
    where N is the number of values in the population. In statistics, sample and population hold different meanings, although both means are calculated in the same manner. The mean includes all values in the data set, which may or may not be outliers; hence, there is often the possibility of error.
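The sample mean and population mean are computed in the same way, as a short sketch makes plain. The population and sample values below are invented for illustration.

```python
# Hypothetical population of 10 measured values
population = [4, 8, 6, 5, 3, 7, 9, 2, 6, 10]
sample = population[:4]   # a small sample drawn from it

mu = sum(population) / len(population)   # population mean, eq. (4.3)
xbar = sum(sample) / len(sample)         # sample mean, eq. (4.2)

print(mu)    # 6.0
print(xbar)  # 5.75
```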
    In Table 4.1
  • Straightforward Statistics
    • Patrick White(Author)
    • 2023(Publication Date)
    • Policy Press
      (Publisher)
    Another way of thinking about the mean deviation is a measure of how good the mean would be as a prediction for the value of any of the other cases. The smaller the mean deviation, the more likely it is that the mean would be close to the value of any other case that you might choose. Or, to put it differently, smaller mean deviations suggest that, overall, the mean is a more representative measure of the values in the dataset.
    The standard deviation
    Even if you’ve done some statistics before or have read research reports containing the results of statistical analysis, you probably won’t have heard of the mean deviation. But you might have come across a similar measure, the standard deviation.
    So what is the standard deviation? How is it different from the mean deviation? And why are you more likely to know about the standard deviation than the mean deviation?
    The mean deviation (MD) and the standard deviation (SD) are both measures of spread. And they both tell you how much the values of a variable are spread out in general . In fact, they measure almost the same thing, and if you compare the MD and SD for any particular variable in a dataset, you’ll see that the two measures have very similar values.
    Both the MD and the SD have been around for a long time, but the SD became more popular from the early 20th century, and has since been the measure that is commonly used in mainstream statistical analysis. The transition was quite controversial at the time and, according to Gorard (2005), the reasons for preferring the SD may not be relevant for practical research today.
    The SD is more difficult to calculate than the MD. This isn’t really a problem for doing any actual analysis, as a computer will calculate the SD for you. But it does mean that, unless you’re confident with maths, showing you how it’s worked out is unlikely to help you understand what the SD is and why it can be useful.
    Perhaps the most important difference between the MD and the SD is that the MD has a reasonably straightforward interpretation – the mean distance from the mean – and the SD does not. This is the reason I teach my students about the MD, even if I know lots of them won’t use it in practice: it’s easier to understand how the MD is calculated and what it means. And, as I explain below, it can be interpreted in almost the same way.
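The contrast between the mean deviation and the standard deviation can be made concrete with a small Python sketch (the data values are invented). Note that, for a like-for-like comparison, this sketch uses the divide-by-n form for both measures; the two come out similar, with the SD at least as large as the MD.

```python
def mean_deviation(xs):
    # MD: the mean of the absolute distances from the mean.
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

def standard_deviation(xs):
    # SD: the square root of the mean squared distance from the mean
    # (divide-by-n form, matching the MD's denominator).
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

values = [2, 4, 6, 8, 10]   # invented data
print(mean_deviation(values))      # 2.4
print(standard_deviation(values))  # ≈ 2.828
```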
  • Statistical Process Control

    A Guide for Implementation

    • Roger W. Berger, Thomas H. Hart(Authors)
    • 2020(Publication Date)
    • CRC Press
      (Publisher)
    Any and all computations of these factors (X̄, X̿, R, R̄, median, and mode) always follow these same computational steps. There may be a little more adding and a little more subtracting depending on the size and number of samples, but the computational steps are always the same. Elementary arithmetic.

    VARIANCE AND STANDARD DEVIATION

    At this point in our discussion of statistical concepts there are only three more points to cover: variance, standard deviation and distribution patterns. It is difficult to grasp variance and standard deviation without understanding distribution patterns, and distribution patterns are of limited use without the variance and standard deviation. As a result, you may have to jump back and forth between the two sections for reference.
    Variance is a measure of the difference of individual observations from the average (mean) value. You will seldom see the term used in relation to statistical process control. The computations involved in finding the variance, while not particularly complicated, are tedious and we will leave the subject without investigation. It should be noted, though, that variance measurements are used when making comparisons between the variableness of two similar processing lines making the same or similar parts. If you have an interest in this area of variance comparisons between similar and parallel processes, several of the texts noted in the reference section of the appendix have good presentations on the topic and examples of the equations and computations involved.
    The measure of variability used most often in SPC is the standard deviation, denoted by σ (sigma). The standard deviation is the square root of the variance. It yields a measure of population variability which can then help estimate the percent of the parent population which falls within a given number of standard deviations from the mean (average).
  • Statistical Techniques for Data Analysis
    The symbol V is sometimes used to designate variance.
    Ordinarily one is not dealing with a population, but rather with a sample of n individuals of the population. The individual measured values may be indicated by the symbols X₁, X₂, …, Xₙ.
    The sample mean, X̄ (called X bar), calculated as shown in the figure, is hopefully a good estimate of the population mean – that is why the measurements were made in the first place! One can calculate the sample standard deviation, s, using the formula shown in the figure. Likewise one can calculate the sample variance, s², as shown. Of course, one can use a simple calculator to do this, as indicated by PUSH BUTTON. This convenience is good because arithmetic is hard work and one may make mistakes. However, one should make a few calculations by the formula just to understand and appreciate what the calculator is doing.
    Remember that s is an estimate of the standard deviation of the population and that it is not σ. It is often called “the standard deviation”, maybe because the term ‘estimate of the standard deviation’ is cumbersome. It is, of course, the sample-based standard deviation, but that term is also cumbersome. The standard deviation and its estimates always have the same units as those for X. When considering variability, a dimensionless quantity, the coefficient of variation, cv, is frequently encountered. It is simply cv = s/X̄.
    Figure 4.1. Population values and sample estimates.
    If one knows cv and the level, X, s can be calculated.
    Figure 4.2. Distribution of means.
    Another term frequently used is called the relative standard deviation, RSD, and it is calculated as
    RSD = cv×100
    The relative standard deviation is thus expressed as a percent. There could be room for confusion when results are reported on a percentage basis, as the percentage of sulfur in a coal sample, for example. Here the value for s could be in units of percent. In such cases, one can make the distinction by using the terms % relative and % absolute.
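The cv and RSD relationships above can be illustrated with a short Python sketch; the measurement values (percent sulfur in a coal sample, % absolute) are hypothetical.

```python
sulfur = [4.1, 3.9, 4.0, 4.2, 3.8]   # hypothetical % sulfur values (% absolute)
n = len(sulfur)

xbar = sum(sulfur) / n
s = (sum((x - xbar) ** 2 for x in sulfur) / (n - 1)) ** 0.5

cv = s / xbar     # dimensionless coefficient of variation
rsd = cv * 100    # relative standard deviation, in % relative

print(round(s, 4))    # 0.1581
print(round(rsd, 2))  # 3.95
```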
  • Mastering Corporate Finance Essentials

    The Critical Quantitative Methods and Tools in Finance

    • Stuart A. McCrary(Author)
    • 2010(Publication Date)
    • Wiley
      (Publisher)
    The median is an alternative to the mean for establishing the general magnitude of data. The median is the point in a population or sample for which half of the remaining data is larger than the median and half of the remaining data is smaller than the median. In many cases, the mean and the median are approximately equal.
    Most applications of statistics in finance use the mean rather than the median. The examples in this book use the mean in all cases.

    STANDARD DEVIATION MEASURES THE NOISE

    The mean simplifies a collection of observations, but much of the detail is lost. Without seeing the individual observations, it is impossible to tell whether most observations are approximately equal to the mean or whether they vary tremendously. The most common measures of dispersion are variance and the closely related statistic called standard deviation. The observations in Figure 2.1 spread out over a mean of 9.5 percent. In fact, observations near 9.5 percent (including values slightly above and slightly below 9.5 percent) are the most commonly occurring observations. Observations below 8.5 percent or above 10.5 percent occur less frequently. Observations below 7.5 percent or above 11.5 percent are rare.
    FIGURE 2.1 Yields with 9.50 percent Mean and 1 percent Standard Deviation
    Figure 2.2 shows a similar set of observations. These observations are also centered around a mean of 9.5 percent. However, observations 1 or 2 percent above or below the mean are common.
    Table 2.1 is a histogram of the 1,000 returns used to create Figure 2.1 and Figure 2.2 . The first line contains a return of 2.5 percent followed by a count of zero instances of a return of 2.5 percent or lower in both Figures 2.1 and 2.2 . The following line contains a count of the number of returns between 2.5 percent and 3.5 percent. Figure 2.1 contains no returns between 2.5 percent and 3.5 percent, but Figure 2.2 contains four returns in that range.
    Notice that the number of returns close to 9.5 percent is higher for the returns in Figure 2.1 than for those in Figure 2.2 .
    Figure 2.3 displays the count of returns in Table 2.1 visually. As with Table 2.1 , it is clear that the returns in Figure 2.2 span a wider range. In other words, although the returns average about 9.5 percent in both cases, the chance that a return differs from the mean is much higher for the data in Figure 2.2 than for the data in Figure 2.1
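The situation in the two figures can be imitated with a quick simulation: two sets of 1,000 normally distributed returns sharing a 9.5 percent mean but with standard deviations of 1 and 2 percent. This is an illustrative sketch with synthetic data, not the book's actual figures.

```python
import random

random.seed(42)   # fixed seed so the sketch is reproducible
tight = [random.gauss(9.5, 1.0) for _ in range(1000)]   # like Figure 2.1
wide = [random.gauss(9.5, 2.0) for _ in range(1000)]    # like Figure 2.2

def sample_sd(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

# Both series center near 9.5, but the second spreads about twice as far.
print(round(sum(tight) / 1000, 1), round(sample_sd(tight), 1))
print(round(sum(wide) / 1000, 1), round(sample_sd(wide), 1))
```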
  • The Engineering Design Primer
    • K. L. Richards(Author)
    • 2020(Publication Date)
    • CRC Press
      (Publisher)
    14

    Statistical Methods for Engineers

    14.1 Definitions for Some Terms Used in Statistics

    14.1.1 Population

    This is the term applied to the whole group that is being studied.

    14.1.2 Sample

    If a number of items that are representative of the population are withdrawn for the purpose of obtaining the data of the study, the items as a group are called a Sample. A random sample is one where every item in a population has an equal chance of being selected for the sample.

    14.1.3 Variate (xᵣ)

    This is the value of a characteristic of the population. For example, height, length, weight, number of rejects, etc., are some of the many variates used. Variates may be continuous or discrete.

    14.1.3.1 Continuous Variates

    A continuous variable is a variable that has an infinite number of possible values. In other words, any value is possible for the variable. A continuous variate may be continuously subdivided such as length and time.

    14.1.3.2 Discrete Variates

    A discrete variable is a variable that can only take on a certain number of values. In other words, they don’t have an infinite number of values. If you can count a set of items, then it is a discrete variable.

    14.1.4 Frequency (fᵣ)

    This is the number of occasions in which a particular numerical value of the variate occurs in a sample.

    14.1.5 Mean (M) (Arithmetic Mean Average)

    The arithmetic mean is a mathematical representation of the typical value of a series of numbers, computed as the sum of all the numbers in the series divided by the count of all numbers in the series. The arithmetic mean is sometimes referred to as the average or simply as the mean.
    As an example: the sum of all of the numbers in a list divided by the number of items in that list. The mean of the numbers 2, 3, and 7 is 4, since 2 + 3 + 7 = 12 and 12 divided by 3 (there are three numbers) is 4.

    14.1.6 Mode

    The mode of a set of data values is the value that appears most often. It is the value x at which its probability mass function takes its maximum value. In other words, it is the value that is most likely to be sampled.
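Both definitions are available in Python's standard library; the sketch below reuses the 2, 3, 7 example for the mean, and an invented list of reject counts for the mode.

```python
from statistics import mean, mode

print(mean([2, 3, 7]))   # 4, as in the worked example above

# Mode: the most frequently occurring value, e.g. in a list of reject counts
rejects = [1, 2, 2, 3, 2, 4, 2, 1]
print(mode(rejects))     # 2
```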
  • Integrative Statistics for the Social and Behavioral Sciences
    2.    The standard deviation is sensitive to each score in the distribution. That is, since every score is included in the calculation of the standard deviation, if one score is changed at all, the value of the standard deviation is changed, unlike our first measure of variability, the range.
    3.    Like the mean, the standard deviation is not strongly influenced by sampling variation. If you (conceptually) take repeated samples of a population and calculate the standard deviation for each sample, the standard deviation will be relatively similar in each sample. That is, there will be very little variation in the estimates of standard deviation that is simply due to sampling choices.
    4.    The standard deviation—or, more specifically, variance—can be manipulated algebraically, which makes it useful for inferential statistics. Standard deviations cannot be added and averaged, whereas variances can be. Later, when we are working with multiple samples and therefore multiple estimates of variation, this property will become important.

    SYMMETRY

    Many statisticians prefer to get a first feel for their data by looking at them visually rather than with the descriptive statistics we have already discussed in this chapter. It is important to know what your data look like, to be able to later determine the appropriate ways to use statistics to test hypotheses. Central tendency is important (e.g., where is the center of your data relative to that of a control group?), as is the degree of variability. Another important consideration is
    symmetry:
    the degree to which the data are distributed equally above and below the center. The degree of symmetry will be important in assumptions for testing hypotheses.
    Symmetry: A distribution that is identically shaped on either side of the mean, or mirror images of one another.
    A frequency distribution graph can quickly demonstrate the range of the scores as well as the frequency of occurrence of each score in the sample or population and can even provide a preliminary indication of the central score. If a distribution of scores is unimodal (has only one mode) and symmetrical (identical on each side of the mean), then the mean = median = mode. This is a way to tell, from the descriptive statistics, whether your data are unimodal and symmetrical, an important characteristic in a later chapter. Note that in the following Excel example, the mean, median, and mode all equal 5.
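The mean = median = mode property of a unimodal, symmetrical distribution is easy to verify in code; the score list below is invented but symmetric about 5, analogous to the Excel example mentioned above.

```python
from statistics import mean, median, mode

# A unimodal, symmetrical set of scores centered on 5
scores = [3, 4, 4, 5, 5, 5, 6, 6, 7]

print(mean(scores), median(scores), mode(scores))   # 5 5 5
```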
  • Essentials of Business Research Methods
    • Joe F. Hair Jr., Michael Page, Niek Brunsveld(Authors)
    • 2019(Publication Date)
    • Routledge
      (Publisher)
    variance. It is useful for describing the variability of the distribution and is a good index of the degree of dispersion. The variance is equal to 0 if each and every respondent in the distribution is the same as the mean. The variance becomes larger as the observations tend to differ increasingly from one another and from the mean.
    Standard Deviation
    The variance is used often in statistics, but it does have a major drawback. The variance is a unit of measurement that has been squared. For example, if we measure the number of colas consumed in a day and wish to calculate an average for the sample of respondents, the mean will be the average number of colas, and the variance will be in squared numbers. To overcome the problem of having the measure of dispersion in squared units instead of the original measurement units, we use the square root of the variance, which is called the standard deviation. The standard deviation describes the spread or variability of the sample distribution values from the mean and is perhaps the most valuable index of dispersion.
    To obtain the squared deviation, we square the individual deviation scores before adding them (squaring a negative number produces a positive result). After the sum of the squared deviations is determined, the result is divided by the number of respondents minus 1. The number 1 is subtracted from the number of respondents to help produce an unbiased estimate of the standard deviation. If the estimated standard deviation is large, the responses in a sample distribution of numbers do not fall very close to the mean of the distribution. If the estimated standard deviation is small, you know that the distribution values are close to the mean.
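The computational steps just described (square the deviations, sum them, divide by the number of respondents minus 1, then take the square root) can be sketched in Python; the response values are hypothetical.

```python
responses = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical survey responses

m = sum(responses) / len(responses)              # the sample mean
squared = [(x - m) ** 2 for x in responses]      # square each deviation
variance = sum(squared) / (len(responses) - 1)   # divide by n - 1
std_dev = variance ** 0.5                        # square root of the variance

print(round(variance, 4))  # 7.5536
print(round(std_dev, 4))   # 2.7484
```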
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.