Standard Deviation

Standard deviation is a measure of the amount of variation or dispersion in a set of values. It indicates how much the values differ from the mean of the set. A higher standard deviation suggests greater variability, while a lower standard deviation indicates that the values are closer to the mean.

Written by Perlego with AI-assistance

11 Key excerpts on "Standard Deviation"

  • Probability, Statistics and Other Frightening Stuff
    • Alan Jones (Author)
    • 2018 (Publication Date)
    • Routledge (Publisher)
    So, perhaps using either AAD or MAD as an indicator of the Optimistic and Pessimistic Value limits is not such a mad idea after all!)

    3.4 Variance and Standard Deviation

    Probably the most widely used Measure of Dispersion is one known as the Standard Deviation. In order to understand what the Standard Deviation is, we are better beginning with the square of the Standard Deviation, which is known as the Variance.
    Definition 3.6 Variance of a Population
    The Variance of an entire set (population) of data values is a measure of the extent to which the data is dispersed around its Arithmetic Mean. It is calculated as the average of the squares of the deviations of each individual value from the Arithmetic Mean of all the values.
    For the Formula-philes: Definition of the Variance of a population
    Consider a range of n observations x₁, x₂, x₃, …, xₙ
    Note: the symbol σ² is in common usage to denote the Variance of a population. If this were the variance of a set of sample data, it is common practice to use the abbreviation s².
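The formula itself did not survive in this excerpt. Going by Definition 3.6 above (the average of the squared deviations from the Arithmetic Mean), it would presumably read as follows. The symbol μ for the population mean is an assumption, since the excerpt does not show the book's own notation.

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} \left( x_i - \mu \right)^2
```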
  • Essentials of Business Research Methods
    • Joe F. Hair Jr., Michael Page, Niek Brunsveld (Authors)
    • 2019 (Publication Date)
    • Routledge (Publisher)
    variance. It is useful for describing the variability of the distribution and is a good index of the degree of dispersion. The variance is equal to 0 if each and every respondent in the distribution is the same as the mean. The variance becomes larger as the observations tend to differ increasingly from one another and from the mean.
    Standard Deviation
    The variance is used often in statistics, but it does have a major drawback. The variance is a unit of measurement that has been squared. For example, if we measure the number of colas consumed in a day and wish to calculate an average for the sample of respondents, the mean will be the average number of colas, and the variance will be in squared numbers. To overcome the problem of having the measure of dispersion in squared units instead of the original measurement units, we use the square root of the variance, which is called the Standard Deviation. The Standard Deviation describes the spread or variability of the sample distribution values from the mean and is perhaps the most valuable index of dispersion.
    To obtain the squared deviation, we square the individual deviation scores before adding them (squaring a negative number produces a positive result). After the sum of the squared deviations is determined, the result is divided by the number of respondents minus 1. The number 1 is subtracted from the number of respondents to help produce an unbiased estimate of the population variance, and hence a better estimate of the Standard Deviation. If the estimated Standard Deviation is large, the responses in a sample distribution of numbers do not fall very close to the mean of the distribution. If the estimated Standard Deviation is small, you know that the distribution values are close to the mean.
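The procedure described above (square the deviations, add them, divide by the number of respondents minus 1, take the square root) can be sketched in a few lines of Python. The cola counts are made-up illustrative data, not figures from the book.

```python
import math
import statistics

# Hypothetical data: colas consumed per day by five respondents.
colas = [1, 2, 2, 3, 7]

n = len(colas)
mean = sum(colas) / n                                  # 15 / 5 = 3.0
squared_deviations = [(x - mean) ** 2 for x in colas]  # 4, 1, 1, 0, 16
sum_of_squares = sum(squared_deviations)               # 22.0
sample_variance = sum_of_squares / (n - 1)             # 22 / 4 = 5.5 (squared colas)
sample_sd = math.sqrt(sample_variance)                 # back in colas per day

# statistics.stdev also divides by n - 1, so the two values should agree.
assert math.isclose(sample_sd, statistics.stdev(colas))
print(sample_variance, round(sample_sd, 3))            # 5.5 2.345
```

The variance comes out in "squared colas", whereas the Standard Deviation is in the original unit, which is the point the passage makes.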
  • Biostatistics Decoded
    • A. Gouveia Oliveira (Author)
    • 2020 (Publication Date)
    • Wiley (Publisher)
    Actually, they pose so many problems that it is standard mathematical practice to square a value when one wants the sign removed. Let us apply that method to the mean deviation. Instead of using the absolute value of the differences about the mean, let us square those differences and average the results. We will get a quantity that is also a measure of dispersion. This quantity is called the variance. The way to compute the variance is, therefore, first to find the mean, then subtract each value from the mean, square the result, and add all those values. The resulting quantity is called the sum of squares about the mean, or just the sum of squares. Finally, we divide the sum of squares by the number of observations to get the variance. Because the differences are squared, the variance is also expressed as a square of the attribute’s units, something strange like mmol²/l². This is not a problem when we use the variance for calculations, but in presentations it would be rather odd to report squared units. To put things right we have to convert these awkward units into the original units by taking the square root of the variance. This new result is also a measure of dispersion and is called the Standard Deviation. As a measure of dispersion, the Standard Deviation is single valued and stable, but what can be said about its interpretability? Let us see: the Standard Deviation is the square root of the average of the squared differences between individual values and the mean. It is not easy to understand what this quantity really represents. However, the Standard Deviation is the most popular of all measures of dispersion. Why is that? One important reason is that the Standard Deviation has a large number of interesting mathematical properties. The other important reason is that, actually, the Standard Deviation has a straightforward interpretation, very much along the lines given earlier to the value of the mean deviation.
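As a rough illustration of the steps in this excerpt, the sketch below computes the mean deviation, the sum of squares, the variance (dividing by the number of observations, as the text does) and the Standard Deviation. The measurements are invented values in mmol/l.

```python
import math
import statistics

# Hypothetical measurements in mmol/l.
values = [4.5, 5.0, 5.5, 6.0, 9.0]

n = len(values)
mean = sum(values) / n                                # 6.0 mmol/l
deviations = [x - mean for x in values]               # -1.5, -1.0, -0.5, 0.0, 3.0

# Mean deviation: average of the absolute differences about the mean.
mean_deviation = sum(abs(d) for d in deviations) / n  # 6.0 / 5 = 1.2 mmol/l

# Variance: square the differences instead, then average them.
sum_of_squares = sum(d ** 2 for d in deviations)      # 12.5
variance = sum_of_squares / n                         # 2.5 mmol^2/l^2
standard_deviation = math.sqrt(variance)              # back in mmol/l

# statistics.pvariance uses the same divide-by-n definition.
assert math.isclose(variance, statistics.pvariance(values))
print(mean_deviation, variance, round(standard_deviation, 3))  # 1.2 2.5 1.581
```

Both measures end up in the original units. The single large deviation of 3.0 weighs more heavily in the Standard Deviation than in the mean deviation because it is squared before averaging.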
  • Statistical Techniques for Data Analysis
    2 . The symbol V is sometimes used to designate variance.
    Ordinarily one is not dealing with a population, but rather with a sample of n individuals of the population. The individual measured values may be indicated by the symbols X₁, X₂, …, Xₙ.
    The sample mean, X̄ (called X bar), calculated as shown in the figure, is hopefully a good estimate of the population mean; that is why the measurements were made in the first place! One can calculate the sample Standard Deviation, s, using the formula shown in the figure. Likewise one can calculate the sample variance, s², as shown. Of course, one can use a simple calculator to do this, as indicated by PUSH BUTTON. This convenience is good because arithmetic is hard work and one may make mistakes. However, one should make a few calculations by the formula just to understand and appreciate what the calculator is doing.
    Remember that s is an estimate of the Standard Deviation of the population and that it is not σ. It is often called “the Standard Deviation”, maybe because the term ‘estimate of the Standard Deviation’ is cumbersome. It is, of course, the sample-based Standard Deviation but that term is also cumbersome. The Standard Deviation and its estimates always have the same units as those for X. When considering variability, a dimensionless quantity, the coefficient of variation, cv, is frequently encountered. It is simply
    Figure 4.1. Population values and sample estimates.
    If one knows cv and the level, X, s can be calculated.
    Figure 4.2. Distribution of means.
    Another term frequently used is called the relative Standard Deviation, RSD, and it is calculated as
    RSD = cv×100
    The relative Standard Deviation is thus expressed as a percent. There could be room for confusion when results are reported on a percentage basis, as the percentage of sulfur in a coal sample, for example. Here the value for s could be in units of percent. In such cases, one can make the distinction by using the terms % relative and % absolute.
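The formula for cv is missing from this excerpt. The sketch below assumes the usual definition, cv = s divided by the sample mean, and then applies RSD = cv × 100 as given above. The sulfur percentages are invented to echo the coal example.

```python
import math
import statistics

# Hypothetical replicate results: percent sulfur in a coal sample.
sulfur_percent = [1.9, 2.0, 2.1]

x_bar = statistics.mean(sulfur_percent)   # about 2.0 (% sulfur)
s = statistics.stdev(sulfur_percent)      # about 0.1 (% sulfur, absolute)

cv = s / x_bar                            # assumed definition: dimensionless
rsd = cv * 100                            # about 5 (% relative)

# Knowing cv and the level, s can be recovered, as the text notes.
s_recovered = cv * x_bar
assert math.isclose(s, s_recovered)
print(round(s, 3), round(cv, 3), round(rsd, 1))
```

Here s is about 0.1 % absolute while the RSD is about 5 % relative, which is exactly the kind of ambiguity the passage warns about when the measured quantity is itself a percentage.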
  • Integrative Statistics for the Social and Behavioral Sciences
    2.    The Standard Deviation is sensitive to each score in the distribution. That is, since every score is included in the calculation of the Standard Deviation, if one score is changed at all, the value of the Standard Deviation is changed, unlike our first measure of variability, the range.
    3.    Like the mean, the Standard Deviation is not strongly influenced by sampling variation. If you (conceptually) take repeated samples of a population and calculate the Standard Deviation for each sample, the Standard Deviation will be relatively similar in each sample. That is, there will be very little variation in the estimates of Standard Deviation that is simply due to sampling choices.
    4.    The Standard Deviation, or more specifically the variance, can be manipulated algebraically, which makes it useful for inferential statistics. Standard Deviations cannot be added and averaged, whereas variances can be. Later, when we will be working with multiple samples and therefore multiple estimates of variation, this property will become important.
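Property 2 in the list above can be checked with a small sketch: changing a single interior score leaves the range untouched but shifts the Standard Deviation. The scores are invented for illustration.

```python
import statistics

def score_range(scores):
    """Range: largest score minus smallest score."""
    return max(scores) - min(scores)

original = [2, 4, 6, 8, 10]
changed = [2, 4, 7, 8, 10]   # only the middle score differs

range_original = score_range(original)   # 8
range_changed = score_range(changed)     # 8, unaffected by the change

sd_original = statistics.stdev(original)  # sqrt(40 / 4)
sd_changed = statistics.stdev(changed)    # sqrt(40.8 / 4)

print(range_original, range_changed)
print(round(sd_original, 3), round(sd_changed, 3))
```

The range depends only on the two extreme scores, so it only reacts when one of those moves. The Standard Deviation uses every score.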

    SYMMETRY

    Many statisticians prefer to get a first feel for their data by looking at them visually rather than with the descriptive statistics we have already discussed in this chapter. It is important to know what your data look like, to be able to later determine the appropriate ways to use statistics to test hypotheses. Central tendency is important (e.g., where is the center of your data relative to that of a control group?), as is the degree of variability. Another important consideration is symmetry: the degree to which the data are distributed equally above and below the center. The degree of symmetry will be important in assumptions for testing hypotheses.
    Symmetry: A distribution that is identically shaped on either side of the mean, or mirror images of one another.
    A frequency distribution graph can quickly demonstrate the range of the scores as well as the frequency of occurrence of each score in the sample or population and can even provide a preliminary indication of the central score. If a distribution of scores is unimodal (has only one mode) and symmetrical (identical on each side of the mean), then the mean = median = mode. This is a way to tell, from the descriptive statistics, whether your data are unimodal and symmetrical, an important characteristic in a later chapter. Note that in the following Excel example, the mean, median, and mode all equal 5.
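The Excel example is not reproduced in this excerpt. As a stand-in, here is a minimal Python sketch with invented scores: one unimodal, symmetrical set whose mean, median and mode all equal 5, and one skewed set where the mean is pulled away from the other two.

```python
import statistics

# Hypothetical scores, symmetrical around 5 with a single mode.
symmetric = [3, 4, 4, 5, 5, 5, 6, 6, 7]

# Same scores, but the largest value is replaced by an extreme one.
skewed = [3, 4, 4, 5, 5, 5, 6, 6, 16]

for label, scores in (("symmetric", symmetric), ("skewed", skewed)):
    print(
        label,
        statistics.mean(scores),
        statistics.median(scores),
        statistics.mode(scores),
    )
# symmetric 5 5 5
# skewed 6 5 5
```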
  • Statistics: The Essentials for Research
    original score units.
    The symbol for the Standard Deviation is σ, a lower case Greek sigma. The formula for the Standard Deviation is simply the square root of the formula for the variance.
    Formula 4.4 Standard Deviation
    The most convenient formula for computing σ is found by taking the square root of the raw-score formula for computing the variance. This formula becomes:
    Formula 4.5 Computing formula for the Standard Deviation
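Formulas 4.4 and 4.5 are not reproduced in this excerpt. The sketch below assumes the common forms: the definitional formula σ = sqrt(Σ(X − μ)² / N) and the raw-score formula σ = sqrt(ΣX² / N − (ΣX / N)²). The book's exact notation may differ, and the scores are invented.

```python
import math

# Hypothetical ungrouped scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

# Definitional form: average squared deviation from the mean, then square root.
mean = sum(scores) / n                                     # 40 / 8 = 5.0
sigma_definitional = math.sqrt(
    sum((x - mean) ** 2 for x in scores) / n               # 32 / 8 = 4.0
)

# Raw-score (computing) form: needs only the sum and the sum of squares.
sum_x = sum(scores)                                        # 40
sum_x_squared = sum(x ** 2 for x in scores)                # 232
sigma_raw_score = math.sqrt(sum_x_squared / n - (sum_x / n) ** 2)  # sqrt(29 - 25)

print(sigma_definitional, sigma_raw_score)                 # 2.0 2.0
```

The raw-score form is convenient by hand because it avoids computing each deviation. In floating-point code it can lose precision when the mean is large relative to the spread, so the definitional form is usually the safer choice there.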
    We shall calculate the Standard Deviation for the distributions in Tables 4.4 and 4.5 . In the first table, the data are ungrouped, just as they might have come from an experiment; in the second table a different and larger set of data has already been grouped.
    Table 4.4 Calculation of the Standard Deviation from Ungrouped Data
    Table 4.5 Calculation of the Standard Deviation from Grouped Data
    We have discussed three measures of variability: R, σ², and σ, emphasizing the latter two. The reason for this emphasis is that σ² and σ have some very important mathematical properties; consequently, these measures of dispersion are much more widely used than R.
  • Introducing Social Statistics
    • Richard Startup, Elwyn T. Whittaker (Authors)
    • 2021 (Publication Date)
    • Routledge (Publisher)
    $$\text{mean deviation} = \frac{\sum_{i=1}^{k} f_i \,\lvert x_i - \bar{x} \rvert}{\sum_{i=1}^{k} f_i} \qquad (3.3)$$
    In our experience, it is not very often that one needs to make use of the mean deviation. Though it is a useful descriptive measure of variation, it suffers from two major defects. First, deviations taken irrespective of sign are not easily manipulated algebraically. Secondly, the mean deviation is not easily interpreted theoretically, so it is not very suitable for use in further statistical work. Part of its value here has been as a stepping-stone to help us reach the measures of variation which are theoretically superior.
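A minimal sketch of the mean deviation in (3.3) for a frequency table follows. The values and frequencies are invented, and the variable names are illustrative only. The sketch also shows why the sign has to be removed: the signed deviations from the mean always sum to zero.

```python
# Hypothetical frequency table: value x_i occurs f_i times.
values = [1, 2, 3, 4, 5]
frequencies = [2, 3, 5, 3, 2]

total = sum(frequencies)                                              # 15
mean = sum(f * x for f, x in zip(frequencies, values)) / total       # 45 / 15 = 3.0

# Signed deviations cancel out, so they cannot measure spread on their own.
signed_sum = sum(f * (x - mean) for f, x in zip(frequencies, values))  # 0.0

# Mean deviation: frequency-weighted average of the absolute deviations.
mean_deviation = (
    sum(f * abs(x - mean) for f, x in zip(frequencies, values)) / total
)                                                                     # 14 / 15

print(mean, signed_sum, round(mean_deviation, 4))                     # 3.0 0.0 0.9333
```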

    The Variance and the Standard Deviation

    If we are unable to use the moduli of the deviations from the mean, how else can the problem of the zero-sum be overcome? The mathematician will quickly provide the answer. Square the deviations, he will say, and then find the mean of these squared deviations. The quantity we thus obtain is called the variance, and it is the basic measure of variation in all but the most elementary statistical work. In the Σ notation, for raw data
    $$\text{variance} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} \qquad (3.4)$$
    In verbal terms it is the mean squared deviation (i.e., the mean of the squares of the deviations). The variance is perhaps the most important measure in statistics, but there is one major problem that may be encountered in trying to use it descriptively. The original observations (the x-values) will have been in certain units – possibly years or pounds sterling or marks expressed as percentages. The arithmetic mean will also have been in these same units, and so will the deviations from the mean. But the variance, based on squared deviations, will be in units squared and thus will not be comparable with the original observations. Even though it has certain vital properties which will necessitate its use later in this book, for the time being we need a measure which is in the same units as our data, and to this end we take the square root of the variance. The new measure, called the Standard Deviation, and given the symbol S
  • Mastering Corporate Finance Essentials: The Critical Quantitative Methods and Tools in Finance
    • Stuart A. McCrary (Author)
    • 2010 (Publication Date)
    • Wiley (Publisher)
    The median is an alternative to the mean for establishing the general magnitude of data. The median is the point in a population or sample for which half of the remaining data is larger than the median and half of the remaining data is smaller than the median. In many cases, the mean and the median are approximately equal.
    Most applications of statistics in finance use the mean rather than the median. The examples in this book use the mean in all cases.

    STANDARD DEVIATION MEASURES THE NOISE

    The mean simplifies a collection of observations, but much of the detail is lost. Without seeing the individual observations, it is impossible to tell whether most observations are approximately equal to the mean or whether they vary tremendously. The most common measures of dispersion are variance and the closely related statistic called Standard Deviation. The observations in Figure 2.1 spread out over a mean of 9.5 percent. In fact, observations near 9.5 percent (including values slightly above and slightly below 9.5 percent) are the most commonly occurring observations. Observations below 8.5 percent or above 10.5 percent occur less frequently. Observations below 7.5 percent or above 11.5 percent are rare.
    FIGURE 2.1 Yields with 9.50 percent Mean and 1 percent Standard Deviation
    Figure 2.2 shows a similar set of observations. These observations are also centered around a mean of 9.5 percent. However, observations 1 or 2 percent above or below the mean are common.
    Table 2.1 is a histogram of the 1,000 returns used to create Figure 2.1 and Figure 2.2. The first line contains a return of 2.5 percent followed by a count of zero instances of a return of 2.5 percent or lower in both Figures 2.1 and 2.2. The following line contains a count of the number of returns between 2.5 percent and 3.5 percent. Figure 2.1 contains no returns between 2.5 percent and 3.5 percent, but Figure 2.2 contains four returns in that range.
    Notice that the number of returns close to 9.5 percent is higher for the returns in Figure 2.1 than for those in Figure 2.2.
    Figure 2.3 displays the count of returns in Table 2.1 visually. As with Table 2.1, it is clear that the returns in Figure 2.2 span a wider range. In other words, although the returns average about 9.5 percent in both cases, the chance that a return differs from the mean is much higher for the data in Figure 2.2 than for the data in Figure 2.1.
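The figures and table are not reproduced here, but the comparison can be imitated with simulated data. The sketch below draws two sets of 1,000 returns around 9.5 percent. The 1 percent Standard Deviation matches the caption of Figure 2.1. The 2 percent Standard Deviation for the wider set, the normal distribution and the helper name `share_within` are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so repeated runs give the same draws

# Simulated returns in percent; both sets are centred on 9.5.
narrow = [rng.gauss(9.5, 1.0) for _ in range(1000)]
wide = [rng.gauss(9.5, 2.0) for _ in range(1000)]

def share_within(data, centre, half_width):
    """Fraction of observations no further than half_width from centre."""
    return sum(abs(x - centre) <= half_width for x in data) / len(data)

for label, data in (("narrow", narrow), ("wide", wide)):
    print(
        label,
        "mean", round(statistics.mean(data), 2),
        "sd", round(statistics.stdev(data), 2),
        "share within 0.5 of 9.5:", share_within(data, 9.5, 0.5),
    )
```

Both sets should average close to 9.5 percent, while a noticeably smaller share of the wider set falls near the mean. That is the pattern the histogram in the excerpt describes.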
  • Understanding Statistics in the Behavioral Sciences
    • Roger Bakeman, Byron F. Robinson (Authors)
    • 2005 (Publication Date)
    • Psychology Press (Publisher)
    Standard Deviation. To compute the Standard Deviation for a sample of scores, first compute (a) the variance and then compute (b) its square root. Symbolically, the sample Standard Deviation for the Y scores is:
    $$SD_Y = \sqrt{VAR_Y} = \sqrt{\frac{\sum (Y_i - M_Y)^2}{N}} \quad (\text{where } i = 1, N) \qquad (5.6)$$
    The variance is an average sum of squares, so the units for variance are the units for the initial scores squared. For example, if the initial scores were feet, then variance would be measured in square feet and analogous to a measurement of area. Taking the square root of the variance, which is how the Standard Deviation is computed, means the units for the Standard Deviation are again the same as those used initially. Thus the Standard Deviation is like an average error, expressed in the same units as those used for the raw scores. The larger the deviations from the mean are (i.e., the more scores are spread out instead of clustering near the mean), the larger is the Standard Deviation.
    You may wonder why the square root of the variance has come to be called the standard deviation; why it is used to represent an average error? Certainly the average of the absolute values of the deviation scores (column D in the Fig. 5.3 spreadsheet) would be a logical candidate for a measure of the typical deviation or error. The reason the Standard Deviation has become standard has to do with its technical statistical properties. These properties are not shared by the mean absolute deviation. For now, and until you read further, accept this on faith but know that there are reasons for this that experts find acceptable.

    Sample and Population Standard Deviation

    Equation 5.6
  • Uncertainty Analysis of Experimental Data with R
    For example, how many possible values of temperature are there in the temperature range of 0–100 K? Of course, there are an infinite number of values in this range (or any finite temperature range). Because we cannot measure all of these values, we take a sample composed of a finite number of measurements and use this sample to infer characteristics about the population. There is also the possibility that the population is composed of discrete elements, though this is not common in physical science and engineering, and we will not consider this here.
    When we are analyzing data (i.e., samples), we are usually interested, at least in the beginning, in the following: calculating the mean, median, Standard Deviation, and variance of a sample; evaluating covariances and correlations between variables; visualizing the data to get a sense of its distribution and trends; using the sample to estimate statistics of the population; and identifying potential outliers. We consider these topics, and others, in the following.

    3.2 Mean, Median, Standard Deviation, and Variance of a Sample

    The sample mean (arithmetic average) $\bar{x}$ is defined as
    $$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad (3.1)$$
    where $N$ is the number of data points and $x_i$ is data point $i$. The median $x_m$ is found by ordering the data from largest to smallest and then defining $x_m$ to be the middle data point if $N$ is an odd number or the arithmetic average of the two middle data points if $N$ is even, i.e.,
    $$x_m = x_{(N+1)/2}, \quad N \text{ odd}, \qquad (3.2)$$
    $$x_m = \frac{x_{N/2} + x_{N/2+1}}{2}, \quad N \text{ even}. \qquad (3.3)$$
    The variance of a sample, $s_x^2$, is defined as
    $$s_x^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2, \qquad (3.4)$$
    and the sample Standard Deviation $s_x$ is simply the square root of the variance:
    $$s_x = \left(s_x^2\right)^{1/2} = \left(\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2\right)^{1/2}. \qquad (3.5)$$
    The Standard Deviation is a measure of the average spread of the data about the mean. It is to be noted that in some texts, the N − 1 terms in these equations are replaced with N.
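The book works in R. Purely as an illustration, here is a Python sketch of the median and sample variance definitions above. The function names, the `ddof` parameter and the readings are hypothetical. The data are sorted from largest to smallest to follow the text, although ascending order gives the same median.

```python
import math
import statistics

def sample_median(data):
    """Median of the ordered data; the book's indices are 1-based."""
    ordered = sorted(data, reverse=True)   # largest to smallest, as in the text
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]                  # middle point, N odd
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2    # two middle points, N even

def sample_variance(data, ddof=1):
    """Sum of squared deviations divided by N - ddof (N - 1 by default)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - ddof)

readings = [10.0, 12.0, 14.0, 16.0]        # hypothetical measurements

s2 = sample_variance(readings)             # 20 / 3, the N - 1 form
s = math.sqrt(s2)                          # sample Standard Deviation
s2_n = sample_variance(readings, ddof=0)   # 20 / 4 = 5.0, the N form some texts use

assert math.isclose(s2, statistics.variance(readings))
assert math.isclose(s2_n, statistics.pvariance(readings))
print(sample_median(readings), round(s2, 3), round(s, 3), s2_n)   # 13.0 6.667 2.582 5.0
```

The gap between the N − 1 and N versions shrinks as the sample grows, which is why the choice matters most for small samples.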
  • 5 lb. Book of GRE Practice Problems, Fourth Edition: 1,800+ Practice Problems in Book and Online (Manhattan Prep 5 lb)
    X.
    The second statement is not true. The probability that any normally distributed variable falls within 2 Standard Deviations of its mean is the same, approximately 0.14 + 0.34 + 0.34 + 0.14 = 0.96, or 96%. Memorize this value for the GRE.
    The third statement is true. The mean of a normal curve is the point along the horizontal axis below the “peak” of the curve. The highest point of curve B is clearly to the right of the highest point of curve A, so the mean of Y is larger than the mean of X. Notice that the mean has nothing to do with the height of the normal curve, which only corresponds to how tightly the variable is gathered around the mean (i.e., how small the Standard Deviation is).
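The roughly 96% quoted for the second statement comes from the rounded figures used on the GRE. For comparison, the probability that a normal variable falls within k Standard Deviations of its mean can be computed from the error function as erf(k / √2). The helper name `prob_within` is hypothetical.

```python
import math

def prob_within(k):
    """P(|X - mean| <= k standard deviations) for any normal variable X."""
    return math.erf(k / math.sqrt(2))

p1 = prob_within(1)
p2 = prob_within(2)
print(round(p1, 4), round(p2, 4))   # 0.6827 0.9545
```

The result does not depend on the particular mean or Standard Deviation, which is why the excerpt says the probability is the same for every normally distributed variable.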
    23. (A). There are 400 test scores distributed among 50 possible outcomes (integers between 151 and 200, inclusive, which number 200 − 151 + 1 = 50 integers). There is an average of 400 ÷ 50 = 8 scores per integer outcome, and there are 400 ÷ 100 = 4 scores in each percentile. So, if all the scores were completely evenly distributed with exactly 8 scores per integer, there would be two percentile groups per integer outcome (0th and 1st percentiles at 151, 2nd and 3rd percentiles at 152, etc.). In that case, all 50 integers from 151 to 200 would correspond to more than one percentile group.
    Reduce the number of integers corresponding to more than one percentile group by bunching up the scores. Imagine that everyone gets a 157. Then that integer is the only one that corresponds to more than one percentile group (it corresponds to all 100 groups, in fact). However, don’t reduce further this way. This gives exactly 1 integer, so the minimum number of integers corresponding to more than one percentile group is 1, which is Quantity A.
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.