Variance and Standard Deviation

Variance and standard deviation are measures of the spread or dispersion of a set of data points. Variance quantifies the average squared difference of each data point from the mean, while standard deviation is the square root of the variance and provides a measure of how much the data deviates from the mean. In business, these measures are used to assess risk and variability in financial data.

Written by Perlego with AI-assistance

12 Key excerpts on "Variance and Standard Deviation"

  • Essentials of Business Research Methods
    • Joe F. Hair Jr., Michael Page, Niek Brunsveld(Authors)
    • 2019(Publication Date)
    • Routledge
      (Publisher)
    variance. It is useful for describing the variability of the distribution and is a good index of the degree of dispersion. The variance is equal to 0 if each and every respondent in the distribution is the same as the mean. The variance becomes larger as the observations tend to differ increasingly from one another and from the mean.
    Standard Deviation
    The variance is used often in statistics, but it does have a major drawback. The variance is a unit of measurement that has been squared. For example, if we measure the number of colas consumed in a day and wish to calculate an average for the sample of respondents, the mean will be the average number of colas, and the variance will be in squared numbers. To overcome the problem of having the measure of dispersion in squared units instead of the original measurement units, we use the square root of the variance, which is called the standard deviation. The standard deviation describes the spread or variability of the sample distribution values from the mean and is perhaps the most valuable index of dispersion.
    To obtain the sum of squared deviations, we square the individual deviation scores before adding them (squaring a negative number produces a positive result). The sum is then divided by the number of respondents minus 1, which gives the sample variance; its square root is the estimated standard deviation. The number 1 is subtracted from the number of respondents to help produce an unbiased estimate of the population variance. If the estimated standard deviation is large, the responses in a sample distribution of numbers do not fall very close to the mean of the distribution. If the estimated standard deviation is small, the distribution values are close to the mean.
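    The procedure described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' own code: the function names and the cola data are hypothetical, chosen only to echo the example in the text.

```python
from math import sqrt


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    # Squaring makes every deviation non-negative before summing.
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / (n - 1)


def sample_std_dev(values):
    """Square root of the sample variance, in the original units."""
    return sqrt(sample_variance(values))


# Hypothetical data: colas consumed per day by five respondents.
colas = [1, 2, 2, 3, 7]
variance_in_squared_colas = sample_variance(colas)
std_dev_in_colas = sample_std_dev(colas)
```

    The variance here is in "squared colas", while the standard deviation is back in colas per day, which is the drawback and the remedy the excerpt describes.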
  • Probability, Statistics and Other Frightening Stuff
    • Alan Jones(Author)
    • 2018(Publication Date)
    • Routledge
      (Publisher)
    Figure 3.6 .)
    Examination of the Variance formula highlights that each term in the summation series is merely defining an area of a square sitting on the diagonal defined by the distance of each point from the Arithmetic Mean of all points. The Variance is simply the area of the average square (highlighted in yellow in Figure 3.6). The Standard Deviation, as the Square Root of the Variance, is simply the length of a side of this average area which defines the Variance. Table 3.4 shows the calculations.
    Figure 3.6: Basis of the Variance Calculation
    Table 3.4: Example of Variance and Standard Deviation Calculation
    Definition 3.7 Standard Deviation of a Population
    The Standard Deviation of an entire set (population) of data values is a measure of the extent to which the data is dispersed around its Arithmetic Mean. It is calculated as the square root of the Variance, which is the average of the squares of the deviations of each individual value from the Arithmetic Mean of all the values.
    For the Formula-philes: Definition of the Standard Deviation of a population
    Consider a range of n observations $x_1, x_2, x_3, \ldots, x_n$
    Note: the symbol, σ, is one that is in common usage to portray the Standard Deviation of a population. If this were the variance of a set of sample data, it is common practice to use the abbreviation, s
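    The formula itself is not reproduced in this excerpt. Going only by the wording of Definition 3.7, it can be written as below; the symbol $\bar{x}$ for the Arithmetic Mean is our own notation and may differ from the book's.

```latex
\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2},
\qquad \text{where} \quad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
```

    Each term $(x_i - \bar{x})^2$ is the area of one of the squares described above, and the quantity under the root is the area of the average square.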
    Figure 3.7
    Equally Dispersed Data Around Different Means
    As an area, the unit of measurement of the Variance is the square of the unit of measurement of the raw data, but the unit of measurement of the Standard Deviation is the same as that of the raw data.
    For both the Standard Deviation and the Variance, the values only have true meaning in relation to the value of the Arithmetic Mean. Low values of Variance and Standard Deviation only imply tightly nested data if the value of the Arithmetic Mean is large. Even then, the combination of Arithmetic Mean and Variance/Standard Deviation does not tell the whole story. Figure 3.7 illustrates two Normal Distributions (see Chapter 4
  • The ASQ Certified Quality Auditor Handbook
    range is the simplest measure of dispersion. It is the difference between the maximum and minimum values in an observed data set. Since it is based on only two values from a data set, the measurement of range is most useful when the number of observations or values is small (ten or fewer).
    Standard Deviation
    Standard deviation, the most important measure of variation, measures the extent of dispersion around the zone of central tendency. For samples from a normal distribution, it is defined as the square root of the sum of the squared differences between each observed value and the arithmetic mean (numerator), divided by the total number of observations minus one (denominator). The standard deviation of a sample of data is given as:
    $$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
    s = standard deviation
    n = number of samples (observations or data points)
    X = value measured
    $\bar{X}$ = average value measured
    Coefficient of Variation
    The final measure of dispersion, coefficient of variation, is the standard deviation divided by the mean. Variation is the guaranteed existence of a difference between any two items or observations. The concept of variation states that no two observed items will ever be identical.
    Frequency Distributions
    A frequency distribution is a tool for presenting data in a form that clearly demonstrates the relative frequency of the occurrence of values as well as the central tendency and dispersion of the data. Raw data are divided into classes to determine the number of values in a class or class frequency. The data are arranged by classes, with the corresponding frequencies in a table called a frequency distribution. When organized in this manner, the data are referred to as grouped data, as in Table 20.1 .
    The data in this table appear to be normally distributed. Even without constructing a histogram or calculating the average, the values appear to be centered around the value 18. In fact, the arithmetic average of these values is 18.02.
    The histogram in Figure 20.1
  • Mastering Corporate Finance Essentials
    The Critical Quantitative Methods and Tools in Finance

    • Stuart A. McCrary(Author)
    • 2010(Publication Date)
    • Wiley
      (Publisher)
    Figure 2.2 sample corresponds closely with the population average of 9.5 percent and a standard deviation of 2 percent.
    Although this standard deviation is twice as high as the standard deviation of the data in Figure 2.1, the data still center around 9.5 percent and are clustered near that mean, but not as tightly as the data in Figure 2.1. Taken together, the mean and the standard deviation summarize the magnitude of the observations and the general level of dispersion.

    ANNUALIZING VARIANCE AND STANDARD DEVIATION ESTIMATES

    The data series may be observed daily, monthly, or annually. A variance calculated from annual data will be higher than the same calculation performed on monthly or daily data. For data such as securities prices or interest rates, it is typical for the variance calculated from annual returns to be approximately 12 times larger than the variance calculated from monthly returns. Similarly, the monthly variance will be 20 to 22 times larger than the variance of daily returns (if there are 20 to 22 business days in a month).
    Therefore:
    $$\sigma^2_{\text{annual}} \approx 12\,\sigma^2_{\text{monthly}} \approx 251\,\sigma^2_{\text{daily}} \tag{2.6a}$$
    Note that there are about 251 days per year when markets are open. The number of business days per year differs around the world, and practitioners may make slightly different assumptions in adjusting variances and standard deviations for the length of time.
    Since the standard deviation is the square root of the variance in Equation 2.6a, Equation 2.6b shows the relationship between the standard deviation of data observed annually, monthly, and daily.
    $$\sigma_{\text{annual}} \approx \sqrt{12\,\sigma^2_{\text{monthly}}} \approx \sqrt{251\,\sigma^2_{\text{daily}}} \tag{2.6b}$$
    When simplified, Equation 2.6c shows that the standard deviation measured annually can be estimated from data observed daily or monthly if the daily and monthly measurements are adjusted by the square root of the measure of time.
    $$\sigma_{\text{annual}} \approx \sqrt{12}\,\sigma_{\text{monthly}} \approx \sqrt{251}\,\sigma_{\text{daily}} \tag{2.6c}$$
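    The square-root-of-time adjustment described in this excerpt can be sketched as follows. The function names, constants, and the 1 percent daily figure are hypothetical; the 251-day count is the one quoted in the text, and the scaling implicitly assumes that returns in different periods are uncorrelated.

```python
from math import sqrt

TRADING_DAYS_PER_YEAR = 251  # figure quoted in the text; conventions vary
MONTHS_PER_YEAR = 12


def annualize_variance(variance, periods_per_year):
    """Variance scales linearly with the number of periods."""
    return variance * periods_per_year


def annualize_std_dev(std_dev, periods_per_year):
    """Standard deviation scales with the square root of the number of periods."""
    return std_dev * sqrt(periods_per_year)


# Hypothetical input: a 1% standard deviation of daily returns.
daily_sd = 0.01
annual_sd = annualize_std_dev(daily_sd, TRADING_DAYS_PER_YEAR)
annual_var = annualize_variance(daily_sd ** 2, TRADING_DAYS_PER_YEAR)
```

    Annualizing the variance and then taking its square root gives the same answer as annualizing the standard deviation directly, which is the relationship the three equations in this passage express.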

    The Normal Distribution

    The familiar “bell curve” (also known as the Gaussian distribution) is commonly used in finance. Most of us first heard about the normal distribution when a teacher explained why there had to be an uncomfortably large number of C’s in a particular class and a surprisingly small number of A’s. In short, most of the class was getting a B because our grades were clustered together. The normal curve recognizes that tendency for data to fall close together near the mean.
  • Introducing Social Statistics
    • Richard Startup, Elwyn T. Whittaker(Authors)
    • 2021(Publication Date)
    • Routledge
      (Publisher)
    $$\text{mean deviation} = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{\sum_{i=1}^{k} f_i} \tag{3.3}$$
    In our experience, it is not very often that one needs to make use of the mean deviation. Though it is a useful descriptive measure of variation, it suffers from two major defects. First, deviations taken irrespective of sign are not easily manipulated algebraically. Secondly, the mean deviation is not easily interpreted theoretically, so it is not very suitable for use in further statistical work. Part of its value here has been as a stepping-stone to help us reach the measures of variation which are theoretically superior.

    The Variance and the Standard Deviation

    If we are unable to use the moduli of the deviations from the mean, how else can the problem of the zero-sum be overcome? The mathematician will quickly provide the answer. Square the deviations, he will say, and then find the mean of these squared deviations. The quantity we thus obtain is called the variance , and it is the basic measure of variation in all but the most elementary statistical work. In the Σ notation, for raw data
    $$\text{variance} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} \tag{3.4}$$
    In verbal terms it is the mean squared deviation (i.e., the mean of the squares of the deviations). The variance is perhaps the most important measure in statistics, but there is one major problem that may be encountered in trying to use it descriptively. The original observations (the x-values) will have been in certain units – possibly years or pounds sterling or marks expressed as percentages. The arithmetic mean will also have been in these same units, and so will the deviations from the mean. But the variance, based on squared deviations, will be in units squared and thus will not be comparable with the original observations. Even though it has certain vital properties which will necessitate its use later in this book, for the time being we need a measure which is in the same units as our data, and to this end we take the square root of the variance. The new measure, called the standard deviation , and given the symbol S
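    A small numerical sketch may help connect the zero-sum problem, the mean deviation, and the variance discussed in this excerpt. The marks below are hypothetical, and the variance divides by n, following the raw-data formula in the text.

```python
from math import sqrt

# Hypothetical marks, expressed as percentages.
marks = [40, 50, 60, 70, 80]
n = len(marks)
mean = sum(marks) / n
deviations = [x - mean for x in marks]

# Signed deviations cancel out, which is the zero-sum problem.
sum_of_deviations = sum(deviations)

# Taking moduli avoids the cancellation (units: percent).
mean_deviation = sum(abs(d) for d in deviations) / n

# Squaring also avoids it, but the result is in percent squared.
variance = sum(d ** 2 for d in deviations) / n

# The square root brings the measure back to percent.
std_dev = sqrt(variance)
```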
  • Statistics
    The Essentials for Research

    original score units.
    The symbol for the standard deviation is σ, a lower case Greek sigma. The formula for the standard deviation is simply the square root of the formula for the variance.
    Formula 4.4 Standard Deviation
    The most convenient formula for computing σ is found by taking the square root of the raw-score formula for computing the variance. This formula becomes:
    Formula 4.5 Computing formula for the standard deviation
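    Formulas 4.4 and 4.5 are not reproduced in this excerpt. The forms below are the commonly used population versions and are offered only as an assumption about what the book shows; the symbols N (number of scores), X (a raw score), and μ (the mean) are our own choice.

```latex
% Definitional form: square root of the mean squared deviation
\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}

% Raw-score (computing) form, algebraically equivalent
\sigma = \sqrt{\frac{\sum X^2}{N} - \left( \frac{\sum X}{N} \right)^2}
```

    The second form needs only the sum of the scores and the sum of their squares, which is why raw-score formulas of this kind are convenient for hand calculation.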
    We shall calculate the standard deviation for the distributions in Tables 4.4 and 4.5 . In the first table, the data are ungrouped, just as they might have come from an experiment; in the second table a different and larger set of data has already been grouped.
    Table 4.4 Calculation of the Standard Deviation from Ungrouped Data
    Table 4.5 Calculation of the Standard Deviation from Grouped Data
    We have discussed three measures of variability: R, σ², and σ, emphasizing the latter two. The reason for this emphasis is that σ² and σ have some very important mathematical properties; consequently, these measures of dispersion are much more widely used than R
  • Statistical Techniques for Data Analysis
    2 . The symbol V is sometimes used to designate variance.
    Ordinarily one is not dealing with a population, but rather with a sample of n individuals of the population. The individual measured values may be indicated by the symbols $X_1, X_2, \ldots, X_n$.
    The sample mean, $\bar{X}$ (called X bar), calculated as shown in the figure, is hopefully a good estimate of the population mean; that is why the measurements were made in the first place! One can calculate the sample standard deviation, s, using the formula shown in the figure. Likewise one can calculate the sample variance, s², as shown. Of course, one can use a simple calculator to do this, as indicated by PUSH BUTTON. This convenience is good because arithmetic is hard work and one may make mistakes. However, one should make a few calculations by the formula just to understand and appreciate what the calculator is doing.
    Remember that s is an estimate of the standard deviation of the population and that it is not σ. It is often called “the standard deviation”, maybe because the term ‘estimate of the standard deviation’ is cumbersome. It is, of course, the sample-based standard deviation but that term is also cumbersome. The standard deviation and its estimates always have the same units as those for X. When considering variability, a dimensionless quantity, the coefficient of variation, cv, is frequently encountered. It is simply
    $$cv = \frac{s}{\bar{X}}$$
    Figure 4.1. Population values and sample estimates.
    If one knows cv and the level, X, s can be calculated.
    Figure 4.2. Distribution of means.
    Another term frequently used is called the relative standard deviation, RSD, and it is calculated as
    RSD = cv×100
    The relative standard deviation is thus expressed as a percent. There could be room for confusion when results are reported on a percentage basis, as the percentage of sulfur in a coal sample, for example. Here the value for s could be in units of percent. In such cases, one can make the distinction by using the terms % relative and % absolute.
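    The relationships among s, cv, and RSD can be sketched in Python. This assumes the usual definition of cv as the sample standard deviation divided by the sample mean, since the formula is not reproduced in this excerpt; the function names and the readings are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean (dimensionless)."""
    m = mean(values)
    if m == 0:
        raise ValueError("cv is undefined when the mean is zero")
    return stdev(values) / m


def relative_std_dev(values):
    """cv expressed as a percent, following RSD = cv x 100."""
    return coefficient_of_variation(values) * 100


# Hypothetical repeat measurements of the same quantity.
readings = [9.8, 10.1, 10.0, 10.2, 9.9]
cv = coefficient_of_variation(readings)
rsd = relative_std_dev(readings)

# Knowing cv and the level, s can be recovered.
s_recovered = cv * mean(readings)
```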
  • Sensory Evaluation of Food
    Statistical Methods and Procedures

    • Michael O'Mahony(Author)
    • 2017(Publication Date)
    • CRC Press
      (Publisher)
    X scores that are present.
    Another way of ensuring that the deviations have positive values is to square them; this will tend to give greater weight to extreme values. Such an approach is generally used by statisticians to compute the most used measures of spread: the variance and the standard deviation.

    2.4 Variance and Standard Deviation

    The Variance and Standard Deviation are the most commonly used middle-range values, the most common measures of dispersion.
    Variance and Standard Deviation of Populations
    The variance of a population of numbers is given by a formula similar to that for the mean deviation:
    $$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
    Squaring all the deviations from the mean, the (X − μ) values, ensures that they are all positive. It also makes σ² more sensitive than the mean deviation to large deviations; μ is the population mean.
    Again, the variance of a sample could be calculated in the same way. It will, however, tend to be smaller than the variance of the population. It is computed from the formula
    $$\sigma^2_{\text{sample}} = \frac{\sum (X - \bar{X})^2}{N}$$
    where $\bar{X}$ is now the mean of the sample.
    The standard deviation is merely the square root of the variance. It is symbolized by σ rather than σ². Hence the standard deviation of a population is given by
    $$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
    Again, the standard deviation of a sample smaller than the population is given by
    $$\sigma_{\text{sample}} = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$
    Estimates of Population Variance and Standard Deviation from Samples
    Generally, however, we wish to estimate the variance or standard deviation of the population from the data in the sample. This is done by adjusting the formulas so that the denominator N is replaced by N − 1. We are not going to go into why this is so mathematically; suffice it to say that if we divide by N
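    The effect of the N − 1 adjustment can be checked numerically without any algebra. The sketch below is our own illustration, not the author's: it takes a small hypothetical population, enumerates every possible sample of size 3 drawn with replacement, and averages the two versions of the sample variance using exact fractions.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical population; Fractions keep the arithmetic exact.
population = [Fraction(v) for v in (2, 4, 4, 6, 9)]
population_variance = pvariance(population)

# Every possible ordered sample of size 3, drawn with replacement.
sample_size = 3
samples = list(product(population, repeat=sample_size))

# Average of the "divide by N" variance over all samples.
avg_dividing_by_n = mean(pvariance(s) for s in samples)

# Average of the "divide by N - 1" variance over all samples.
avg_dividing_by_n_minus_1 = mean(variance(s) for s in samples)
```

    Under these assumptions the N − 1 version averages out to exactly the population variance, while the N version averages to (N − 1)/N of it, which is the sense in which it comes out too small.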
  • Statistics for Compensation
    A Practical Guide to Compensation Analysis

    • John H. Davis(Author)
    • 2011(Publication Date)
    • Wiley
      (Publisher)
    Of course this is just a mathematical way to merge two scales, but it does form a nice objective starting point. Other considerations would have to be taken into account, including the wording of the level definitions and management concerns.
    5.5 Coefficient of Variation
    The standard deviation is a measure of the absolute variation of data values about their mean. The coefficient of variation (CV) is a measure of the relative variation of data values about their mean in terms of a percent. It allows a nice comparison of the variability of data sets that vary in magnitude.
    The CV expresses the typical deviation of the data points from the mean as a percent of the mean. It is calculated by dividing the standard deviation by the mean and expressing the ratio as a percent.
    Using data from a sales survey in which BPD participated we get the following in Table 5.9.
    Table 5.9 Coefficient of Variation for VP Sales Data.
                          VP Sales    Sales Associate
    Mean                  162,300     42,560
    Standard deviation    26,200      6,660
    CV                    16.1%       15.6%
    For the VP Sales position, CV = (26,200/162,300)(100) = 16.1%. A similar calculation is done for the Sales Associate data.
    In this example, the standard deviation of the salaries of VP Sales, in absolute dollars, is almost four times that of the salaries of Sales Associates. However, in relative terms, they are both about the same—the standard deviation is approximately 16% of the mean for both jobs. Hence, on a relative basis the variability is about the same.
    Like the standard deviation, the CV is necessarily impacted by outliers since it is based on calculations involving all the data points.
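    The figures in Table 5.9 can be reproduced with a short calculation. The helper name `cv_percent` is hypothetical; the inputs are the means and standard deviations quoted in the table.

```python
def cv_percent(std_dev, mean):
    """Coefficient of variation: standard deviation as a percent of the mean."""
    return std_dev / mean * 100


# Summary statistics quoted in Table 5.9.
vp_sales_cv = cv_percent(26_200, 162_300)
sales_associate_cv = cv_percent(6_660, 42_560)

# Absolute comparison: ratio of the two standard deviations.
sd_ratio = 26_200 / 6_660
```

    The ratio of standard deviations is close to four, yet the two CVs differ by about half a percentage point, which is the contrast between absolute and relative variation drawn in the text.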
    Continuing with the example of administrative assistant and accounting assistant salaries, we have Table 5.10 .
    Table 5.10 Salary Survey of Two Jobs—Measures of Variation Summary with CV.
    Administrative Assistant Accounting Assistant
    No. of data points
  • Mathematics and Statistics for Financial Risk Management
    • Michael B. Miller(Author)
    • 2013(Publication Date)
    • Wiley
      (Publisher)
    , is called standard deviation. In finance we often refer to standard deviation as volatility. This is analogous to referring to the mean as the average. Standard deviation is a mathematically precise term, whereas volatility is a more general concept.
    SAMPLE PROBLEM
    Question: A derivative has a 50/50 chance of being worth either +10 or −10 at expiry. What is the standard deviation of the derivative's value?
    Answer:
    $$\mu = 0.50 \cdot 10 + 0.50 \cdot (-10) = 0$$
    $$\sigma^2 = 0.50 \cdot (10 - 0)^2 + 0.50 \cdot (-10 - 0)^2 = 0.5 \cdot 100 + 0.5 \cdot 100 = 100$$
    $$\sigma = 10$$
    In the previous example, we were calculating the population variance and standard deviation. All of the possible outcomes for the derivative were known.
    To calculate the sample variance of a random variable X based on n observations, $x_1, x_2, \ldots, x_n$, we can use the following formula:
    $$\hat{\sigma}_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \hat{\mu}_x)^2 \tag{3.19}$$
    where $\hat{\mu}_x$ is the sample mean as in Equation 3.2. Given that we have n data points, it might seem odd that we are dividing the sum by (n − 1) and not n. The reason has to do with the fact that $\hat{\mu}_x$ itself is an estimate of the true mean, which also contains a fraction of each $x_i$. We leave the proof for a problem at the end of the chapter, but it turns out that dividing by (n − 1), not n, produces an unbiased estimate of σ². If the mean is known or we are calculating the population variance, then we divide by n. If instead the mean is also being estimated, then we divide by n − 1.
    Equation 3.18 can easily be rearranged as follows (the proof of this equation is also left as an exercise):
    $$\sigma^2 = E[X^2] - E[X]^2 \tag{3.20}$$
    Note that variance can be nonzero only if $E[X^2] \neq E[X]^2$.
    When writing computer programs, this last version of the variance formula is often useful, since it allows us to calculate the mean and the variance in the same loop.
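    A minimal sketch of the single-loop idea, assuming the variance is written as E[X²] − E[X]² and that equally weighted observations stand in for the expectation. The function name is hypothetical, and the result is the population (divide-by-n) variance.

```python
def mean_and_variance_one_pass(values):
    """Accumulate the sum and the sum of squares in a single loop."""
    n = 0
    total = 0.0
    total_of_squares = 0.0
    for x in values:
        n += 1
        total += x
        total_of_squares += x * x
    if n == 0:
        raise ValueError("need at least one observation")
    mean = total / n
    # Population variance as E[X^2] - E[X]^2.
    variance = total_of_squares / n - mean ** 2
    return mean, variance


# The two equally likely outcomes from the sample problem above.
outcomes = [10, -10]
mean, variance = mean_and_variance_one_pass(outcomes)
```

    One caveat worth knowing: when the mean is large relative to the spread, subtracting two nearly equal numbers can lose precision in floating point, so production code often prefers a running-update method instead.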
    In finance it is often convenient to assume that the mean of a random variable is equal to zero. For example, based on theory, we might expect the spread between two equity indexes to have a mean of zero in the long run. In this case, the variance is simply the mean of the squared returns.
  • Uncertainty Analysis of Experimental Data with R
    x is simply the square root of the variance:
    $$s_x = \left( s_x^2 \right)^{1/2} = \left( \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \right)^{1/2}. \tag{3.5}$$
    The standard deviation is a measure of the average spread of the data about the mean. It is to be noted that in some texts, the N − 1 terms in these equations are replaced with N. The approach to follow is to use N − 1 if you are going to use $s_x$ and $s_x^2$ to characterize the sample, which is what we do here.
    For illustration, we will calculate these basic statistics using R’s built-in functions for the following data set:
    > x <- c(2.3,4.2,3.2,4.1,1.1,5.4,3.3,4.4,3.7,2.7) # data
    > mean(x) # sample mean
    [1] 3.44
    > median(x) # sample median
    [1] 3.5
    > var(x) # sample variance
    [1] 1.471556
    > sd(x) # sample standard deviation
    [1] 1.213077

    3.3 Covariance and Correlation

    Covariance and correlation are important when we want to assess whether the variations in two variables are correlated, i.e., whether they move up and down together in some sense. If we have two data sets $\{x_1, x_2, \ldots, x_N\}$ and $\{y_1, y_2, \ldots, y_N\}$, the covariance $s_{xy}$ is defined as
    $$s_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) \tag{3.6}$$
    and the correlation coefficient $\rho_{xy}$ is defined as
    $$\rho_{xy} = \frac{s_{xy}}{s_x s_y}. \tag{3.7}$$
    Here, each data set has the same number of elements and we have not reordered them in any way as this can influence the level of correlation. For example, sorting both x and y from smallest to largest will force them to be correlated. The correlation coefficient is always in the range $-1 \le \rho_{xy} \le 1$. Two data sets are perfectly correlated if $\rho_{xy} = +1$ or $-1$ and completely uncorrelated if $\rho_{xy} = 0$, which means that $s_{xy} = 0$. If $\rho_{xy}$ is “close” to +1 or −1, the data are mostly correlated, and if $\rho_{xy}$ is “close” to 0, the data are mostly uncorrelated. The covariance of a data set with itself is the variance of this data set, i.e., $s_{xx} = s_x^2$. Because a data set is perfectly correlated with itself, it is always the case that $\rho_{xx} = \rho_{yy}$
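    The book works in R; purely as an illustration, the sample covariance and correlation coefficient described in this excerpt can also be sketched in Python. The function names and data sets are hypothetical, and both data sets are assumed to have nonzero variance so that the division is defined.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance with an N - 1 denominator."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length data sets with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    # The covariance of a data set with itself is its variance.
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical data sets.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2 * v + 1 for v in x]        # rises exactly in step with x
y_down = [-v for v in x]             # falls exactly in step with x
y_flat = [2.0, 1.0, 4.0, 1.0, 2.0]   # no linear relationship with x

rho_up = correlation(x, y_up)
rho_down = correlation(x, y_down)
rho_flat = correlation(x, y_flat)
```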
  • Business Statistics For Dummies
    • Alan Anderson(Author)
    • 2023(Publication Date)
    • For Dummies
      (Publisher)
    coefficient of variation (CV) indicates how “spread out” the members of a sample or population are relative to the mean. The coefficient of variation is measured as a percentage, so it’s independent of the units in which the mean and standard deviation are measured. This enables the relative variation of different samples or populations to be compared directly to each other.
    For example, the coefficient of variation can express the risk of an investment portfolio per unit of return. This means you can compare the performance of different portfolios to see which one offers the least amount of risk per unit of return.
    Here’s the formula for finding the coefficient of variation for either samples or populations:
    $$CV = \frac{\text{standard deviation}}{\text{mean}} \times 100\%$$
    Suppose a corporation requires the services of a consulting firm to improve its accounting systems. The corporation has determined that the two best choices are Superior Accounting, Inc., and Data Services Corp. The corporation has done some research about the pricing practices of these two firms. The average price charged per hour, along with the standard deviation, are shown in Table 4-8 .
    TABLE 4-8 Comparative Prices Charged by Superior Accounting and Data Services
                                     Superior Accounting    Data Services
    Mean price (per hour)            $200                   $175
    Standard deviation (per hour)    $80                    $75
    Based on this data, the coefficients of variation for the prices charged by each firm are the following:
    • Superior Accounting: ($80 / $200) × 100 = 40.00 percent
    • Data Services: ($75 / $175) × 100 = 42.86 percent
    These results show that although the prices charged by Superior Accounting have a larger standard deviation than Data Services, the relative variation of Data Services is greater (42.86 percent compared with 40.00 percent). This indicates that the relative uncertainty associated with Data Services’ prices is greater than for Superior Accounting’s prices.

    Comparing the relative risks of two portfolios

    Suppose a portfolio manager is responsible for an insurance company’s equity portfolio and bond portfolio. The manager wants to know which portfolio is riskier in absolute and relative terms. The manager takes a sample of returns from the past ten years and computes the mean and standard deviation. See Table 4-9
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.