Standard Normal Distribution

The standard normal distribution is a probability distribution that has a mean of zero and a standard deviation of one. It is a bell-shaped curve that is symmetric around the mean and is widely used in statistical analysis to model random variables that have a normal distribution.

Written by Perlego with AI-assistance

10 Key excerpts on "Standard Normal Distribution"

  • Business Statistics For Dummies
    • Alan Anderson(Author)
    • 2023(Publication Date)
    • For Dummies
      (Publisher)
    In many business applications, variables are assumed to be normally distributed. For example, returns to stocks are often assumed to be normally distributed by investors, portfolio managers, financial analysts, risk managers, and so on. The assumption of normality is not only convenient, but many standard statistical techniques require it in order to generate valid results. For example, computing a confidence interval for the mean of a population may be based on the normal distribution. Many of the techniques used in regression analysis to check the validity of the results are based on the normal distribution. As a result, even when the assumption of normality is not perfectly accurate, the normal distribution is often used to perform statistical analyses due to its convenience.

    Getting to know the Standard Normal Distribution

    The Standard Normal Distribution is the special case where μ = 0 and σ = 1. For example, suppose that the daily returns to a stock follow the Standard Normal Distribution. The mean return over a single trading day is 0 percent, and the standard deviation is 1 percent; as a result:
    • The probability that tomorrow’s return will be between −1 percent and +1 percent is 0.6827 or 68.27 percent. −1 percent represents one standard deviation below the mean, while +1 percent represents one standard deviation above the mean.
    • The probability that tomorrow’s return will be between −2 percent and +2 percent is 0.9544 or 95.44 percent. −2 percent represents two standard deviations below the mean, while +2 percent represents two standard deviations above the mean.
    • The probability that tomorrow’s return will be between −3 percent and +3 percent is 0.9973 or 99.73 percent. −3 percent represents three standard deviations below the mean, while +3 percent represents three standard deviations above the mean.
    By convention, the letter Z represents a standard normal random variable, whereas the letter X represents any other normal random variable.
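The three probabilities in the list above can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is illustrative and does not come from the book. Direct computation gives about 0.9545 for the two-standard-deviation band, while four-place tables round to 0.9544.

```python
from statistics import NormalDist

# Standard normal random variable Z: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k: float) -> float:
    """Return P(-k <= Z <= k), the area within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"P(-{k} <= Z <= {k}) = {prob_within(k):.4f}")
```

Because the curve is symmetric, the same result could be obtained as `2 * Z.cdf(k) - 1`.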

    Computing standard normal probabilities

    One approach to computing probabilities for the Standard Normal Distribution is to use statistical tables. (For the mathematically inclined, the tables result from applying calculus to the normal distribution.) The standard normal table is designed to show cumulative probabilities; i.e., the probability that a standard normal random variable Z is less than or equal to a specified value, such as P(Z ≤ 2.50). Standard normal tables are divided into two parts; the first shows positive values for Z, and the second shows negative values for Z
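A cumulative probability such as P(Z ≤ 2.50) can also be computed directly instead of being read from a table. The sketch below assumes the well-known identity Φ(z) = ½[1 + erf(z/√2)] and uses only Python's `math` module; the function name `phi` is an illustrative choice.

```python
import math

def phi(z: float) -> float:
    """Cumulative probability P(Z <= z) for a standard normal Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# A positive and a negative z value, mirroring the two parts of a printed table.
print(f"P(Z <= 2.50)  = {phi(2.50):.4f}")
print(f"P(Z <= -2.50) = {phi(-2.50):.4f}")
```

By symmetry the two values sum to 1, which is why some printed tables list only one half.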
  • Understanding Educational Statistics Using Microsoft Excel and SPSS
    • Martin Lee Abbott(Author)
    • 2014(Publication Date)
    • Wiley
      (Publisher)
    N, of whatever size, will always equal 0. Therefore the mean of a perfect, Standard Normal Distribution is equal to 0.
    The Standard Normal Distribution has a standard deviation equal to 1 unit. This is simply an easy way to designate the known areas under the curve. Figure 7.1 shows that there are six standard deviation units that capture almost all the cases under the perfect normal curve area. (This is the source of the rule for the range equaling six times the SD in a raw score distribution.) This is how the standard normal curve is “arranged” mathematically. So, for example, 13.59% of the area of the curve lies between the first (+1) and second (+2) standard deviation on the right side of the mean. Because the curve is symmetrical, there is also 13.59% of the area of the curve between the first (−1) and second (−2) standard deviation on the left side of the curve, and so on.
    FIGURE 7.1
    The normal curve with known properties.
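The known areas described above can be reproduced in a few lines. This is a sketch under the assumption that `statistics.NormalDist` is available (Python 3.8 or later); `area_between` is a hypothetical helper name.

```python
from statistics import NormalDist

Z = NormalDist()  # defaults to mean 0 and standard deviation 1

def area_between(a: float, b: float) -> float:
    """Area under the standard normal curve between a and b (a <= b)."""
    return Z.cdf(b) - Z.cdf(a)

right_band = area_between(1, 2)     # between +1 and +2 standard deviations
left_band = area_between(-2, -1)    # mirror-image band on the left
within_three = area_between(-3, 3)  # the six standard deviation units

print(f"+1 to +2 SD: {right_band:.4%}")
print(f"-2 to -1 SD: {left_band:.4%}")
print(f"-3 to +3 SD: {within_three:.4%}")
```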
    Remember that this is an ideal distribution. As such, we can compare our actual data distributions to it as a way of understanding our own raw data better. Also, we can use it to compare two sets of raw score data since we have a perfect measuring stick that relates to both sets of “imperfect” data.
    There are other features of the standard distribution we should notice.
    • The scores cluster in the middle, and they “thin out” toward either end.
    • It is a balanced or symmetrical distribution, with equal numbers of scores on either side of the middle.
    • The mean, median, and mode all fall on the same point.
    • The curve is “asymptotic” to the x axis. This means that it gets closer and closer to the x axis but never touches because, in theory, there may be a case very far from the other scores, off the chart, so to speak. There has to be room under the curve for these kinds of possibilities.
    • The inflection point of the standard normal curve is at the point of the (negative and positive) first standard deviation unit. This point is where the steep decline of the curve slows down and widens out. (This is a helpful visual cue to an advanced procedure called factor analysis, which uses a scree plot
  • Interpreting Quantitative Data with IBM SPSS Statistics
    Figure 6.1 shows the curve of a normal distribution.
    Figure 6.1 The basic shape of the normal curve
    Normal distributions can be described by the descriptive measures that we have seen so far. They are characterized by the following properties:
    1. They are symmetric and unimodal (i.e. they have a single mode), which means that the two halves of the distribution are mirror images of each other and that their mean, mode and median are identical.
    2. The graph that represents them is a bell-shaped curve.
    3. The distribution can be completely described if we know that it is normal, and if we know its mean and standard deviation. For this reason, normal distributions are denoted by the symbols N (μ, σ). The N tells us we are talking about a normal distribution, the μ is the mean of the distribution and the σ is its standard deviation.
    There is an equation that produces the normal curve. We will not need to use it in this book, but it is interesting to know what it looks like. Here it is:
    $$y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$
    The equation gives the y-value (the height) of N(0, 1), that is, a normal curve with mean equal to 0 and standard deviation equal to 1, as shown in Figure 6.1 .
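To make the height of N(0, 1) concrete, the sketch below evaluates it at a few points. It assumes the usual standard normal density, exp(−x²/2)/√(2π); the function name `standard_normal_height` is illustrative only.

```python
import math

def standard_normal_height(x: float) -> float:
    """Height (y-value) of the N(0, 1) curve at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# The curve peaks at the mean and falls off symmetrically on both sides.
for x in (-2, -1, 0, 1, 2):
    print(f"x = {x:+d}  height = {standard_normal_height(x):.4f}")
```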

    Properties of Normal Distributions

    Normal distributions often occur when a quantitative variable is distributed at random. For instance, if we choose a random sample of, say, 3000 men and we draw the distribution of their heights, we are likely to find the pattern of a normal distribution shown above.
    They can be thought of as a smooth line that runs along the top of a histogram that has a very large number of very narrow columns. Figure 6.2
  • Statistics for the Behavioural Sciences
    σ = 15. The percentages indicate the portion of the total area under the normal distribution between two IQ values (e.g., 13.6 per cent of the total area under the normal distribution is between the IQ values of 70 and 85, inclusive).
    While the distribution is always bell shaped and symmetrical around the mean, different values of the parameters affect the appearance of the distribution. Changes in μ produce shifts of the entire distribution along the x-axis (i.e., the distribution is centred around different mean values), while changes in σ², or equivalently in σ, correspond to changes in the scale of measurement and affect the peak of the distribution and its spread around the mean. With small standard deviations, the peak of the distribution tends to be high and the main body of the bell tends to be narrow, while with large standard deviations, the peak tends to be low and the main body of the bell tends to be wide.
    We said earlier that areas under a continuous distribution, between two values of the random variable X, correspond to probabilities. We also gave the size of the areas of some portions of the normal curve shown in Figure 5.3. Any area under a curve is obtained by calculating the integral, over a range of values of the random variable X, of the function describing the continuous distribution. However, you do not have to worry about the process of integration to calculate probabilities for the most important continuous distributions, because these probabilities are provided in tables. In the case of the normal distribution, a table of probabilities is available for the special case where μ = 0 and σ² = 1 (see Table 5.1 and the more comprehensive Z table in the Appendix). A normal distribution with these parameters is called the Standard Normal Distribution. The next section will describe this distribution.
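The IQ area quoted earlier (about 13.6 per cent between 70 and 85) can be recovered by converting the two limits to standard units and using the standard normal alone. This is a sketch that assumes an IQ mean of 100, which the excerpt does not state explicitly; only σ = 15 appears in the text. The helper `to_z` is a hypothetical name.

```python
from statistics import NormalDist

IQ_MEAN = 100.0  # assumption: not stated in the excerpt
IQ_SD = 15.0     # sigma = 15, as given in the text

def to_z(x: float, mean: float, sd: float) -> float:
    """Convert a raw value to standard units."""
    return (x - mean) / sd

Z = NormalDist()  # mu = 0, sigma = 1
z_low = to_z(70, IQ_MEAN, IQ_SD)
z_high = to_z(85, IQ_MEAN, IQ_SD)

# The area between the two limits is a difference of cumulative probabilities.
area = Z.cdf(z_high) - Z.cdf(z_low)
print(f"z limits: {z_low} to {z_high}, area = {area:.4f}")
```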

    The Standard Normal Distribution

    In the last section of Chapter 2 we presented some transformations that can be applied to a set of data. In particular we said that when the mean is subtracted from each score and this difference is divided by the standard deviation of the distribution of the scores, we obtain a new standardised score named z
  • Painless Statistics
    Chapter 5 The Normal Distribution
    Statistics is an applied science, so once you understand the basics of statistics, you’ll want to apply what you’ve learned. You’ll look around in the world and see data—times, prices, populations, ages, revenues—and you’ll use your knowledge of statistics to make sense of it.
    Data in the real world comes in many shapes and many distributions; you were introduced to many different distributions in Chapter 4 . However, when it comes to data in the real world, one of the most common, and useful, shapes is the normal distribution.
    Normally Distributed Data
    The Shape of the Normal Distribution
    The normal distribution is a continuous data distribution that looks like this:
    Figure 5–1. The Normal Distribution
    As you can see from the above graph, the normal distribution is symmetric and unimodal. It’s symmetric because you can draw a line down the middle and see the same shape of data on either side. It’s unimodal because the graph of the data has only one peak, which means there is only one mode.
    The normal distribution is also known as the Gaussian distribution, after the mathematician Carl Friedrich Gauss.
    Many kinds of real-world data, from characteristics like height to performance indicators like test scores, are distributed in a way that is approximately normal. Here’s what a histogram of data that is approximately normal might look like:
    Figure 5–2. Histogram of Data that is Approximately Normal
    You can see that this histogram shares a similar shape to the graph of the normal distribution. It’s possible to find a continuous normal curve that closely fits this data, which means you can approximate this discrete data with a continuous normal distribution.
    Figure 5–3. Histogram with Approximating Normal Curve
    Measures of Central Tendency and Standard Deviation for the Normal Distribution
    The ability to approximate data using the normal distribution makes it a very powerful and useful statistical tool. When you encounter real-world data that is approximately normal, you can model it with a continuous normal distribution and then apply everything you know about the normal distribution to your data. Since the normal distribution is symmetric and unimodal, it can be understood with just a few numbers. The mean, median, and mode are all the same; that is, a single center of the data splits the data into two parts that are mirror reflections of each other. This is one reason why normal distributions are easy to work with.
  • Reasoning About Luck
    Fig. 3.3. Binomial and normal distributions, as described in the text.
    The normal distribution is so well studied and occurs so frequently that its properties are to be found in even rather small volumes of collected mathematical tables. One finds, for example, that a region of 2.6 standard deviations on either side of the mean contains 99.07% of the area under the normal curve. Since the area is the natural generalization of the sum of columns in a histogram, the reason for the ‘universality’ – to use a word quite fashionable among physicists today – of the connection between the standard deviation and the 99% width has been uncovered.
    The point is that after a moderate number of trials of a two-outcome random experiment, or indeed any reasonable random experiment, an approximation to the normal distribution emerges with the same mean and standard deviation as the N-trial distribution. The rule of thumb ‘2.6 standard deviations on either side of the mean’ thus applies generally as an estimate of the 99% probable range after N trials, when N is large enough for the distribution to be bell-shaped. A table of the normal distribution also reveals that 2 standard deviations on either side of the mean contains 95.45% of the area. Thus, 2 standard deviations is a good estimate for the 95% probable width. Similarly, one sees from Table 3.5 that 1 standard deviation on either side of the mean contains approximately 68% of the probability. [This width, 1 standard deviation on either side of the mean, is indicated by the horizontal line in the upper region of Fig. 3.3.]
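The rule-of-thumb widths can be checked in both directions: from a number of standard deviations to the area it covers, and from a target area back to the multiplier. The sketch below assumes `statistics.NormalDist` and its `inv_cdf` method; `coverage` and `multiplier` are illustrative names.

```python
from statistics import NormalDist

Z = NormalDist()

def coverage(k: float) -> float:
    """Area within k standard deviations on either side of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

def multiplier(p: float) -> float:
    """Number of standard deviations on either side that contains area p."""
    return Z.inv_cdf(0.5 + p / 2.0)

print(f"coverage(2.6) = {coverage(2.6):.4%}")
print(f"coverage(2.0) = {coverage(2.0):.4%}")
print(f"multiplier for 99%: {multiplier(0.99):.3f}")
print(f"multiplier for 95%: {multiplier(0.95):.3f}")
```

The inverse calculation suggests that 2.6 and 2 are slightly generous round numbers for the 99% and 95% ranges.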
    It is important to re-emphasize that the normal distribution only applies after a reasonably large number of independent trials. For a small number of trials, the question of how likely or unlikely is a given collection of outcomes can only be answered by examining the distribution particular to the case in question.
  • Introduction to Statistics for Forensic Scientists
    • David Lucy(Author)
    • 2013(Publication Date)
    • Wiley
      (Publisher)
    This was done by summing the probabilities of finding 0 or 1 or 2 or… 5 males from the same sample. Very much the same thing can be done with the normal distribution, only the summation has to be calculated using a mathematical process called integration. As integration is a difficult process, statisticians have calculated tables for a standardized normal distribution which can be rescaled to fit any particular normal distribution. The result of this standardization is called the Standard Normal Distribution, and it has a mean of 0 and standard deviation of 1. Figure 4.3 shows the Standard Normal Distribution, and Appendix C is such a table of summed areas at each point for the Standard Normal Distribution. The shaded area under the standard normal curve extends from −∞ to two standard deviations. In fact it is the same diagram as in the top right corner of Appendix C. If we wish to find the area under this portion of the curve we simply look down the rows of Appendix C until we reach the row labelled z = 2.0. For the third decimal place the appropriate column is selected. In the case of 2 standard deviations it is the first column, which has the value 0.9772. As the Standard Normal Distribution is a probability distribution the total area under the curve must equal 1, so the value 0.9772 means that 97.72% of the total area, hence probability for the distribution, lies between −∞ and 2 standard deviations.
    Figure 4.3 Standard Normal Distribution with mean 0 and standard deviation equal to 1. The shaded area covers the range −∞ to 2 standard deviations
    Figure 4.4 is the same distribution shown in Figure 4.3 but rescaled to the normal distribution underlying the Δ9-THC content sample from 1986. In this case the mean is 8.59% and the standard deviation 1.09. The shaded area upper limit is at the mean plus twice the standard deviation which is 8.59 + (2 × 1.09) = 8.59 + 2.18 = 10.77
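The rescaling idea can be illustrated numerically: the area below "mean plus two standard deviations" is the same for the standard normal and for any rescaled normal distribution. This is a minimal sketch using the mean and standard deviation quoted in the excerpt and assuming `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

standard = NormalDist(mu=0.0, sigma=1.0)
thc = NormalDist(mu=8.59, sigma=1.09)  # values quoted in the excerpt

# Upper limit of the shaded area: the mean plus twice the standard deviation.
upper_limit = thc.mean + 2 * thc.stdev

area_standard = standard.cdf(2.0)     # area from minus infinity to z = 2.0
area_rescaled = thc.cdf(upper_limit)  # same area on the rescaled curve

print(f"upper limit = {upper_limit:.2f}")
print(f"standard: {area_standard:.4f}, rescaled: {area_rescaled:.4f}")
```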
  • Understanding Statistics
    • Bruce J. Chalmer(Author)
    • 2020(Publication Date)
    • CRC Press
      (Publisher)
    Chapter 3 we noted that the mean and standard deviation completely specify a normal distribution. That is, once you know the mean and standard deviation of a distribution known to be normal in shape, you can say exactly what proportion of scores in the distribution are in any given range. Let’s consider how this is done.
    First, it is handy to consider some general characteristics. (In fact, you will find it convenient to memorize these characteristics of a normal distribution, since you will be using them very frequently.) Refer to Figure 4.1 .
    1. A normal distribution is symmetric; therefore, it is centered about its mean (and, of course, its mean and median are equal).
    2. About 68%—a little over two-thirds—of the scores are within 1 standard deviation of the mean.
    3. About 95% of the scores are within 2 standard deviations of the mean.
    4. Nearly all the scores in a normal distribution are within 3 standard deviations of the mean.
    Item 3 is especially handy: the mean ±2 standard deviations includes about 95% of the scores in a normal distribution. For example, if a normal distribution has a mean of 37 and a standard deviation of 4, we can say that 95% of the scores are between 29 and 45 (see Figure 4.2).
    Figure 4.1 Areas in a normal distribution.
    Figure 4.2 Normal distribution with mean = 37, standard deviation = 4.

    Drawing a picture

    Now, let’s get more specific. In our normal distribution with mean 37 and standard deviation 4, what proportion of scores are between 37 and 39? Or between 38 and 42.86? How do we figure that out? There are two ways to do it. One way is to let a computer figure it out for you; the other is to use a table of the Standard Normal Distribution. Since it is vital to understand how the normal distribution works even if a computer does carry out the calculation, we will cover the second method.
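For comparison with the table method, here is one way the computer route might look. It is only a sketch, assuming Python's `statistics.NormalDist`; `proportion_between` is a hypothetical helper, not something from the book.

```python
from statistics import NormalDist

# Normal distribution from the example: mean 37, standard deviation 4.
X = NormalDist(mu=37, sigma=4)

def proportion_between(low: float, high: float) -> float:
    """Proportion of scores between low and high (area under the curve)."""
    return X.cdf(high) - X.cdf(low)

print(f"between 29 and 45:    {proportion_between(29, 45):.4f}")
print(f"between 37 and 39:    {proportion_between(37, 39):.4f}")
print(f"between 38 and 42.86: {proportion_between(38, 42.86):.4f}")
```

The same numbers come from a standard normal table after converting each limit to standard units, for example (39 − 37)/4 = 0.5.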
    There are three rules for using a table of the Standard Normal Distribution: (1) draw a picture, (2) draw a picture, and (3) draw a picture. What picture should you draw? A histogram of a normal distribution, of course. As we have already seen, the proportion of scores in any particular range is represented by the area in the histogram above that range. So finding proportions in the distribution is the same as finding areas in the histogram. The first thing to do when you want to find a proportion in some region of a normal distribution is draw a picture of the distribution and shade in the region in which you are interested
  • Researching Education
    Perspectives and Techniques

    • Kanka Mallick, Gajendra Verma(Authors)
    • 2005(Publication Date)
    • Routledge
      (Publisher)
    It can be seen that the tops of the columns lie approximately on a curve and that the area under the curve is equivalent to that of the columns. However, to obtain the probability of 8 or more heads by using the curve we need to obtain the area under the tail of the curve from 7.5 to 10.5. Since tables of areas under the normal curve are obtainable in books, they are often used to obtain probabilities in large distributions.
    Figure 8.2: Normal distribution and the 10-coin test
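The 10-coin calculation can be sketched as follows: the exact probability of 8 or more heads is compared with the area under the approximating normal curve from 7.5 to 10.5. The sketch assumes fair coins and uses the binomial mean and standard deviation (Np and √(Np(1 − p))) for the curve; variable names are illustrative.

```python
import math
from statistics import NormalDist

N_COINS = 10
P_HEAD = 0.5  # assumption: fair coins

# Exact probability of 8, 9, or 10 heads, summed column by column.
exact = sum(math.comb(N_COINS, k) for k in range(8, N_COINS + 1)) / 2 ** N_COINS

# Approximating normal curve with the same mean and standard deviation.
mean = N_COINS * P_HEAD
sd = math.sqrt(N_COINS * P_HEAD * (1 - P_HEAD))
curve = NormalDist(mean, sd)

# Area under the tail of the curve from 7.5 to 10.5.
approx = curve.cdf(10.5) - curve.cdf(7.5)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```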
    Although laborious, it would be possible to work out the probabilities if 100 coins were tossed together. If the result were represented in a column graph, the tops of the columns would form an almost perfect curve known as the ‘normal curve’.
    Many distributions of measurements of human beings such as heights, weights, sizes of shoes, gloves or hats, the distances individuals of the same age and sex can throw a ball, and so on, fit very well to the normal curve. So do many educational and psychological test scores, although in the case of standardized tests, this may be partly because educationists and psychologists designed them to do so. Before we can make use of the normal curve in order to say how exceptional a pupil’s score is as compared with other children of the same age, we need to turn his or her score in a test into a standard score. To do this we must first obtain the mean and standard deviation for the distribution of scores.
    Taking a simplified example initially: in six tests, a boy obtains marks of 36, 49, 52, 60, 65, 74. The mean (or average) score is defined as the total of the scores for all the tests divided by the number of tests, i.e. the boy’s mean score is (36 + 49 + 52 + 60 + 65 + 74)/6 = 336/6 = 56.
    Deviations of the scores above and below the mean are as follows:
    Score                52    49    65    36    60    74
    Deviation from 56    −4    −7    +9   −20    +4   +18    Total = 0
    Having obtained the variance by squaring the deviations and averaging the squares (886/6, which gave us 147.67 in this example), the ‘standard deviation’ is arrived at by finding the square root of 147.67, which is approximately 12.2. It is now possible to arrive at the ‘standard score’ in each subject by using the formula
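The arithmetic for the six marks can be retraced step by step. The excerpt breaks off before giving its formula, so the sketch below assumes the usual definition of a standard score, (score − mean) / standard deviation; that definition and all variable names are assumptions rather than the authors' own wording.

```python
import math

marks = [36, 49, 52, 60, 65, 74]

mean = sum(marks) / len(marks)           # total of the scores / number of tests
deviations = [m - mean for m in marks]   # deviations above and below the mean
variance = sum(d * d for d in deviations) / len(marks)  # average squared deviation
sd = math.sqrt(variance)                 # standard deviation

# Assumed formula: standard score = (score - mean) / standard deviation.
standard_scores = [d / sd for d in deviations]

print(f"mean = {mean}, variance = {variance:.2f}, sd = {sd:.1f}")
for mark, z in zip(marks, standard_scores):
    print(f"mark {mark}: standard score {z:+.2f}")
```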
  • Measurement, Data Analysis, and Sensor Fundamentals for Engineering and Science
    • Patrick F. Dunn(Author)
    • 2019(Publication Date)
    • CRC Press
      (Publisher)
    2 , are examined first. These distributions can be used to determine the probabilities of events and various statistical quantities. Statistical inference is utilized to estimate the characteristics of a population from finite information. These tools help to interpret correctly the results of experiments.

    12.2    Normal Distribution

    Now, consider the normal distribution in more detail. In the limit when N becomes very large and Pr is finite, assuming that the variance remains constant, the binomial probability density function becomes the normal probability density function.
    Consider a random error to be comprised of a large number of N elementary errors of equal and infinitesimally small magnitude, e, with an equally likely chance of being either positive or negative, where P = 1/2. The normal distribution allows us to find the probability of occurrence of any error in the range from –Ne to +Ne , where the probability density function is
    $$p(x) = \frac{1}{\sqrt{2\pi N P (1 - P)}} \exp\left[ -\frac{(x - NP)^2}{2 N P (1 - P)} \right] \tag{12.1}$$
    The mean and variance are the same as the binomial distribution, NP and NPQ , respectively, where Q = 1 – P . The higher-order central moments of the skewness and kurtosis are 0 and 3, respectively.
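How closely the normal density tracks the binomial for large N can be seen numerically. The sketch below assumes Equation 12.1 has the standard form with mean NP and variance NP(1 − P); N = 100 is an arbitrary illustrative choice, and the function names are hypothetical.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Exact binomial probability of x successes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def normal_approx(x: float, n: int, p: float) -> float:
    """Normal density with mean n*p and variance n*p*(1 - p)."""
    var = n * p * (1 - p)
    return math.exp(-((x - n * p) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

N, P = 100, 0.5  # illustrative values; P = 1/2 as in the text
for x in (40, 45, 50, 55, 60):
    print(f"x = {x}: binomial {binomial_pmf(x, N, P):.5f}, "
          f"normal {normal_approx(x, N, P):.5f}")
```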
    Utilizing expressions for the mean, x′, and the variance, σ², in Equation 12.1, the probability density function assumes the more familiar form
    $$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2\sigma^2} (x - x')^2 \right] \tag{12.2}$$
    The normal probability density function is shown in the left plot in Figure 12.1, in which p(x) is plotted versus the nondimensional variable z = (x − x′)/σ. Its maximum value equals 0.3989 at z = 0.
    The normal probability density function is very significant. Many probability density functions tend to the normal probability density function when the sample size is large. This is supported by the central limit and related theorems. The central limit theorem can be described loosely [5 ]. Given a population of values with finite variance, if independent samples are taken from this population, all of size N , then the new population formed by the averages of these samples will tend to be governed by the normal probability density function, regardless of what distribution governed the original population. Alternatively, the central limit theorem states that whatever the distribution of the independent variables, subject to certain conditions, the probability density function of their sum approaches the normal probability density function (with a mean equal to the sum of their means and a variance equal to the sum of their variances) as N approaches infinity. The conditions are that (1) the variables are expressed in a standardized, nondimensional format, (2) no single variate dominates, and (3) the sum of the variances tends to infinity as N
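The loose statement of the central limit theorem can be illustrated with a small simulation. This is only a sketch: the uniform population, the sample size of 30, the 5000 repetitions, and the fixed seed are all arbitrary assumptions chosen for illustration, and the function name `sample_means` is hypothetical.

```python
import math
import random
import statistics

def sample_means(draw, sample_size: int, n_samples: int, rng: random.Random) -> list:
    """Average of `sample_size` independent draws, repeated `n_samples` times."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
# Population: uniform on [0, 1), which is clearly not normal.
means = sample_means(lambda r: r.random(), sample_size=30, n_samples=5000, rng=rng)

grand_mean = statistics.fmean(means)
spread = statistics.pstdev(means)
expected_spread = math.sqrt((1 / 12) / 30)  # population sd / sqrt(sample size)

# For a normal distribution about 68% of values lie within one sd of the mean.
within_one = sum(abs(m - grand_mean) <= spread for m in means) / len(means)

print(f"mean of averages = {grand_mean:.4f}")
print(f"spread = {spread:.4f} (theory {expected_spread:.4f})")
print(f"fraction within one sd = {within_one:.3f}")
```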