Mathematics

Normal Distribution

The normal distribution is a probability distribution that is symmetric and bell-shaped. It is characterized by its mean and standard deviation, and many natural phenomena follow it approximately. About 68% of the data falls within one standard deviation of the mean, and the probability of extreme values decreases rapidly as they move away from the mean.

Written by Perlego with AI-assistance

9 Key excerpts on "Normal Distribution"

eBook - ePub
Painless Statistics
- Patrick Honner(Author)
- 2022(Publication Date)
- Barrons Educational Services
  (Publisher)
Chapter 5 The Normal Distribution
Statistics is an applied science, so once you understand the basics of statistics, you’ll want to apply what you’ve learned. You’ll look around in the world and see data—times, prices, populations, ages, revenues—and you’ll use your knowledge of statistics to make sense of it.

Data in the real world comes in many shapes and many distributions; you were introduced to many different distributions in Chapter 4 . However, when it comes to data in the real world, one of the most common, and useful, shapes is the Normal Distribution.
Normally Distributed Data
The Shape of the Normal Distribution
The Normal Distribution is a continuous data distribution that looks like this:

Figure 5–1. The Normal Distribution

As you can see from the above graph, the Normal Distribution is symmetric and unimodal. It’s symmetric because you can draw a line down the middle and see the same shape of data on either side. It’s unimodal because the graph of the data has only one peak, which means there is only one mode.

The Normal Distribution is also known as the Gaussian distribution, after the mathematician Carl Friedrich Gauss.

Many kinds of real-world data, from characteristics like height to performance indicators like test scores, are distributed in a way that is approximately normal. Here’s what a histogram of data that is approximately normal might look like:

Figure 5–2. Histogram of Data that is Approximately Normal

You can see that this histogram shares a similar shape to the graph of the Normal Distribution. It’s possible to find a continuous normal curve that closely fits this data, which means you can approximate this discrete data with a continuous Normal Distribution.

Figure 5–3. Histogram with Approximating Normal Curve
Measures of Central Tendency and Standard Deviation for the Normal Distribution
The ability to approximate data using the Normal Distribution makes it a very powerful and useful statistical tool. When you encounter real-world data that is approximately normal, you can model it with a continuous Normal Distribution and then apply everything you know about the Normal Distribution to your data. Since the Normal Distribution is symmetric and unimodal, it can be understood with just a few numbers. The mean, median, and mode are all the same; that is, a single center of the data splits the data into two parts that are mirror reflections of each other. This is one reason why Normal Distributions are easy to work with.
eBook - ePub
Probability, Statistics and Other Frightening Stuff
- Alan Jones(Author)
- 2018(Publication Date)
- Routledge
  (Publisher)
We will see that this is the case with many of the ‘named’ distributions.) In order to differentiate one Normal Distribution from another we need to specify two parameters: a Location Parameter that defines its position along the x-axis and a Scale Parameter that indicates its effective width or dispersion. We express the location in terms of the mean value (µ) and its scale or width by the Standard Deviation (σ).

The Normal Distribution is characterised chiefly by its symmetrical ‘bell-shaped’ Probability Density Function around its mean. Figure 4.5 illustrates two different Normal Distributions. The one on the left has a low Mean value with a small (tight) dispersion around it. The one on the right looks as if someone has pulled it and sat on it, giving it a large (wide) dispersion around a larger Mean.

Figure 4.5
Two Normal Distributions

In this example we should note that the Probability Density of the left-hand distribution’s mode is twice that of the right-hand one, whereas the ‘effective’ width of the one on the right (two to eight) is twice that of the one on the left (one to four). This effective width is equal to six times the Standard Deviation of each distribution. (Note that in an absolute sense both distributions tend towards infinity in either direction.) This leads us to one of the most important properties of a Normal Distribution – the integrity of its relative shape . . .
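The relationship between width and peak height described above can be checked numerically. The following is a minimal sketch, assuming the two curves have standard deviations of 0.5 and 1 (one sixth of the stated effective widths) and means at the midpoints of those ranges, 2.5 and 5; the means and the function name `normal_pdf` are illustrative assumptions, not values given in the excerpt.

```python
import math

def normal_pdf(x, mean, sd):
    """Height of the normal probability density function at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Effective widths from the text: 1 to 4 and 2 to 8, each six standard deviations.
# The means are assumed to sit at the midpoints of those ranges.
left_mean, left_sd = 2.5, (4 - 1) / 6     # sd = 0.5
right_mean, right_sd = 5.0, (8 - 2) / 6   # sd = 1.0

# For a normal curve the mode is at the mean, so evaluate the density there.
left_peak = normal_pdf(left_mean, left_mean, left_sd)
right_peak = normal_pdf(right_mean, right_mean, right_sd)
peak_ratio = left_peak / right_peak

print(left_peak, right_peak, peak_ratio)
```

Halving the standard deviation doubles the peak density, because the total area under each curve must stay equal to one.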

4.2.2 Key properties of a Normal Distribution
a) Scalable and slidable

As well as being a symmetrical distribution, which implies that its Arithmetic Mean, Mode and Median are all equal, the most important property of a Normal Distribution is that the dispersion of its values around the Mean differs only in direct proportion to its Standard Deviation. For any Normal Distribution, regardless of the values of its Mean or Standard Deviation, we can say that the following will always be true:

There is a 68.27% Confidence Interval formed by a range of one Standard Deviation either side of the Mean (i.e. just over two-thirds of our distribution is packed into the space around the Mean, and just under one third lies outside this range with one sixth on either side being symmetrical
eBook - ePub
Interpreting Quantitative Data with IBM SPSS Statistics
- Rachad Antonius(Author)
- 2012(Publication Date)
- SAGE Publications Ltd
  (Publisher)
Figure 6.1 shows the curve of a Normal Distribution.

Figure 6.1 The basic shape of the normal curve

Normal Distributions can be described by the descriptive measures that we have seen so far. They are characterized by the following properties:
1. They are symmetric and unimodal (i.e. they have a single mode), which means that the two halves of the distribution are mirror images of each other and that their mean, mode and median are identical.
2. The graph that represents them is a bell-shaped curve.
3. The distribution can be completely described if we know that it is normal, and if we know its mean and standard deviation. For this reason, Normal Distributions are denoted by the symbols N (μ, σ). The N tells us we are talking about a Normal Distribution, the μ is the mean of the distribution and the σ is its standard deviation.
There is an equation that produces the normal curve. We will not need to use it in this book, but it is interesting to know what it looks like. Here it is:

$$y = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}$$

The equation gives the y-value (the height) of N(0, 1), that is, a normal curve with mean equal to 0 and standard deviation equal to 1, as shown in Figure 6.1.

Properties of Normal Distributions

Normal Distributions often occur when a quantitative variable is distributed at random. For instance, if we choose a random sample of, say, 3000 men and we draw the distribution of their heights, we are likely to find the pattern of a Normal Distribution shown above.

They can be thought of as a smooth line that runs along the top of a histogram that has a very large number of very narrow columns. Figure 6.2
eBook - ePub
Business Statistics For Dummies
- Alan Anderson(Author)
- 2023(Publication Date)
- For Dummies
  (Publisher)
all values between negative infinity and positive infinity.

In the following sections, I show you how you can express the Normal Distribution graphically, I introduce you to the standard Normal Distribution, and I walk you through calculating probabilities for the Normal Distribution.

Graphing the Normal Distribution

The Normal Distribution can be graphed with a special type of curve, which is usually described as a bell-shaped curve. Normal probabilities can be determined by computing areas under this curve.

The bell-shaped curve has several key features. It’s defined over the entire range of values between negative and positive infinity; it’s symmetrical about the mean (for example, the area below the mean is a mirror image of the area above the mean); and most of the area under the Normal Distribution is close to the mean. The area declines rapidly for values that are several standard deviations away from the mean. As an example, the distribution of heights from the previous example is illustrated with a bell-shaped curve in Figure 9-1 .

The mean of 69 inches is at the center of the distribution; the area to the left of the mean is a mirror image of the area to the right of the mean. Most of the area under the curve is close to the mean; the area falls off rapidly for large and small values of X. (The extreme right and left ends of the curve are known as the tails of the distribution.) Figure 9-2 shows that the probability of a randomly chosen man’s height being between 67 inches and 71 inches is 68.27 percent.

The shaded region under the curve represents heights between 67 and 71 inches. This covers 68.27 percent of the area under the curve; therefore, the probability that a randomly chosen man’s height is between 67 inches and 71 inches is 0.6827 or 68.27 percent.
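The 68.27 percent figure can be reproduced with the error function from Python's standard library. This is a sketch under one assumption: the excerpt gives the mean of 69 inches but not the standard deviation, so a standard deviation of 2 inches is inferred from the fact that 67 to 71 inches is described as covering 68.27 percent of the area. The helper names `normal_cdf` and `prob_between` are illustrative.

```python
import math

def normal_cdf(x, mean, sd):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def prob_between(low, high, mean, sd):
    """Area under the normal curve between low and high."""
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)

# Mean from the text; the standard deviation of 2 inches is an inferred assumption.
p = prob_between(67, 71, mean=69, sd=2)
print(round(p, 4))  # 0.6827
```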
eBook - ePub
Reasoning About Luck
- Vinay Ambegaokar(Author)
- 2017(Publication Date)
- Dover Publications
  (Publisher)
The curve is bell-shaped, the area under it is unity, the peak is at x = μ, and the standard deviation – as generalized in a natural way to apply to a continuous curve – is σ. There is a simple mathematical formula for the curve, but it is premature to write it out now. Instead, in Fig. 3.3, this ‘function of x’, to use the mathematical term, is plotted for μ = 40 and σ² = 20. The histogram on the same plot gives the probabilities for various numbers of heads on 80 tosses of a fair coin, i.e., a binomial distribution with, according to what we have just learned, a mean of 80 × (1/2) = 40 and a variance of 80 × (1/2) × [1 – (1/2)] = 20. You will notice that the curve at integer values of x is very closely equal to the height of the corresponding column of the histogram. Although there are tiny differences, too small for the eye to detect, in the far wings, the message of the figure is that the binomial distribution for a large number of trials is extremely well approximated by a normal curve with the same mean and standard deviation. This is the root cause of the similar shape of the last three distributions plotted in Fig. 3.2: they are each, to all intents and purposes, Normal Distributions differing only in their means and standard deviations.

Fig. 3.3. Binomial and Normal Distributions, as described in the text.

The Normal Distribution is so well studied and occurs so frequently that its properties are to be found in even rather small volumes of collected mathematical tables. One finds, for example, that a region of 2.6 standard deviations on either side of the mean contains 99.07% of the area under the normal curve
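The comparison that Fig. 3.3 makes by eye can also be made numerically. The sketch below, which assumes nothing beyond the 80 fair coin tosses described in the excerpt, evaluates the binomial probabilities and the normal curve with mean 40 and variance 20 at the same integer points, and also checks the tabulated area within 2.6 standard deviations. The function and variable names are illustrative.

```python
import math

n, p = 80, 0.5
mean = n * p                   # 80 * (1/2) = 40
variance = n * p * (1 - p)     # 80 * (1/2) * (1/2) = 20

def binomial_pmf(k):
    """Probability of exactly k heads in n tosses of a fair coin."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x):
    """Normal curve with the same mean and variance as the binomial."""
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Compare column heights with the curve at a few integer values.
for k in (30, 35, 40, 45, 50):
    print(k, binomial_pmf(k), normal_pdf(k))

# Largest difference between histogram and curve over all possible outcomes.
max_gap = max(abs(binomial_pmf(k) - normal_pdf(k)) for k in range(n + 1))

# Area under the normal curve within 2.6 standard deviations of the mean.
area_within_2_6_sd = math.erf(2.6 / math.sqrt(2))
print(max_gap, round(area_within_2_6_sd, 4))
```

The largest gap occurs near the peak and is well under one thousandth, which is why the two plots are visually indistinguishable.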
eBook - ePub
Introduction to Statistics for Forensic Scientists
- David Lucy(Author)
- 2013(Publication Date)
- Wiley
  (Publisher)
4 The Normal Distribution
In Section 3.2 we saw how the binomial distribution could be used to calculate probabilities for specific outcomes for runs of events based upon either a known probability, or an observed probability, for a single event. We also saw how an empirical probability distribution can be treated in exactly the same way as a modelled distribution. Both these distributions were for discrete data types, or for continuous types made into discrete data. In this section we deal with the Normal Distribution, which is a probability distribution applied to continuous data.
4.1 The Normal Distribution
The Normal Distribution is possibly the most commonly used continuous distribution in statistical science. This is because it is a theoretically appealing model to explain many forms of natural continuous variation. Many of the discrete distributions may be approximated by the Normal Distribution for large samples. Most continuous variables, particularly from biological sciences, are distributed normally, or can be transformed to a Normal Distribution. Imagine a continuous random variable such as the length of the femur in adult humans. The mean length of this bone is about 400 mm, some are 450 mm and some are 350 mm, but there are not many in either of these categories. If the distribution is plotted then we expect to see a shape with its maximum height at about 400 mm tailing off to either side. These shapes have been plotted for both the adult human femur and adult human tibia in Figure 4.1. The tibia in any individual is usually shorter than the femur, however, Figure 4.1 tells us that some people have tibias which are longer than other people’s femurs. Notice how the mean of tibia measurements is shorter than the mean of the femur measurements
eBook - ePub
Measurement, Data Analysis, and Sensor Fundamentals for Engineering and Science
- Patrick F. Dunn(Author)
- 2019(Publication Date)
- CRC Press
  (Publisher)
2 , are examined first. These distributions can be used to determine the probabilities of events and various statistical quantities. Statistical inference is utilized to estimate the characteristics of a population from finite information. These tools help to interpret correctly the results of experiments.

12.2 Normal Distribution

Now, consider the Normal Distribution in more detail. In the limit when N becomes very large and Pr is finite, assuming that the variance remains constant, the binomial probability density function becomes the normal probability density function.

Consider a random error to be comprised of a large number of N elementary errors of equal and infinitesimally small magnitude, e, with an equally likely chance of being either positive or negative, where P = 1/2. The Normal Distribution allows us to find the probability of occurrence of any error in the range from –Ne to +Ne , where the probability density function is

$$p(x) = \frac{1}{\sqrt{2\pi N P (1 - P)}} \exp\left[-\frac{(x - NP)^{2}}{2 N P (1 - P)}\right]. \tag{12.1}$$

The mean and variance are the same as the binomial distribution, NP and NPQ , respectively, where Q = 1 – P . The higher-order central moments of the skewness and kurtosis are 0 and 3, respectively.

Utilizing expressions for the mean, x', and the variance, σ², in Equation 12.1, the probability density function assumes the more familiar form

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^{2}}(x - x')^{2}\right]. \tag{12.2}$$

The normal probability density function is shown in the left plot in Figure 12.1, in which p(x) is plotted versus the nondimensional variable z = (x – x')/σ. Its maximum value equals 0.3989 at z = 0.
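The quoted maximum of 0.3989 can be recovered by evaluating Equation 12.2 directly. The sketch below assumes the plotted curve is the standardized case with mean 0 and standard deviation 1, which is what plotting against z implies; the function name `normal_pdf` is illustrative.

```python
import math

def normal_pdf(x, mean, sigma):
    """Equation 12.2: normal probability density with the given mean and sigma."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

# Standardized case (assumed): mean 0, sigma 1, so x and z coincide.
peak = normal_pdf(0.0, mean=0.0, sigma=1.0)
print(round(peak, 4))  # 0.3989
```

The peak equals 1/√(2π) because the exponential term is 1 at z = 0.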

The normal probability density function is very significant. Many probability density functions tend to the normal probability density function when the sample size is large. This is supported by the central limit and related theorems. The central limit theorem can be described loosely [5 ]. Given a population of values with finite variance, if independent samples are taken from this population, all of size N , then the new population formed by the averages of these samples will tend to be governed by the normal probability density function, regardless of what distribution governed the original population. Alternatively, the central limit theorem states that whatever the distribution of the independent variables, subject to certain conditions, the probability density function of their sum approaches the normal probability density function (with a mean equal to the sum of their means and a variance equal to the sum of their variances) as N approaches infinity. The conditions are that (1) the variables are expressed in a standardized, nondimensional format, (2) no single variate dominates, and (3) the sum of the variances tends to infinity as N
eBook - ePub
Understanding Statistics
- Bruce J. Chalmer(Author)
- 2020(Publication Date)
- CRC Press
  (Publisher)
Chapter 3 we noted that the mean and standard deviation completely specify a Normal Distribution. That is, once you know the mean and standard deviation of a distribution known to be normal in shape, you can say exactly what proportion of scores in the distribution are in any given range. Let’s consider how this is done.

First, it is handy to consider some general characteristics. (In fact, you will find it convenient to memorize these characteristics of a Normal Distribution, since you will be using them very frequently.) Refer to Figure 4.1 .

1. A Normal Distribution is symmetric; therefore, it is centered about its mean (and, of course, its mean and median are equal).
2. About 68%—a little over two-thirds—of the scores are within 1 standard deviation of the mean.
3. About 95% of the scores are within 2 standard deviations of the mean.
4. Nearly all the scores in a Normal Distribution are within 3 standard deviations of the mean.
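The percentages in items 2 to 4 follow from the area under the standard normal curve, and can be computed with the standard library's error function. This is a small sketch; the function name `coverage_within` is an illustrative choice.

```python
import math

def coverage_within(k):
    """Proportion of a Normal Distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(k, round(coverage_within(k), 4))
```

The results are roughly 0.6827, 0.9545 and 0.9973, which round to the "about 68%", "about 95%" and "nearly all" stated in the list.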

Item 3 is especially handy: the mean ±2 standard deviations includes about 95% of the scores in a Normal Distribution. For example, if a Normal Distribution has a mean of 37 and a standard deviation of 4, we can say that 95% of the scores are between 29 and 45 (see Figure 4.2).

Figure 4.1 Areas in a Normal Distribution.

Figure 4.2 Normal Distribution with mean = 37, standard deviation = 4.

Drawing a picture

Now, let’s get more specific. In our Normal Distribution with mean 37 and standard deviation 4, what proportion of scores are between 37 and 39? Or between 38 and 42.86? How do we figure that out? There are two ways to do it. One way is to let a computer figure it out for you; the other is to use a table of the standard Normal Distribution. Since it is vital to understand how the Normal Distribution works even if a computer does carry out the calculation, we will cover the second method.
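For comparison with the table method, here is a minimal sketch of the first route, letting a computer find the areas. It converts each boundary to a standard score and uses the standard library's error function in place of a printed table; the helper names are illustrative and not from the text.

```python
import math

MEAN, SD = 37.0, 4.0  # the distribution used in the text

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def proportion_between(low, high):
    """Proportion of scores between low and high for mean 37, standard deviation 4."""
    z_low = (low - MEAN) / SD
    z_high = (high - MEAN) / SD
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)

p_37_39 = proportion_between(37, 39)        # z from 0 to 0.5
p_38_4286 = proportion_between(38, 42.86)   # z from 0.25 to 1.465
p_29_45 = proportion_between(29, 45)        # mean plus or minus 2 standard deviations

print(p_37_39, p_38_4286, p_29_45)
```

Each call is the numerical equivalent of shading a region in the picture and reading its area from the table.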

There are three rules for using a table of the standard Normal Distribution: (1) draw a picture, (2) draw a picture, and (3) draw a picture. What picture should you draw? A histogram of a Normal Distribution, of course. As we have already seen, the proportion of scores in any particular range is represented by the area in the histogram above that range. So finding proportions in the distribution is the same as finding areas in the histogram. The first thing to do when you want to find a proportion in some region of a Normal Distribution is draw a picture of the distribution and shade in the region in which you are interested
eBook - ePub
Researching Education
Perspectives and Techniques
- Kanka Mallick, Gajendra Verma(Authors)
- 2005(Publication Date)
- Routledge
  (Publisher)
It can be seen that the tops of the columns lie approximately on a curve and that the area under the curve is equivalent to that of the columns. However, to obtain the probability of 8 or more heads by using the curve we need to obtain the area under the tail of the curve from 7.5 to 10.5. Since tables of areas under the normal curve are obtainable in books, they are often used to obtain probabilities in large distributions.
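The 10-coin calculation can be sketched in code to show how close the tail area is to the exact answer. The example assumes fair coins, so the matching normal curve has mean 5 and variance 2.5; the variable names are illustrative.

```python
import math

n, p = 10, 0.5
mean = n * p                           # 5 heads on average
sd = math.sqrt(n * p * (1 - p))        # square root of 2.5

def normal_cdf(x):
    """Area under the matching normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Exact probability of 8, 9 or 10 heads from the binomial distribution.
exact = sum(math.comb(n, k) for k in range(8, n + 1)) / 2**n

# Normal approximation: area under the curve from 7.5 to 10.5.
approx = normal_cdf(10.5) - normal_cdf(7.5)

print(exact, approx)
```

The exact value is 56/1024, about 0.055, and the area under the curve differs from it by only a few thousandths.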

Figure 8.2: Normal Distribution and the 10-coin test

Although laborious, it would be possible to work out the probabilities if 100 coins were tossed together. If the result were represented in a column graph, the tops of the columns would form an almost perfect curve known as the ‘normal curve’.

Many distributions of measurements of human beings such as heights, weights, sizes of shoes, gloves or hats, the distances individuals of the same age and sex can throw a ball, and so on, fit very well to the normal curve. So do many educational and psychological test scores, although in the case of standardized tests, this may be partly because educationists and psychologists designed them to do so. Before we can make use of the normal curve in order to say how exceptional a pupil’s score is as compared with other children of the same age, we need to turn his or her score in a test into a standard score. To do this we must first obtain the mean and standard deviation for the distribution of scores.

Taking a simplified example initially: in six tests, a boy obtains marks of 36, 49, 52, 60, 65, 74. The mean (or average) score is defined as the total of the scores for all the tests divided by the number of tests, i.e. the boy’s mean score is (36 + 49 + 52 + 60 + 65 + 74)/6 = 336/6 = 56.
Deviations of the scores above and below the mean are as follows:

Score               52   49   65   36   60   74
Deviation from 56   -4   -7    9  -20    4   18   Total = 0
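The deviations in the table, and the variance and standard deviation built from them, can be checked with a few lines of code. The sketch assumes the variance is the average of the squared deviations (dividing by the number of scores, six), since that is the convention that yields a value of 147.67 for these marks.

```python
import math

scores = [36, 49, 52, 60, 65, 74]

mean = sum(scores) / len(scores)                  # 336 / 6 = 56
deviations = [s - mean for s in scores]           # these sum to zero

# Assumed convention: divide the sum of squared deviations by the number of scores.
variance = sum(d * d for d in deviations) / len(scores)
standard_deviation = math.sqrt(variance)

print(mean, deviations)
print(round(variance, 2), round(standard_deviation, 1))
```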

Having obtained the variance by squaring the deviations and averaging the squares (which gave us 147.67 in this example), the ‘standard deviation’ is arrived at by finding the square root of 147.67, which is approximately 12.2. It is now possible to arrive at the ‘standard score’ in each subject by using the formula

Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.
