Mathematics

Sample Mean

The sample mean is the average of a set of numbers, calculated by adding up all the values and dividing by the number of values. It is a measure of central tendency and is often used to estimate the population mean when only a sample is available. The sample mean is a fundamental concept in statistics and is denoted by the symbol x̄ (read "x-bar").

Written by Perlego with AI-assistance

10 Key excerpts on "Sample Mean"

  • Sensory Evaluation of Food
    Statistical Methods and Procedures

    • Michael O'Mahony(Author)
    • 2017(Publication Date)
    • CRC Press
      (Publisher)
    2 Averages, Ranges, and the Nature of Numbers

    2.1 What is the Average? Means, Medians, and Modes

    A sample of data can be difficult to comprehend; it is often difficult to see a trend in a set of numbers. Thus, various procedures are used to aid the understanding of data. Graphs, histograms, and various diagrams can be of help. Furthermore, some middle or average value, a measure of the central tendency of the numbers, is useful. A measure of the spread or dispersion of the numbers is also useful. These measures will now be considered.
    The Mean
    The mean is what is commonly called an average in everyday language. The mean of a sample of numbers is given by the formula
    $$\bar{X} = \frac{\sum X}{N}$$
    where ΣX (Σ is the Greek capital letter sigma) denotes the sum of all the X scores, N is the number of scores present, and $\bar{X}$ is the common symbol for the mean of a given sample of numbers.
    The mean of the whole population would be calculated in the same way. It is generally denoted by μ (the Greek lowercase letter mu). Because the mean of a population is usually not possible to obtain, it is estimated. The best estimate of the population mean, μ, is the Sample Mean $\bar{X}$. The mean is the common measure used when inferences are to be drawn from the sample about the population. Strictly, the mean we are discussing is called the arithmetic mean; there are other means, which are discussed in Section 2.2.
    For the mean to be used, the spread of the numbers needs to be fairly symmetrical; one very large number can unduly influence the mean so that it ceases to be a central value. When this is the case, the median or the mode can be used.
    The Median
    The median is the middle number of a set of numbers arranged in order. To find a median of several numbers, first arrange them in order and then pick out the one in the middle.
    Example 1 Consider
    1 2 2 3 4 7 7
    Here the number in the middle is 3. That is, the median = 3.
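    The contrast between the mean and the median can be checked numerically. The following is a minimal sketch using the seven numbers from Example 1; the extra value 100 is a hypothetical outlier that is not in the source and is added only to show how one very large number pulls the mean away from the centre while the median barely moves.

```python
from statistics import mean, median

# The seven numbers from Example 1 in the excerpt.
scores = [1, 2, 2, 3, 4, 7, 7]

# Sample mean: sum of the scores divided by the number of scores.
x_bar = sum(scores) / len(scores)

# Median: the middle value once the scores are sorted.
middle = median(scores)

# Hypothetical outlier (an assumption, not from the source).
with_outlier = scores + [100]

print("mean:", x_bar, "median:", middle)
print("with outlier -> mean:", mean(with_outlier),
      "median:", median(with_outlier))
```

    With the outlier included, the mean is no longer a central value, which is the situation in which the excerpt suggests using the median or the mode instead.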
  • Statistics for Economics, Second Edition
    i” represents observations 1 through N. When there is no ambiguity, the index and subscripts are not included:
    The population mean is a parameter and provides information about the central tendency of the data. The mean is susceptible to extreme values. Since all values of the population are used in calculation of the mean, a single very large or very small value can have a major impact on the mean. This is not quite as important in the case of a population as it is with samples.
    Sample Mean
    The Sample Mean is the sum of the sample values divided by the sample size.
    Both μ̂, pronounced mu-hat, and X̄, pronounced x-bar, are commonly used in the literature to represent the Sample Mean. Both are widely accepted. However, μ̂ has several advantages over X̄. First, it halves the number of symbols that one has to learn. The population parameter is μ and its estimate is μ̂. Second, it eliminates guessing which statistic represents a particular population parameter. Third, it provides a reasonably simple rule to follow. Population parameters are represented by Greek letters, and sample statistics are represented by the same Greek letter with a hat on it.
    Definition 2.1 A statistic is a numeric fact or summary obtained from a sample. It is always known, because it is calculated by the researcher, and it is a variable. A statistic is also used to make inferences about the corresponding population parameter.
    Example 2.1 An anthropologist is studying communities of gold miners in a remote area. She selects nine (9) families at random from a community. Family incomes, in thousands of dollars, are reported below. Find the Sample Mean. These data are hypothetical but plausible.
    66, 58, 71, 73, 64, 70, 66, 55, and 75
    Solution 2.1
    We use these data to show computational detail. A careful reader would remember that the same data were used in Example 1.4 but with a major difference. There, the data were presented as population data. The idea of a small community with nine families is acceptable but a sample of nine is more plausible. We use the same data to avoid wasting time entering new data and we limit data size to avoid tedious computations. The Sample Mean is as follows:
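    The excerpt ends before the computation itself. A minimal sketch of it, applying only the definition given above (sum of the sample values divided by the sample size) to the nine incomes; the variable names are illustrative and not taken from the book.

```python
# Family incomes from Example 2.1, in thousands of dollars.
incomes = [66, 58, 71, 73, 64, 70, 66, 55, 75]

n = len(incomes)          # sample size
total = sum(incomes)      # sum of the sample values
sample_mean = total / n   # sample mean, in thousands of dollars

print(f"n = {n}, sum = {total}, sample mean = {sample_mean:.2f}")
```

    The result is in the same units as the data, that is, thousands of dollars.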
  • Illuminating Statistical Analysis Using Scenarios and Simulations
    • Jeffrey E. Kottemann(Author)
    • 2017(Publication Date)
    • Wiley
      (Publisher)
    Part II Sample Means and the Normal Distribution
    18 Scaled Data and Sample Means
    Part II concerns scaled variables, such as heights, weights, incomes, and prices, that take on integer or real number values. Opinions, too, can be expressed on a scale: 1-to-5, or 1-to-7, or 1-to-100, among other options. The most common summarization of a sample of scaled numbers is the average, what statisticians call the arithmetic mean, or mean for short. The mean is a key sample statistic for scaled variables. To calculate the mean, you simply total up the sample of numbers and divide by the sample size. Below is the Sample Mean calculation in math shorthand.
    $$\bar{x} = \frac{\sum x}{n}$$
    The symbol for the Sample Mean is an x with a bar on top, $\bar{x}$ (I will usually spell it out). Σ is the Greek capital letter sigma, which is math shorthand for sum; in this case we are summing a list of numbers. The sum is then divided by n, the sample size.
    Recall that this same approach to averaging gives us the sample proportions for binomial variables when we represent the individual values with the numbers 0 and 1. Indeed, we will find many parallels between analyzing proportions and analyzing means. The major distinction stems from the following: With binomial variables, proportions are bounded by the values 0 and 1, and proportion variances are bounded by 0 and 0.25. In addition, the proportions themselves dictate proportion variances, with proportion variance equal to p(1 − p). With scaled variables, on the other hand, Sample Means are potentially unbounded and sample variances are bounded only on the low end by 0. Both are uncertain estimates; the mean does not dictate the variance. In Chapter 19, we'll see that Sample Means are, like sample proportions, normally distributed. Chapters 20 and 21 show again that larger sample sizes and lower variances are associated with lower levels of uncertainty
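    The link between proportions and means can be illustrated with a short sketch. The 0/1 responses below are hypothetical (they are not from the book); the code shows that averaging 0/1 codes gives the sample proportion, that the variance of those codes equals p(1 − p), and that this quantity never exceeds 0.25.

```python
# Hypothetical binomial sample coded as 0/1 (an assumption for illustration).
answers = [1, 0, 1, 1, 0, 1, 0, 1]

n = len(answers)
p = sum(answers) / n  # the mean of 0/1 codes is the sample proportion

# Variance computed directly from the 0/1 codes (dividing by n).
direct_variance = sum((a - p) ** 2 for a in answers) / n

# Variance dictated by the proportion itself.
formula_variance = p * (1 - p)

# p(1 - p) evaluated on a grid of proportions; it peaks at p = 0.5.
largest = max((k / 100) * (1 - k / 100) for k in range(101))

print(p, direct_variance, formula_variance, largest)
```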
  • Social Statistics
    Managing Data, Conducting Analyses, Presenting Results

    • Thomas J. Linneman(Author)
    • 2017(Publication Date)
    • Routledge
      (Publisher)
    Samples with Sample Means a bit away from the population mean—2.2, 2.3, 2.7, 2.8—are still common; however, they are less common than the Sample Means of 2.4, 2.5, and 2.6. The Sample Means of 1.9 and 3.1 are even less likely to happen, but occasionally they did happen. And once, only once, did I draw a sample where the Sample Mean was 3.3. Oddly, it was the second sample I drew. The chances of drawing such a sample are very, very slim. Notice that we didn’t draw any samples for which the Sample Mean was as low as 1.2 or 1.3, or as high as 3.7 or 3.8. Think through why such occurrences did not happen. What would have had to fall into place for this to occur? To draw a sample with a mean of 1.2, I would have had to draw, say, a 1.0, 1.1, 1.2, 1.3, and 1.4. To draw a sample with a mean of 3.8, I would have had to draw, say, a 3.6, 3.7, 3.8, 3.9, and 4.0. Given the population of scores, the chances of such things happening are next to nothing. My hope is that all of this is giving you an intuitive feel for how Sample Means act: most of them will fall near the population mean, some will fall a bit away from the population mean, and a few will, by chance, fall quite far away from the population mean. This is simply how the laws of randomness work. What I have built above is a distribution of Sample Means, or, as it is more commonly called, a sampling distribution. Just as the chi-square distribution was the basis of the chi-square test, sampling distributions form the basis of much of the theory and practice on which mean-based inferential statistical techniques are built. With this sampling distribution, we can calculate the probability of pulling certain types of samples. To find such a probability, we simply take the total number of attempts (in this situation, 300, because we took 300 samples) and divide that into the number of times a particular event occurred
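    The procedure described here, drawing many samples and tabulating their means, can be sketched as a small simulation. The excerpt does not list the author's population, so the population below is hypothetical; the sample size of 5 is inferred from the five-score examples in the text, and the fixed seed is only there to make the run repeatable.

```python
import random
from collections import Counter
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 100 scores between 1.0 and 4.0
# (an assumption; the excerpt does not give the real population).
population = [round(rng.uniform(1.0, 4.0), 1) for _ in range(100)]

# Draw 300 samples of 5 scores each and record every sample mean.
sample_means = [round(mean(rng.sample(population, 5)), 1) for _ in range(300)]

# Probability of each sample mean = times it occurred / total samples drawn.
counts = Counter(sample_means)
probabilities = {m: c / len(sample_means) for m, c in sorted(counts.items())}

for m, prob in probabilities.items():
    print(f"{m:.1f}  {prob:.3f}")
```

    Printing the table typically shows the pattern the author describes: sample means near the population mean are the most frequent, and means far from it are rare.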
  • Social Statistics
    Managing Data, Conducting Analyses, Presenting Results

    • Thomas J. Linneman(Author)
    • 2021(Publication Date)
    • Routledge
      (Publisher)
    Samples with Sample Means a bit away from the population mean—2.2, 2.3, 2.7, 2.8—are still common; however, they are less common than the Sample Means of 2.4, 2.5, and 2.6. The Sample Means of 1.9 and 3.1 are even less likely to happen, but occasionally they did happen. And once, only once, did I draw a sample where the Sample Mean was 3.3. Oddly, it was the second sample I drew. The chances of drawing such a sample are very, very slim. Notice that we didn’t draw any samples for which the Sample Mean was as low as 1.2 or 1.3, or as high as 3.7 or 3.8. Think through why such occurrences did not happen. What would have had to fall into place for this to occur? To draw a sample with a mean of 1.2, I would have had to draw, say, a 1.0, 1.1, 1.2, 1.3, and 1.4. To draw a sample with a mean of 3.8, I would have had to draw, say, a 3.6, 3.7, 3.8, 3.9, and 4.0. Given the population of scores, the chances of such things happening are next to nothing. My hope is that all of this is giving you an intuitive feel for how Sample Means act: most of them will fall near the population mean, some will fall a bit away from the population mean, and a few will, by chance, fall quite far away from the population mean. This is simply how the laws of randomness work. What I have built here is a distribution of Sample Means, or, as it is more commonly called, a sampling distribution. Just as the chi-square distribution was the basis of the chi-square test, sampling distributions form the basis of much of the theory and practice on which mean-based inferential statistical techniques are built. With this sampling distribution, we can calculate the probability of pulling certain types of samples. To find such a probability, we simply take the total number of attempts (in this situation, 300, because we took 300 samples) and divide that into the number of times a particular event occurred
  • Survey Methods in Social Investigation
    • C.A. Moser, G. Kalton(Authors)
    • 2017(Publication Date)
    • Routledge
      (Publisher)
    without replacement : having selected the first member, one selects the second sample member from the remaining three (the first member is not 'replaced in' the population and given a chance to be selected a second time). Here are the six possible samples and the estimate of μ derived from each:
    Possible samples of n = 2           x̄
    (ages of members selected)          (i.e. estimate of μ)
    15 and 17                           16.0
    15 and 18                           16.5
    15 and 22                           18.5
    17 and 18                           17.5
    17 and 22                           19.5
    18 and 22                           20.0
    Total                               108.0
    If we imagine this process continued indefinitely, each of the above samples will be drawn over and over again. The distribution formed by the values of x̄ derived from this infinite number of samples is called the sampling distribution of the mean, a logical enough term since it is a distribution of the means obtained from an infinite number of samples. Now, with simple random sampling, each of the six samples by definition has an equal chance of being selected and, therefore, in the long run occurs an equal number of times; the average of the estimates derived from all the possible samples is then 108/6 = 18, which is equal to the population mean μ.
    The average of the estimates of a population parameter derived from an infinite number of samples is called the expected value of the estimator (to be denoted by m). Notice that here we refer to an estimator rather than an estimate. For a given sample design, the estimator is the method of estimating the population parameter from the sample data; in this case the estimator is the sample arithmetic mean. An estimate is the value obtained by using the method of estimation for a specific sample. If, for the given sample design, the expected value of the estimator is equal to the population parameter, as in this case, the estimator is called unbiased; if not, it is called biased. The difference between the expected value and the true population value is termed the bias. An example of a biased estimator would be the case where the larger of the two sample members is used to estimate μ. From the six possible samples, it is seen that the expected value of the larger values is (17 + 18 + 22 + 18 + 22 + 22)/6 = 119/6 = 19⅚. Since μ is only 18, this estimator has a bias of 1⅚. Another example would be the case where the average of the two sample values is the estimator, but where the population member aged 15 can never be found. The possible pairs would then be 17 and 18, 17 and 22, and 18 and 22, giving values of
    x̄ = 17.5,
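    The expected values quoted in this excerpt can be verified by enumerating the six possible samples. This is a minimal sketch under the excerpt's assumptions (a population of four members aged 15, 17, 18 and 22, simple random sampling of n = 2 without replacement); the function names are illustrative, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction
from itertools import combinations

ages = [15, 17, 18, 22]
mu = Fraction(sum(ages), len(ages))  # population mean

# The six equally likely samples of size 2, drawn without replacement.
samples = list(combinations(ages, 2))

def expected_value(estimator):
    """Average of an estimator over all equally likely samples."""
    values = [Fraction(estimator(s)) for s in samples]
    return sum(values) / len(values)

def sample_mean(s):
    return Fraction(sum(s), len(s))

e_mean = expected_value(sample_mean)  # estimator: sample arithmetic mean
e_max = expected_value(max)           # estimator: larger of the two members

print("bias of the sample mean:", e_mean - mu)
print("bias of the larger value:", e_max - mu)
```

    The sample mean has zero bias here, whereas the "larger member" estimator overshoots μ by 11/6, the 1⅚ reported in the text.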
  • Practical Statistics Simply Explained
    equi-intervalled scale (such as those of weight, length, area, temperature, or time). This is noteworthy because there are some scales in which the steps between successive units are not uniform. The units of such scales therefore behave like arithmetic numbers only in the respect that they both have a conventional order. Take the case of a measurement of human ability like an intelligence test. In such tests there can be no guarantee of a uniform grading of the difficulty of the questions posed, so that while it is valid to say that a person getting a score of 120 is more intelligent than one who has scored 100 at the same test, and that the first score is 20% higher than the other, it is not, however, to be claimed that the first person is 20% more intelligent than the other, for that would only be true if one knew that the units were equally spaced along a scale of increasing difficulty (and that the scale started from an absolute zero). It follows that the arithmetic mean of a group of such units will be erroneous to the extent that the intervals between units are unequal.
    Measures of Dispersion
    The arithmetic mean is a single number which ‘sums up’ a set of numbers by indicating their central tendency or location on a scale. However, it is incomplete as a descriptive measure, because it does not disclose anything about the scatter or dispersion of the values in the set of numbers from which it is derived. In some cases these values will be clustered closely about the arithmetic mean, while in other cases they will be widely scattered.
    The importance of this can be seen from a simple example. Suppose we have tested a sample of 4 television tubes of Make A, and found that the tube life was (in turn) 20, 23, 25, and 26 months. (Tube life is ordinarily quoted as being so many hours; to keep the numbers small we shall stick to months, and assume that all tubes were used for 2 hours daily.) We saw above that the arithmetic mean of these numbers is 23.5. Now if we tested 4 tubes of Make B, and found that these tubes lasted 4, 10, 25, and 55 months, we may be somewhat surprised to discover that the arithmetic mean of these numbers is also 23.5. It is obvious that in such a case we need some way of specifying that the life of Make B
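    The two makes can be compared numerically. In this sketch the tube lives are those quoted in the excerpt; the range and the sample standard deviation are used as two possible measures of scatter, which is an assumption about where the truncated passage is heading rather than something it states.

```python
from statistics import mean, stdev

# Tube lives in months, as quoted in the excerpt.
make_a = [20, 23, 25, 26]
make_b = [4, 10, 25, 55]

def value_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)

for name, lives in (("Make A", make_a), ("Make B", make_b)):
    print(name,
          "mean:", mean(lives),
          "range:", value_range(lives),
          "std dev:", round(stdev(lives), 2))
```

    Both makes share the same arithmetic mean, yet the range and standard deviation of Make B are several times larger, which is exactly what the mean alone fails to disclose.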
  • Statistical Inference
    A Short Course

    • Michael J. Panik(Author)
    • 2012(Publication Date)
    • Wiley
      (Publisher)
    Fig. 7.1 ).
    Is $\bar{X}$ a “good” estimator for μ? It will be if $\bar{X}$ has certain desirable properties (which we will get to as our discussion progresses). Interestingly enough, these so-called desirable properties are expressed in terms of the mean and variance of the sampling distribution of $\bar{X}$.
    Let us now turn to the process of constructing the sampling distribution of the mean:
    1. From a population of size N let us take “all possible samples” of size n. Hence we must repeat some (conceptual) random experiment $\binom{N}{n}$ times since there are $\binom{N}{n}$ possible samples that must be extracted.
    2. Calculate $\bar{X}$ for each possible sample: $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_{\binom{N}{n}}$.
    3. Since each Sample Mean is a function of the sample values, different samples of size n will typically display different Sample Means. Let us assume that these differences are due to chance (under simple random sampling). Hence the various means determined in step 2 can be taken to be observations on some random variable $\bar{X}$, that is, $\bar{X}$ varies under random sampling depending upon which random sample is chosen. Since $\bar{X}$ is a random variable, it has a probability distribution called the sampling distribution of the mean: a distribution showing the probabilities (relative frequencies) of getting different Sample Means from random samples of size n taken from a population of size N.
    Why study the sampling distribution of the mean? What is its role? Basically, it shows how means vary, due to chance, under repeated random sampling from the same population. Remember that inferences about a population are usually made on the basis of a single sample—not all possible samples of a given size. As noted above, the various Sample Means are scattered, due to chance, about the true population mean to be estimated. By studying the sampling distribution of the mean, we can learn something about the error arising when the mean of a single sample is used to estimate the population mean.
  • An Introduction to Statistical Concepts
    • Debbie L. Hahs-Vaughn, Richard Lomax(Authors)
    • 2020(Publication Date)
    • Routledge
      (Publisher)
    We find the mean of this first sample to be 102 pounds and denote it by $\bar{X}_1 = 102$, where the subscript identifies the first sample. This one Sample Mean is known as a point estimate of the population mean, μ, as it is simply one value or point. We can then proceed to collect weight data from a second sample of n females and find that $\bar{X}_2 = 110$. Next we collect weight data from a third sample of n females and find that $\bar{X}_3 = 119$. Imagine that we go on to collect such data from many other samples of size n and compute a Sample Mean for each of those samples.
    5.2.2.1 Sampling Distribution of the Mean
    At this point we have a collection of Sample Means, which we can use to construct a frequency distribution of Sample Means. This frequency distribution is formally known as the sampling distribution of the mean. To better illustrate this new distribution, let us take a very small population from which we can take many samples. Here we define our population of observations as follows: 1, 2, 3, 5, 9 (in other words, we have five values in our population). As the entire population is known here, we can better illustrate the important underlying concepts. We can determine that the population mean $\mu_X = 4$ and the population variance $\sigma_X^2 = 8$, where X indicates the variable we are referring to. Let us first take all possible samples from this population of size 2 (i.e., n = 2) with replacement. As there are only five observations, there will be 25 possible samples, as shown in the upper portion of Table 5.1, called “Samples.” Each entry represents the two observations for a particular sample. For instance, in row 1 and column 4, we see 1,5. This indicates that the first observation is a 1 and the second observation is a 5. If sampling was done without replacement, then the diagonal of the table from upper left to lower right would not exist
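    The 25 samples described here are easy to enumerate. The sketch below assumes only what the excerpt states (population 1, 2, 3, 5, 9; samples of size 2 drawn with replacement); Table 5.1 itself is not reproduced in the excerpt, so the code rebuilds the list of samples rather than quoting the table.

```python
from itertools import product
from statistics import mean, pvariance

population = [1, 2, 3, 5, 9]

mu = mean(population)           # population mean
sigma2 = pvariance(population)  # population variance (divides by N)

# All ordered samples of size 2 drawn with replacement: 5 * 5 = 25.
samples = list(product(population, repeat=2))
sample_means = [sum(s) / 2 for s in samples]

mean_of_means = mean(sample_means)
variance_of_means = pvariance(sample_means)

print("number of samples:", len(samples))
print("population mean and variance:", mu, sigma2)
print("mean and variance of the sample means:", mean_of_means, variance_of_means)
```

    The mean of the sample means matches the population mean, and their variance equals the population variance divided by the sample size, which holds here because the two draws are made independently with replacement.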
  • Statistics for the Behavioural Sciences
    An Introduction to Frequentist and Bayesian Approaches

    • Riccardo Russo(Author)
    • 2020(Publication Date)
    • Routledge
      (Publisher)
    In summary, if we want to test hypotheses about means we need to know the characteristics of the distribution of the Sample Means. As seen above, the Central Limit Theorem tells us that the sampling distribution of the mean is usually normal with
    $$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}},$$
    where µ and σ are the mean and the standard deviation of the parent population of individual observations from which the samples are drawn, and n is the sample size.
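    The second relation, the standard deviation of the sampling distribution of the mean, is simple to compute. A minimal sketch follows; the function name `standard_error` and the numeric values of σ and n are hypothetical, chosen only to show how the spread of the Sample Means shrinks as n grows.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sampling distribution of the mean."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation (an assumption for illustration).
sigma = 20.0

for n in (4, 25, 100):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n)}")
```

    Quadrupling the sample size halves the standard error, because n enters through its square root.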

    7.3 Testing hypotheses about means when σ is known

    It is uncommon to know the standard deviation of a population of scores, so the technique described in this section is of limited application, but it is still worth knowing since there are circumstances in which it can be successfully applied. For example, when a standardised test is applied to a sample of subjects, we know the population standard deviation and the mean of the individual scores. Let us consider the example described in the Introduction. As stated earlier, we know that the distribution of the population of individual scores in a standardised test measuring reading speed is normal with µ = 200 words per minute and σ