Mathematics

Mean Median and Mode

Mean, median, and mode are measures of central tendency used in statistics. The mean is the average of a set of numbers, calculated by adding them together and dividing by the total count. The median is the middle value when the numbers are arranged in ascending order. The mode is the number that appears most frequently in the set.

Written by Perlego with AI-assistance

10 Key excerpts on "Mean Median and Mode"

  • Sensory Evaluation of Food
    eBook - ePub


    Statistical Methods and Procedures

    • Michael O'Mahony(Author)
    • 2017(Publication Date)
    • CRC Press
      (Publisher)
    For the mean to be used, the spread of the numbers needs to be fairly symmetrical; one very large number can unduly influence the mean so that it ceases to be a central value. When this is the case, the median or the mode can be used.
    The Median
    The median is the middle number of a set of numbers arranged in order. To find a median of several numbers, first arrange them in order and then pick out the one in the middle.
    Example 1 Consider
    1 2 2 3 4 7 7
    Here the number in the middle is 3. That is, the median = 3.
    Example 2 Consider
    1 2 3 : 3 3 4
    Here there is no one middle number, there are two: 3 and 3. The median in this case is the number halfway between the two middle numbers. So the median = 3.
    Example 3 Consider
    1 2 2 : 3 8 90
    Again, there are two middle numbers: 2 and 3. The number halfway between these is 2.5. So the median = 2.5. Note how the extreme value of 90 did not raise the median the way it would have the mean.
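The procedure in Examples 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration, not taken from the book; the function name `median` and the variable names are assumptions chosen for the sketch.

```python
def median(values):
    """Return the middle value of a list, or the midpoint of the two middle values."""
    ordered = sorted(values)          # the values must be arranged in order first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[mid]
    # even count: halfway between the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


example_1 = median([1, 2, 2, 3, 4, 7, 7])
example_2 = median([1, 2, 3, 3, 3, 4])
example_3 = median([1, 2, 2, 3, 8, 90])
print(example_1, example_2, example_3)
```

The extreme value 90 in the third list changes nothing about which two values sit in the middle, which is why the median stays at 2.5.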
    The Mode
    The mode is the most frequently occurring value.
    Example 4 Find the mode of
    1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 7, 7, 7, 8, 8
    The most commonly occurring number is 7; it occurs four times. So the mode is 7.
    Of course, it is possible for a set of numbers to have more than one mode, or if no number occurs more than any other, no mode at all. A distribution with two modes is said to be bimodal, one with three modes trimodal, and so on.
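A short Python sketch can make the three cases concrete: one mode, several modes, and no mode. The helper name `modes` is hypothetical, and the second and third lists are made-up illustrations; only the first list comes from Example 4.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or an empty list when no value stands out."""
    counts = Counter(values)
    highest = max(counts.values(), default=0)
    tied = [value for value, count in counts.items() if count == highest]
    # If every distinct value occurs equally often, no number occurs
    # more than any other, so there is no mode at all.
    if len(counts) > 1 and len(tied) == len(counts):
        return []
    return sorted(tied)


example_4 = modes([1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 7, 7, 7, 8, 8])
two_modes = modes([1, 1, 2, 3, 3])      # hypothetical bimodal data
no_mode = modes([1, 2, 3, 4])           # hypothetical data, each value once
print(example_4, two_modes, no_mode)
```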

    2.2 Which Measure of Central Tendency Should you Use?

    The mean is the most useful statistic when inferences are to be drawn about a population from a sample; the mean of a sample is the best estimate of the mean of the population. It is commonly used by statisticians but is affected unduly by extreme values. The mode is easily computed and interpreted (except where there are several possible modes) but is rarely employed; no common inferential procedure makes use of the mode. The median is a useful descriptive statistic which fits the requirements of clear and effective communication, but it is of only limited use where inferences about population parameters are to be made from a sample. It is less affected by a preponderance of extreme values on one side of a distribution (skew) than the mean, and is to be preferred to the mean in cases where there are extreme values or values of indeterminable magnitude (e.g., when values are too big to be measured on the scale being used). The mean uses more information than either of the other measures, shows less fluctuation over successive samples, and is employed in all classical (parametric: see Section 2.7
  • Data Analytics
    eBook - ePub


    Systems Engineering - Cybersecurity - Project Management

    This number represents the median of this dataset. To show you the difference between the median and the mean, the mean of the dataset is 135,000. This is quite a difference from 100,000 and shows how much just a few numbers on either side of the dataset can “draw” the average to one side or the other. In this case, the 300,000 drew the average to the right of the median. Later on, we focus on how the dataset is “skewed” to the right if the mean is to the right of the median. This means that the dataset is not symmetrical and may not relate to a “normal” or standard distribution. More on this later.
    3.3 Mode
    The mode is probably one of the most misunderstood measures of central tendency. In my many years of statistics instruction, the mode always seems to take a back seat to the more complicated methods that exist, yet there is no reason to ever dismiss the mode, specifically when trying to find a pattern of events. For instance, if a government agency wanted to know when most of the customers came into a field office, it could track that number over a period of weeks, months, years, or decades and see when the curve showed humps. Those humps represent when the most customers visited the field office. In other words, the mode is the most prevalent value in that dataset. There is no magical formula, and in most instances, this is much easier to do with a computer than by hand, but it is just as important as (if not more important than) many other measures. The best way to illustrate this is not through formulas and calculation, but by looking at visual representations of the data. Figure 3.1 shows a graph that tracks customer data, with the x-axis being time and the y-axis being the number of customers
  • Probability and Statistics for Engineering and the Sciences with Modeling using R
    To determine the median value in a sequence of numbers, the numbers must first be arranged in value order from lowest to highest. If there is an odd number of numbers, the median value is the number that is in the middle, with the same number of numbers below and above. If there is an even number of numbers in the list, the middle pair must be determined, added together, and divided by two to find the median value. The median does not use all the data, as does the mean, but may be preferred if there are outliers. For symmetric data, the two measures are the same. The trimmed mean is the average of the “middle values,” trimming off some percentage of the very small and very large numbers. It is “between” the median and mean in terms of how much data is used and the impact of outliers. Mode refers to the most frequently occurring number found in a set of numbers. The mode is generally useful in describing where the graphical “peak” of a data set occurs or whether there is more than one peak. The mode is also the only measure that can be used with qualitative data. We can also calculate various measures of spread for a set of data. The “variance” is a measure of the dispersion of a set of data points around their mean value: it is the mathematical expectation of the squared deviations from the mean. The square root of the variance is the “standard deviation,” which has the same units as the data. The more spread apart the data, the larger these deviations. A simple measure of spread is the range, the difference between the two extreme points of the data. When outliers are present in the data, the IQR (interquartile range) is often used. The IQR is the difference between the 75th (Q3, third quartile) and 25th (Q1, first quartile) percentile of the data
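The measures named in this excerpt can be tried out with Python's standard `statistics` module. The sketch below is illustrative only: the data set is made up, `trimmed_mean` is a hypothetical helper, and the population forms (`pvariance`, `pstdev`) are used as an assumption to match the "expectation of squared deviations" wording. Sample versions would divide by n - 1 instead, and quartile values depend on the interpolation method, so the IQR here reflects the module's default.

```python
import math
import statistics


def trimmed_mean(values, proportion):
    """Average after dropping `proportion` of the values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)        # how many to trim per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical data with one large outlier (95).
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 95]

mean_value = statistics.mean(data)
median_value = statistics.median(data)
trimmed = trimmed_mean(data, 0.10)            # drop the lowest and highest 10%

variance = statistics.pvariance(data)         # mean squared deviation from the mean
std_dev = statistics.pstdev(data)             # square root of the variance
data_range = max(data) - min(data)            # difference between the extremes
q1, _, q3 = statistics.quantiles(data, n=4)   # quartile cut points
iqr = q3 - q1

print(mean_value, median_value, trimmed)
print(variance, std_dev, data_range, iqr)
```

With this data the trimmed mean lands between the median and the mean, and the IQR is far smaller than the range because the outlier lies outside the middle half of the data.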
  • Designing and Conducting Research in Education
    • Clifford J. Drew, Michael L Hardman, John L. Hosp(Authors)
    • 2007(Publication Date)
    Ordinal data do not present such a clear situation as nominal data. Mean ranks found in research literature may or may not be appropriate depending on the situation. Ordinal data varies considerably in terms of how nearly the ranks represent amount or magnitude as well as order. In some cases, the data fit into a lower-level ordinal scale and represent only gross directionality. In such situations, a median is the preferred measure of central tendency. On other occasions, the ranks may represent considerably more concerning magnitude even though exact interval equivalence is not a certainty. Under these circumstances, the use of a mean is meaningful in terms of the property being measured to the degree that magnitude statements have some meaning. Again, researchers must remain cognizant of their purpose—that of being able to say something meaningful about the topic. Just manipulating numbers does not fulfill this purpose.
    The mean (average) is the most frequently used measure of central tendency and there are several reasons why. One important reason involves research purposes other than description. If a researcher is conducting an inferential study, he or she will likely want to compute additional analyses on the data. The mean is useful for further arithmetic manipulation and is therefore more useful for many of the additional operations necessary in inferential statistics. The median and mode, on the other hand, are more often terminal descriptive statistics. It should be noted that because of the arithmetic operations involved in computing a mean, interval or ratio data are preferred for this measure of central tendency, since they have the property of additivity.
    Depending on the shape of the distribution, the three central tendency measures may have the same or different score values. If the distribution is shaped like the example in Figure 12.2
  • Quantitative Techniques in Business, Management and Finance
    • Umeshkumar Dubey, D P Kothari, G K Awari(Authors)
    • 2016(Publication Date)
    A distribution in which mean, median and mode coincide is known as a symmetrical (bell-shaped) distribution. If a distribution is skewed (i.e. not symmetrical), then mean, median and mode are not equal. In a moderately skewed distribution, a very interesting relationship exists among mean, median and mode. In such type of distributions, it can be proved that the distance between the mean and median is approximately one-third of the distance between the mean and mode. This is shown below for two types of such distributions.
    In the case of a symmetrical distribution, the mean, median and mode coincide. However, according to Karl Pearson, if the distribution is moderately asymmetrical, the mean, median and mode are related in the following manner:
    Mean – Median = (Mean – Mode)/3
    Thus, Mode = 3 Median – 2 Mean
    Similarly, we can express the approximate relationship for median in terms of mean and mode. Also, this can be expressed for mean in terms of median and mode. Thus, if we know any of the two values of the averages, the third average value can be determined from this approximate relationship.
    Exercise For a moderately skewed distribution in which the mean and median are 35.4 and 34.3, respectively, calculate the value of the mode.
    Solution To compute the value of the mode, we use the approximate relationship
    Mode = 3 Median – 2 Mean = 3(34.3) – 2(35.4) = 102.9 – 70.8 = 32.1
    Therefore, the value of the mode is 32.1.
    Exercise Median = 139.69 Mean = 139.51 Calculate the mode.
    Solution Mode = 3 Median – 2 Mean
    Given: Median = 139.69, Mean = 139.51
    Substituting the values, we get
    Mode = 3(139.69) – 2(139.51) = 419.07 – 279.02 = 140.05
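The approximate relationship and its two rearrangements can be checked numerically. The following Python sketch is only an illustration; the function names are assumptions, and the formulas are approximations that apply to moderately skewed distributions, not exact identities.

```python
def estimate_mode(mean, median):
    """Mode = 3 Median - 2 Mean (approximate, moderately skewed data)."""
    return 3 * median - 2 * mean


def estimate_median(mean, mode):
    """Rearranged form: Median = (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3


def estimate_mean(median, mode):
    """Rearranged form: Mean = (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2


# First exercise: 3(34.3) - 2(35.4) = 102.9 - 70.8 = 32.1
mode_1 = estimate_mode(mean=35.4, median=34.3)
# Second exercise: 3(139.69) - 2(139.51) = 419.07 - 279.02 = 140.05
mode_2 = estimate_mode(mean=139.51, median=139.69)

print(round(mode_1, 2), round(mode_2, 2))
```

Rounding is applied only when printing, because floating-point arithmetic can leave tiny errors in the last digits.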

    3.9 Comparison of Mean and Median

    See Table 3.20 for a comparison of the arithmetic mean and median.

    3.10 Geometric Mean

    Managers often come across quantities that change over a period of time, and may need to know the average rate of change over this period. Arithmetic mean is inaccurate in tracing such a change. Hence, a new measure of central tendency is needed to calculate the change rate – the ‘geometric mean’.
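A small Python sketch shows why the arithmetic mean misstates an average rate of change. The growth factors are hypothetical (not from the book), and `geometric_mean` is written out here for illustration, assuming all factors are positive.

```python
import math


def geometric_mean(factors):
    """n-th root of the product of n positive factors, computed via logarithms."""
    if not factors or any(f <= 0 for f in factors):
        raise ValueError("factors must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(f) for f in factors) / len(factors))


# Hypothetical yearly growth factors: +10%, +50%, -20%.
growth_factors = [1.10, 1.50, 0.80]

overall_growth = math.prod(growth_factors)            # total change over three years
gm = geometric_mean(growth_factors)
am = sum(growth_factors) / len(growth_factors)

# Compounding the geometric mean reproduces the overall change;
# compounding the arithmetic mean overstates it.
print(overall_growth, gm ** 3, am ** 3)
```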
  • Research Methods and Statistics in Psychology
    • Hugh Coolican(Author)
    • 2018(Publication Date)
    • Routledge
      (Publisher)

    Key Terms

    Mode/modal value
    Measure of central tendency – most frequent value in a data set.
    Bi-modal distribution
    Data set with two modes.

    modal value. It is the most frequently occurring category and therefore even easier to find than the mean or median. The mode of the set of numbers:
    • 1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 7, 7, 7, 8
    is therefore 5 since this value occurs most often. In the set of values 5, 2, 12, 1, 10 there is no single modal value since each value occurs once only. For the set of numbers 7, 7, 7, 8, 8, 9, 9, 9, 10, 10 there are two modes, 7 and 9, and the set is said to be bi-modal.
    In Table 13.2 the modal value or category is ‘student’. Be careful here to note that the mode is not the number of times the most frequent value occurs, but that value itself. ‘Student’ occurs most frequently.
    The mode is the typicality statistic to use with nominal-level data but is also often a more comfortable alternative with discrete measurement scales, avoiding the unrealistic ‘average’ of 1.999 legs (see p. 374) and giving us the typical statistic of 2.

    Advantages and disadvantages of the mode

    The mode is unaffected by outliers. It can be obtained when extreme values are unknown. It is often more informative than the mean when a scale is discrete. However, it doesn’t take account of the exact distances between values and it is not at all useful for relatively small sets of data where several values occur equally frequently (e.g., 1, 2, 3, 4). For bi-modal distributions two modal values need to be reported. The mode cannot be estimated accurately when data are grouped into class intervals. We can have a modal interval – such as 6–10 cigarettes in Table 13.11 – but this will change if differently sized intervals are used.
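The point about modal intervals can be illustrated with a short Python sketch. The data values are invented for the illustration (they are not the cigarette data from Table 13.11), and `modal_interval` is a hypothetical helper that groups whole numbers into equal-width intervals starting at 0.

```python
from collections import Counter


def modal_interval(values, width):
    """Return (lower, upper) of the most populated interval of the given width."""
    # Each value is assigned to the interval whose lower bound is a multiple of width.
    counts = Counter((value // width) * width for value in values)
    lower, _ = counts.most_common(1)[0]
    return lower, lower + width - 1


# Hypothetical whole-number observations.
data = [3, 4, 4, 6, 7, 8, 9, 9, 12, 13, 18]

print(modal_interval(data, 5))    # intervals 0-4, 5-9, 10-14, ...
print(modal_interval(data, 4))    # intervals 0-3, 4-7, 8-11, ...
print(modal_interval(data, 10))   # intervals 0-9, 10-19, ...
```

The same observations produce a different modal interval for each choice of width, which is why a modal interval should always be reported together with the grouping used.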

    Measures of dispersion

    Think back to the description of new evening-class mates. The central tendency was given as 25, but some ‘guesstimate’ was also given of the way people spread around this central point. Without knowledge of spread (or, more technically, dispersion ), a mean can be very misleading. Take a look at the bowling performance of two cricketers shown in Figure 13.5 . Both bowlers average around the middle stump but (a) varies much more than (b). The attempts of (a) are far more widely dispersed
  • Statistics
    eBook - ePub


    The Essentials for Research

    We are not concerned with how far a score is above or below this point. Except for the score or scores that locate the median (the N/2 score), the actual size of all other scores plays no part in its calculation. We could, in fact, calculate a median for a distribution of scores even if the top score were partially illegible. We would only need to be certain that it was above the interval containing the median. The mean is quite different: ΣX is affected by any change in the value of any score. Since μ = ΣX/N, any error or change in the value of a score will change ΣX and consequently change the mean. Notice that 2, 4, 6, 8, and 10 as well as 2, 4, 6, 8, and 100 have the same median (6), but quite different means. You now know three ways to describe the central tendency of distributions: modes, means, or medians. The mean is usually preferable as a measure of central tendency, and it forms the basis for more advanced statistical treatments. On the other hand, if the data are badly skewed, the mean will be a very misleading description of central tendency and we might prefer to use the median.
    3.12 Review
    Distributions can differ in a variety of ways. They can differ in size, indicated by N, the number of measurements on which they are based. They can differ in central tendency, measured by the mean, the sum of all measures divided by their number; or by the median, the middle score in a distribution. They can differ in skewness, with positively skewed distributions having the “tail” to the right or high end of the continuum, negatively skewed distributions having the “tail” toward the left, or low end of the continuum. Distributions with zero skew are symmetrical. Calculation of centile points other than the median was explained, as was the determination of centile ranks
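The comparison of 2, 4, 6, 8, 10 with 2, 4, 6, 8, 100 is easy to verify with Python's standard `statistics` module. In this sketch, using `math.inf` as a stand-in for a top score that is only known to be "somewhere above the rest" is an assumption made for illustration.

```python
import math
import statistics

scores_a = [2, 4, 6, 8, 10]
scores_b = [2, 4, 6, 8, 100]

median_a = statistics.median(scores_a)
median_b = statistics.median(scores_b)
mean_a = statistics.mean(scores_a)     # 30 / 5
mean_b = statistics.mean(scores_b)     # 120 / 5

# A partially illegible top score: only its position above the others is known.
scores_c = [2, 4, 6, 8, math.inf]
median_c = statistics.median(scores_c)

print(median_a, median_b, median_c)
print(mean_a, mean_b)
```

The median depends only on which score sits in the middle, so it is identical for all three lists, while the mean moves with the size of the largest score.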
  • Stats Means Business
    eBook - ePub


    Statistics and Business Analytics for Business, Hospitality and Tourism

    • John Buglear(Author)
    • 2019(Publication Date)
    • Routledge
      (Publisher)
    The whole point of using a measure of location is that it should convey an impression of a distribution in a single figure. If we need to communicate this to an audience, it won’t help if we quote the mode, median and mean and then invite our audience to please themselves which one to pick. It is important to use the right sort of average.
    Picking which average to use might depend on a number of factors:
    • The type of data we are dealing with.
    • Whether the average needs to be easy to find.
    • The shape of the distribution.
    • Whether the average will be the basis for further work on the data.
    As far as the type of data is concerned, unless you are dealing with fairly simple discrete data, the mode is redundant. If you do have such data to analyse, the mode may be worth considering, particularly if it is important that your measure of location is a feasible value for the variable to take.

    Example 3.6

    The numbers of days that 16 employees were absent through illness were:
    Find the mode, median and mean for this set of data. The modal value is 1, which occurs six times. Array:
    The median position is: (16 + 1)/2 = 8.5th position
    The median is: (8th value + 9th value)/2 = (1 + 2)/2 = 1.5
    The arithmetic mean = (0 + 0 + 1 + 1 + … + 4 + 6)/16 = 33/16 = 2.0625
    In Example 3.6 it is only the mode that has a value that is both feasible and actually occurs, 1. Although the value of the median, 1.5, may be feasible if the employer recorded half-day absences, it is not one of the observed values. The value of the mean, 2.0625, is too precise to be feasible and therefore cannot be one of the observed values.
    The only other reason you might prefer to use the mode rather than the other measures of location, assuming that you are dealing with discrete data made up of a relatively few different values, is that it is the easiest of the measures of location to find. All you need to do is to look at the data and count how many times the values occur. Often with the sort of simple data that the mode suits, it is pretty obvious which value occurs most frequently and there is no need to count the frequency of each value.
  • Business Statistics Using Excel
    eBook - ePub


    A Complete Course in Data Analytics

    5 Median and Mode
    DOI: 10.4324/9781032671703-5

    Learning Objectives

    After reading this chapter, you will be able to
    • Know the concept of median, which is the middlemost observation of a given set of data.
    • Understand the method of computing the median of ungrouped data.
    • Determine the median of grouped data.
    • Understand the concept of mode of a given set of data (observations), which has the maximum frequency.
    • Compute the mode of ungrouped data.
    • Determine the mode of grouped data.
    • Analyse percentile of a given set of data.
    • Understand quartile and its classification.

    5.1 Introduction

    This chapter covers all the essential details and examples of the median, mode, percentile, and quartile. The median for ungrouped data and the median for grouped data are further distinguished in the section on the median. The mode portion is further divided into two categories: mode for grouped data and mode for ungrouped data. Excel worksheets are used to illustrate each of these.
    If the data repeat, it is possible to create a frequency distribution for the observations of the data. Excel automatically creates this type of frequency distribution before determining the median and mode of the provided set of data; thus the investigator is not required to do so. However, if the total frequency is high, it will be difficult to input every instance in the Excel sheet. The investigator can use Excel to tackle this issue by creating a worksheet that is appropriate.

    5.2 Median

    The median of a given set of data, such as the sales revenues of a corporation over the past several years, can be determined using the median function, which finds the middlemost observation in the set [1].
    The actual data could either be ungrouped or grouped. As a result, this section is divided into subsections for the median of ungrouped and grouped data.

    5.2.1 Median of Ungrouped Data Using Median Function

    In some real-life situations, the values of a variable rarely repeat, so there are no frequencies to tabulate for such data. Data of this kind are called ungrouped data and are arranged in increasing order.
  • Statistics for Business
    4 Measures of Central Tendency (MCT)
    4.1 Introduction
    Statistical methods are needed for summarizing and describing the collected numerical data. The main objective of this chapter is to introduce one representative value that can be used to identify and summarize an entire set of data. This representative value is going to be helpful in making decisions based on the data collected. Measures of central tendency (MCT) are used to identify the central value around which the data are spread.
    4.2 MCT
    The average of a distribution is its representative size. As most of the items of the series cluster around the average, it is called a ‘measure of central tendency’. The average is computed to reduce the complexity of the data. The entire distribution is reduced to one number, which can be considered typical of an important characteristic of the population, and the same can be used in making comparisons and in examining relations with other distributions.
    For example, it is not possible to remember the individual incomes of the crores of earning people in India. By considering all the data related to income and evaluating the average income, we get a single value that represents the entire population. The commonly used averages are as follows:
    • Arithmetic mean
    • Median
    • Mode
    • Geometric mean
    • Harmonic mean
    4.2.1 Properties of Best Average
    It should be
    • Rigidly defined.
    • Based on all observations of the series.
    • Easy to calculate and simple to understand.
    • Capable of further algebraic treatment.
    • Free from the extreme values (i.e., it should not be affected by extreme values).
    The arithmetic mean satisfies most of these properties (though it is affected by extreme values), even though the other averages are also useful in certain specific cases. The median is quite useful for studying data not capable of direct quantitative measurement like skin colour, etc. The mode is a good measure when the extreme values are not well defined.