Median

The median is a measure of central tendency in a set of numbers. It is the middle value when the numbers are arranged in ascending or descending order. If there is an even number of values, the median is the average of the two middle numbers.

Written by Perlego with AI-assistance

10 Key excerpts on "Median"

  • Sensory Evaluation of Food

    Statistical Methods and Procedures

    • Michael O'Mahony (Author)
    • 2017 (Publication Date)
    • CRC Press (Publisher)
    For the mean to be used, the spread of the numbers needs to be fairly symmetrical; one very large number can unduly influence the mean so that it ceases to be a central value. When this is the case, the Median or the mode can be used.
    The Median
    The Median is the middle number of a set of numbers arranged in order. To find a Median of several numbers, first arrange them in order and then pick out the one in the middle.
    Example 1 Consider
    1 2 2 3 4 7 7
    Here the number in the middle is 3. That is, the Median = 3.
    Example 2 Consider
    1 2 3 : 3 3 4
    Here there is no one middle number, there are two: 3 and 3. The Median in this case is the number halfway between the two middle numbers. So the Median = 3.
    Example 3 Consider
    1 2 2 : 3 8 90
    Again, there are two middle numbers: 2 and 3. The number halfway between these is 2.5. So the Median = 2.5. Note how the extreme value of 90 did not raise the Median the way it would have the mean.
    The Mode
    The mode is the most frequently occurring value.
    Example 4 Find the mode of
    1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 7, 7, 7, 8, 8
    The most commonly occurring number is 7; it occurs four times. So the mode is 7.
    Of course, it is possible for a set of numbers to have more than one mode, or if no number occurs more than any other, no mode at all. A distribution with two modes is said to be bimodal, three modes, trimodal, and so on.
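The counting rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `modes` is an assumption, and the choice to return an empty list when no value occurs more often than any other follows the convention stated in the excerpt rather than any standard library behaviour.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value, or [] when all values tie."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # If no value occurs more often than any other, report no mode at all.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)

example_4 = [1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 7, 7, 7, 8, 8]
print(modes(example_4))        # the single mode from Example 4
print(modes([1, 1, 2, 2, 3]))  # two modes: a bimodal set
print(modes([1, 2, 3]))        # no value repeats, so no mode
```

Returning a list rather than a single number makes the bimodal and trimodal cases explicit instead of silently picking one value.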

    2.2 Which Measure of Central Tendency Should you Use?

    The mean is the most useful statistic when inferences are to be drawn about a population from a sample; the mean of a sample is the best estimate of the mean of the population. It is commonly used by statisticians but is affected unduly by extreme values. The mode is easily computed and interpreted (except where there are several possible modes) but is rarely employed; no common inferential procedure makes use of the mode. The Median is a useful descriptive statistic which fits the requirements of clear and effective communication, but it is of only limited use where inferences about population parameters are to be made from a sample. It is less affected by a preponderance of extreme values on one side of a distribution (skew) than the mean, and is to be preferred to the mean in cases where there are extreme values or values of indeterminable magnitude (e.g., when values are too big to be measured on the scale being used). The mean uses more information than either of the other measures, shows less fluctuation over successive samples, and is employed in all classical (parametric: see Section 2.7
  • Data Analytics

    Systems Engineering - Cybersecurity - Project Management

    th percentile based on whether that value is above or below the Median. There will be more on this later in the text, but the Median is probably one of the most important values for the analyst (besides the mean).
    The formula for revealing the Median is the actual process of finding the Median, irrespective of the tools employed. Once the data is sorted from least to greatest, the center of that sorting is the Median. Does it sound too easy? That is because not all datasets, once sorted, end up with just one number in the center of the set. For instance, if the number of values is 9, then the 5th number is the Median. Simple enough for this example, but what happens if the number of values is 10? That means that the 5th and 6th values are the center of the set. In that case, the Median would be the mean of those two numbers. This brings us back to the mean, which is the total of the values divided by the number of values. For instance, if the center of a 10-value dataset is the numbers 9 and 11, then the Median would be (9 + 11) divided by 2. Why 2, when it is a 10-value set? Because the analyst only wants to find the arithmetic mean between the two physically centered values. This sounds complicated, but in actuality, this works every time. The Median in this case is 20/2 or 10. It makes sense when the analyst analyzes the different case studies using the Median.
    In this exercise, the analyst is faced with 5 house prices, not sorted in order. The first thing to do in order to find the Median is to sort these values going from least to greatest. This would give the analyst the following:
    50,000; 75,000; 100,000; 150,000; 300,000
    From this list, the analyst would pick the middle number, which is 100,000. This number represents the Median of this dataset. To show you the difference between the Median and the mean, the mean of the dataset is 135,000. This is quite a difference from 100,000 and shows how much just a few numbers on either side of the dataset can “draw” the average to one side or the other. In this case, the 300,000 drew the average to the right of the Median. Later on, we focus on how the dataset is “skewed” to the right if the mean is to the right of the Median. This means that the dataset is not symmetrical and may not relate to a “normal” or standard distribution. More on this later.
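The house price comparison can be reproduced with Python's standard `statistics` module. This is a small sketch rather than the book's own procedure; the variable names, and the inflated 3,000,000 price used to exaggerate the effect, are assumptions added for illustration.

```python
import statistics

# The five house prices, deliberately left unsorted as the analyst receives them.
prices = [150_000, 50_000, 300_000, 75_000, 100_000]

median_price = statistics.median(prices)  # sorts internally, picks the middle value
mean_price = statistics.mean(prices)      # total of the values divided by their count
print(median_price, mean_price)

# Hypothetical variation: make the most expensive house ten times dearer.
inflated = [150_000, 50_000, 3_000_000, 75_000, 100_000]
print(statistics.median(inflated), statistics.mean(inflated))
```

In the second data set only the mean moves, which is the "draw" toward the extreme value described in the text.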
  • Statistics Explained
    • Perry R. Hinton (Author)
    • 2014 (Publication Date)
    • Routledge (Publisher)
    The Median is a good measure of central tendency as it picks up the score in the middle position of the distribution. Its weakness, if indeed it is a weakness, is that, like the mode, it does not use all the information given by the marks. The Median is simply the score where we cut our list into two halves. The marks either side of the Median could be anything below or above the Median respectively. If we found that someone who had been given a mark of 9 in the examination really had a mark of 29 or 39, correcting this score would not change the Median as 54 would still be the middle mark in the list. The Median would stay the same even if a number of marks were changed (as long as a mark below the Median was not changed to a value higher than the Median or vice versa). The Median doesn’t take account of the values of all the scores, only the value of the score at the middle position.
    While we might regard the Median as a better choice of a central value than the mode, as it finds the score at the middle position rather than the most frequent score, there is a third measure of central tendency that is used far more often than either of the above two measures. This is the mean.
    mean: A measure of the ‘average’ score in a set of data. The mean is found by adding up all the scores and dividing by the number of scores.
    We express the formula for calculating the mean using special symbols. We use the Greek letter µ (pronounced ‘mu’) for the mean, the Greek letter capital sigma, Σ, to mean ‘the sum of’ (or ‘add up’), X to indicate a score (in our example, an examination mark) and N for the number of scores. The symbols ΣX mean ‘add up all the scores’. The mean, µ, is the sum of the scores divided by N:
    µ = ΣX / N
    When we talk of ‘an average’ we are usually referring to the mean (although the word ‘average’ is sometimes used more loosely than the word ‘mean’, which has the statistical definition given above). To calculate the mean we add up all the marks and divide them by the number of students. Adding up all the marks we arrive at 5293. Dividing this by 100 gives us a mean of 52.93.
    One way of thinking about the mean is by analogy with a seesaw. Imagine that the horizontal axis of our frequency distribution is a beam of wood going from 0 to 100 in length. Each of the marks is a student sitting on the beam at the position specified by their mark (so there are three students sitting on the beam at 56 and one at 74, etc.). Where would you have to put a supporting post under the beam to make a perfectly balanced seesaw? The answer is at the mean position. We can see it as the value that balances the scores either side of it. Any change in the marks (we move a student along the beam) results in a change in the mean (the seesaw will tip to one side unless we move the supporting post to a new position to restore balance). So the mean is a statistic that is sensitive to all the scores about it, unlike the Median, as we saw above.
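The seesaw picture corresponds to the fact that deviations from the mean sum to zero. The sketch below checks this on a small set of marks; the marks are hypothetical (the book's 100 examination marks are not reproduced in this excerpt), and the variable names are assumptions.

```python
import statistics

marks = [9, 45, 54, 56, 56, 56, 74]  # hypothetical marks, not the book's data

mu = sum(marks) / len(marks)          # the sum of the scores divided by N
deviations = [x - mu for x in marks]  # signed distance of each "student" from the post
balance = sum(deviations)             # zero when the beam is balanced
print(mu, balance)

# Move one student along the beam: the lowest mark changes from 9 to 29.
corrected = [29, 45, 54, 56, 56, 56, 74]
print(sum(corrected) / len(corrected))  # the balance point (mean) shifts
print(statistics.median(marks), statistics.median(corrected))  # the middle mark does not
```

The correction stays below the middle mark, so the Median is unchanged while the mean has to move to restore balance.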
  • Finite Math For Dummies
    • Mary Jane Sterling (Author)
    • 2018 (Publication Date)
    • For Dummies (Publisher)
    This seems to be reasonable. Eleven of the states have names with eight letters, and that number of letters seems to be in the middle of the ordered list. With that in mind, consider the next situation. A company owner claims that his employees earn an average salary of $76,000. This is true. But does it represent the expected value or a fair representation of what people make at his business? The 30 salaried people earn a range of amounts. Adding all the salaries together and dividing by 30, you get the stated $76,000. Yes, the math is correct, but this is not a good representation of what people are making there. Only one person is earning more than $70,000, and you can probably guess who. The $1,000,000 salary is an outlier: it distorts the picture when the mean average is used.
    Riding down the middle with the Median
    The Median is another measure of central tendency. It’s the middle number in a data list that has been put in order either smallest to largest or largest to smallest. The Median of a set of n numbers is the middle number in the list if n is an odd number. If n is even, then find the mean average of the two numbers in the middle. Consider the set of numbers: {2, 3, 3, 3, 3, 4, 4, 4, 5, 6, 7}. This set has 11 numbers, so the sixth number is in the middle:
    2, 3, 3, 3, 3, 4, 4, 4, 5, 6, 7
    The Median is 4. Now look at the same set after deleting the number 7; there are now ten numbers in the set: {2, 3, 3, 3, 3, 4, 4, 4, 5, 6}. The fifth and sixth numbers are in the middle.
    2, 3, 3, 3, 3, 4, 4, 4, 5, 6
    You find the mean average of 3 and 4: (3 + 4) / 2 = 3.5. The Median is 3.5. This number doesn’t appear in the list, but a particular measure of central tendency is often not listed in the data set being considered.
    Making the most of the mode
    The mode, if there is one, is the number that occurs most often in a data set. There can be one mode, no mode, or many modes. The mode is another measure of central tendency. Unlike the mean and Median, the mode can be used in non-numerical sets
  • Decision Making Process

    Concepts and Methods

    • Denis Bouyssou, Didier Dubois, Henri Prade, Marc Pirlot (Authors)
    • 2013 (Publication Date)
    • Wiley-ISTE (Publisher)
    p + 1)th individual satisfies this property and is therefore a Median.
    We have a ‘Median interval’ (in this case, statisticians often choose as Median age the mean between the two ages that are the bounds of the Median interval). Two observations can be made on the Median(s) of a distribution:
    1) A Median is a solution of an optimization problem: it minimizes the sum of its distances to the different values taken by the variable on the population (these values being weighted by their number of occurrences). This is a consequence of a more general result, due to Laplace, on the Median of a probability distribution [LAP 74].
    2) At least when the Median is unique, it can be obtained by an algebraic expression using the operations Max and Min. For instance, in the simplest case where the values of the variable for three individuals are a, b, c with a < b < c, the Median b is given by the formula b = min[max(a, b), max(b, c), max(c, a)]. (When there is a Median interval, its two bounds are given by algebraic formulas.)
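Both observations can be checked numerically for three values. The sketch below is illustrative only: the function names `median_of_three` and `sum_abs_distances` and the sample values are assumptions, and the optimization check searches a small grid of integer candidates rather than proving the general result.

```python
def median_of_three(a, b, c):
    """Median of three values using only Min and Max (observation 2)."""
    return min(max(a, b), max(b, c), max(c, a))

def sum_abs_distances(t, values):
    """Sum of the distances from t to each value (the quantity in observation 1)."""
    return sum(abs(t - x) for x in values)

values = [9, 2, 5]  # hypothetical values, given in no particular order
m = median_of_three(*values)
print(m)

# Observation 1, checked on integer candidates 0..11: the Median gives the smallest sum.
best = min(range(12), key=lambda t: sum_abs_distances(t, values))
print(best, sum_abs_distances(best, values))
```

The Min/Max expression is symmetric in its arguments, so it does not require the values to be sorted first.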
    The first observation leads to the notion of metric Median. In a metric space (E, d), a Median of a v-tuple (t_1, t_2, …, t_v) is an element t of E minimizing the sum Σ_i d(t, t_i) of the distances of t to the elements of the v-tuple. This is in fact an old notion since it appears in a famous challenge proposed by Fermat in his Essai sur les maximas et les minimas [FER 29]: “Let he who does not approve of my method attempt the solution of the following problem: given three points of the plane, find a fourth point such that the sum of its distances to the three given points is a minimum”. Here, the distance between two points P and Q is the length of the segment PQ
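For points in the plane with the usual Euclidean distance, a metric Median can be approximated numerically. The sketch below uses Weiszfeld's iteration, a standard technique that is not mentioned in the excerpt and is swapped in here purely for illustration; the function names, the starting point (the centroid), and the stopping rule are all assumptions.

```python
import math

def sum_of_distances(t, points):
    """Sum of Euclidean distances from t to each point."""
    return sum(math.hypot(t[0] - px, t[1] - py) for px, py in points)

def geometric_median(points, iterations=1000, tol=1e-12):
    """Approximate the point minimizing sum_of_distances (Weiszfeld's iteration)."""
    # Start from the centroid (an arbitrary but convenient initial guess).
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(x - px, y - py)
            if d < tol:
                # The iterate landed on a data point; this simple sketch stops
                # here rather than divide by zero (not guaranteed optimal).
                return (x, y)
            w = 1.0 / d  # nearer points receive larger weights
            num_x += w * px
            num_y += w * py
            denom += w
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < tol:
            return (new_x, new_y)
        x, y = new_x, new_y
    return (x, y)

# Fermat's challenge for a hypothetical right triangle with legs 4 and 3.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
fermat_point = geometric_median(triangle)
print(fermat_point, sum_of_distances(fermat_point, triangle))
```

For an equilateral triangle the result coincides with the centroid, which gives a quick sanity check on the iteration.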
  • Statistics For Dummies
    • Deborah J. Rumsey (Author)
    • 2016 (Publication Date)
    • For Dummies (Publisher)
    Bottom line: The mean doesn’t always tell the whole story. In some cases it may be a bit misleading, and this is one of those cases. That’s because every year a few top-notch players (like Kobe) make much more money than anybody else, and their salaries pull up the overall average salary.
    Numbers in a data set that are extremely high or extremely low compared to the rest of the data are called outliers . Because of the way the average is calculated, high outliers tend to drive the average upward (as Kobe’s salary did in the preceding example). Low outliers tend to drive the average downward.

    Splitting your data down the Median

    Remember in school when you took an exam, and you and most of the rest of the class did badly, but a couple of nerds got 100? Remember how the teacher didn’t curve the scores to reflect the poor performance of most of the class? Your teacher was probably using the average, and the average in that case didn’t really represent what statisticians might consider the best measure of center for the students’ scores.
    What can you report, other than the average, to show what the salary of a “typical” NBA player would be or what the test score of a “typical” student in your class was? Another statistic used to measure the center of a data set is called the Median. The Median is still an unsung hero of statistics in the sense that it isn’t used nearly as often as it should be, although people are beginning to report it more nowadays.
    The Median of a data set is the value that lies exactly in the middle when the data have been ordered. It’s denoted in different ways; some people use M and some use . Here are the steps for finding the Median of a data set:
    1. Order the numbers from smallest to largest.
    2. If the data set contains an odd number of numbers, choose the one that is exactly in the middle. You’ve found the Median.
    3. If the data set contains an even number of numbers, take the two numbers that appear in the middle and average them to find the Median.
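The three steps above can be sketched directly in Python. The function name `median` and the exam scores are hypothetical, chosen only to illustrate the odd and even cases; in practice the standard library's `statistics.median` does the same job.

```python
def median(values):
    """Find the Median by following the three steps listed above."""
    if not values:
        raise ValueError("the Median of an empty data set is undefined")
    ordered = sorted(values)  # Step 1: order from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]   # Step 2: odd count, take the exact middle
    # Step 3: even count, average the two numbers in the middle
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_scores = [55, 100, 40, 62, 58]       # hypothetical exam scores
even_scores = [55, 100, 40, 62, 58, 60]  # the same class with one more student
print(median(odd_scores))
print(median(even_scores))
```

Note that the single score of 100 has no special influence on either result, which is the point made about the class average earlier in the excerpt.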
  • Effective Management in Practice

    Analytical Insights and Critical Questions

    In a world of wide variability of specific observations, be it firm annual profitability or indeed academic citations, we have to be wary of the most common form of average: the mean. One response, particularly when there are significant outliers, is to look at measures such as Medians or modes which are relatively robust to such extreme events.
    Mean
    The average value, calculated by adding all the observations and dividing by the number of observations.
    Median
    Middle value of a list. If you have numbers 2, 3, 4, 5, 6, 7 and 8, the Median is 5. Other definitions include the smallest number such that at least half the numbers in the list are no greater than it.
    Mode
    For lists, the mode is the most common (frequent) value.
    On the other hand, there are also two further general considerations:
    •    We may well be attempting to average over two or more distinct categories or groupings. Sometimes this can be rather misleading such as in Tony O’Reilly’s brief encounter with a group of farmers in his early days with the Irish Dairy Board when, as he used to tell the story, they regaled him with the assertion that 75 per cent of them were above average – that is the average of the other 25 per cent!
    •    In a relatively high variance world any form of average figure on its own tells us little without some measure of dispersion. The most common dispersion measures are range, interquartile range, variance and standard deviation.
    Range
    The range is the simplest measure of variability to calculate, and is simply the highest score minus the lowest score.
    Interquartile Range
    The interquartile range (IQR) is the range of the middle 50 per cent of the scores in a distribution.
    Variance
    The variance is defined as the average squared difference of the scores from the mean.
    Standard Deviation
    The standard deviation is simply the square root of the variance. In the “normal” case, about 95 per cent of the distribution is within two standard deviations of the mean.
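The four dispersion measures defined above can be computed with Python's standard `statistics` module (Python 3.8 or later for `quantiles`). This is a sketch on hypothetical scores. The population forms `pvariance` and `pstdev` are used because the text defines variance as the average squared difference from the mean, and the quartile method is the module's default, which is only one of several conventions in use.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

value_range = max(scores) - min(scores)         # highest score minus lowest score
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points (default method)
iqr = q3 - q1                                   # spread of the middle 50 per cent
variance = statistics.pvariance(scores)         # average squared difference from the mean
std_dev = statistics.pstdev(scores)             # square root of the variance

print(value_range, iqr, variance, std_dev)
```

Different textbooks and packages compute quartiles slightly differently, so the IQR in particular may not match a hand calculation exactly.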
  • Stats Means Business

    Statistics and Business Analytics for Business, Hospitality and Tourism

    • John Buglear (Author)
    • 2019 (Publication Date)
    • Routledge (Publisher)
    bimodal – that is, it has two modes. If another person aged 32 joined the workforce, there would be three modes. The more modes there are, the less useful the mode is to use. Ideally, we want a single figure as a measure of location to represent a set of data.
    If you want to summarise a set of continuous data, using the mode is even more inappropriate; usually continuous data consist of different values, so every value would be a mode because it occurs as often as every other value. If two or more observations take exactly the same value it is a fluke.

    3.2.2 The Median

    Whereas you can only use the mode for some types of data, the second type of average or measure of location, the Median, can be used for any set of data.
    The Median is the middle observation in a set of data. We find the Median by first arranging the data in order of magnitude – that is, listed in order from the lowest to the highest values. Such a list is called an array. Each observation in an array may be represented by the letter ‘x’ and the position of the observation in the array is put in round brackets, for instance x(3) would be the third observation in the array and x(n) would be the last.

    Example 3.3

    Find the Median of the data in Example 3.1 .
    Array:
    Since there are 15 observations, the middle one is the 8th, the first 18, which is shown in bold type. There are seven observations to the left of it in the array and seven observations to the right of it.
    You can find the exact position of the Median in an array by taking the number of observations, represented by the letter n, adding one and then dividing by two.
    Median position = (n + 1) / 2
    In Example 3.3 there are 15 observations, that is, n = 15, so:
    Median position = ( 15 + 1 ) / 2 = 16 / 2 = 8
    The Median is in the 8th position in the array, in other words the 8th highest value, 18. The Median age of these workers is 18.
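The position formula can be turned into a short lookup. The array below is hypothetical (the ages from Example 3.1 are not reproduced in this excerpt); it was built only so that it has 15 observations with the first 18 in the 8th position. The handling of a fractional position, averaging the two neighbouring observations, is an assumption based on the usual convention for an even number of observations.

```python
import math

def median_position(n):
    """Position of the Median in an array of n observations: (n + 1) / 2."""
    return (n + 1) / 2

def median_from_array(array):
    """Look up the Median in an already ordered array using the position formula."""
    position = median_position(len(array))
    lower = array[math.floor(position) - 1]  # x(position) uses 1-based positions
    upper = array[math.ceil(position) - 1]   # same observation when position is whole
    return (lower + upper) / 2

# Hypothetical array of 15 ages, ordered from lowest to highest.
ages = [16, 16, 16, 17, 17, 17, 17, 18, 18, 19, 19, 20, 21, 22, 32]
print(median_position(len(ages)))
print(median_from_array(ages))
```

With 16 observations the position would be 8.5, and the sketch would average the 8th and 9th observations.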

    Example 3.4

    Find the Median of the data in Example 3.2
  • Statistics for Business
    Median of the incomes of 5 employees, $30, $35, $40, $45 and $50, is $40. If we change the value of the fifth item from $50 to $100, the Median is still $40.
    3.  The Median can be evaluated even if the data are incomplete.
    Example:
    For the preceding problem, it is possible to evaluate the Median, but the arithmetic mean cannot be evaluated.
    4.  The Median may be located when the items in a series cannot be measured quantitatively, like the fairness of the skin and intelligence.
    4.5.3.2      Relative Disadvantages
    1.  With the Medians of 2 groups, the overall Median cannot be evaluated.
    2.  If there is a high degree of variation among the data set, the Median cannot be viewed as a representative.
    Example:
    Median of 10, 20, 30, 100, 1000, 2000, and 3000 is 100, which is not a representative of the group.
    3.  It cannot be considered as a representative when there are a few items.
    Example:
    Given are the per day salary of a batch of employees in dollars. Find out the Median salary $28, $44, $50, $30, $22, $63, $58, $52, $60, $23, $32, $57, $62, $39, $24, $41, $31, $20, $61, $38, $59, $46, $48, $37, $45.
    The data type is DD. Place all the given 25 values in ascending order. 20, 22, 23, 24, 28, 30, 31, 32, 37, 38, 39, 41, 44, 45, 46, 48, 50, 52, 57, 58, 59, 60, 61, 62, 63. Select the middle item. Here it is the 13th item, which is 44. Hence, the Median is 44. The Median of the per day salary is $44.
    Example:
    Find the Median from the following table: The data type is DDF. Based on the given table, construct the cumulative frequency table.
    X ($)    f      Cumulative Frequency
    22       80     80
    27       166    246
    32       298    544
    37       507    1051
    42       605    1656
    47       700    2356
    52       630    2986
    57       450    3436
    62       190    3626
    Total    3626
    n / 2 = 3626 / 2 = 1813
    Here, the cumulative frequency just greater than 1813 is 2356.
    Hence, the Median is the value of X corresponding to the cumulative frequency 2356. The Median is $47.
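The cumulative frequency procedure above can be sketched as follows. The function name `median_from_frequencies` is an assumption; the rule it applies (take the first value whose cumulative frequency exceeds n / 2) is the one stated in the example, and the case where a cumulative frequency equals n / 2 exactly is not treated specially here.

```python
from itertools import accumulate

values = [22, 27, 32, 37, 42, 47, 52, 57, 62]                 # X ($)
frequencies = [80, 166, 298, 507, 605, 700, 630, 450, 190]    # f

def median_from_frequencies(values, frequencies):
    """Median of discrete data given as a frequency table."""
    cumulative = list(accumulate(frequencies))  # running totals of f
    half = cumulative[-1] / 2                   # n / 2
    for x, cf in zip(values, cumulative):
        if cf > half:  # first cumulative frequency just greater than n / 2
            return x

cumulative = list(accumulate(frequencies))
print(cumulative)
print(median_from_frequencies(values, frequencies))
```

Working from the table avoids expanding the data into 3626 individual observations before sorting.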
    Example:
    Evaluate the Median value, after classifying the data given into a continuous one with class length of 10 marks:
  • Social Statistics

    Managing Data, Conducting Analyses, Presenting Results

    • Thomas J. Linneman (Author)
    • 2021 (Publication Date)
    • Routledge (Publisher)
    If the variable is measured at the nominal level, the only center we can find is the mode. For example, if we were examining the types of schools students attend (public, private secular, private religious), the type of school with the highest frequency of children would be the mode. If the variable is measured at the ordinal level, there is a mode, but we can also find the Median. This is because the Median requires ordered data, and we can order the categories of an ordinal-level variable. For example, if we are using an ordinal-level measure of educational achievement (less than high school, high school, some college, college degree, graduate degree), we would be able to say, hypothetically, that the modal respondent graduated from high school and the Median respondent had some college. If we had a ratio-level variable, then we could find all three measures of the center: the mode, the Median, and the mean. For example, if we were measuring educational achievement in years, we might find that the mode was 12 years, the Median was 13 years, and the mean was 13.27 years of education. If you want to make sure that you have the ability to measure a variable at all three centers, then you have to make sure that you measure that variable at the ratio level.

    Close Relatives of the Median: Quartiles and Percentiles

    The Median, with half the cases above and half below, can be considered by another name: the 50th percentile: 50% of the cases are above and 50% are below. Often in the real world, you will hear other percentiles. A common use of percentiles in today’s world is those unsavory standardized tests. If you received your grades, and found out that you were at the 50th percentile, this would mean that 50% of the people who took the test scored lower than your score, and 50% of the people who took the test scored higher than your score. If you found out you were at the 93rd percentile, this would mean that 93% of the people who took the test scored lower than your score, and 7% of the people who took the test scored higher than your score. Although the 50th percentile is the most often used of the percentiles, people sometimes use the 25th percentile and 75th percentile. With these three percentiles, people sometimes organize their results into quartiles:
    • First quartile: from the lowest case to the 25th percentile
    • Second quartile: from the 26th percentile to the 50th percentile
    • Third quartile: from the 51st percentile to the 75th percentile
    • Fourth quartile: from the 76th percentile to the highest case
    This provides a nice way to convert a large set of values to a mere four categories. Let’s use the frequency distribution from the GSS Black women TV watching example from before. Here it is again, with the Cumulative Percent column added: