Mathematics

T-distribution

The T-distribution, also known as Student's t-distribution, is a probability distribution that is used in statistics. It is similar to the normal distribution but is better suited for smaller sample sizes. The shape of the t-distribution depends on the sample size, with smaller sample sizes resulting in heavier tails.

Written by Perlego with AI-assistance

11 Key excerpts on "T-distribution"

  • Statistics for the Behavioural Sciences

    An Introduction to Frequentist and Bayesian Approaches

    • Riccardo Russo(Author)
    • 2020(Publication Date)
    • Routledge
      (Publisher)
    The Student's t-distribution, and the application of the t-test to the above type of problem will also be considered. The t-test, and its associated t-distribution, are used to examine hypotheses about means when the standard deviation of the population of the individual observations is not known. In the majority of cases where means are compared, we do not know the population standard deviation, so it is estimated using the sampled data, and the t-test is an appropriate way to test hypotheses about the mean. Then, we will show how to use sample data to construct intervals that have a given probability of containing the true population mean, to calculate indexes of the size of the effect of the independent variable on the dependent variable, and to use this in performing statistical power analysis for the one-sample t-test.
    7.2 The sampling distribution of the mean and the Central Limit Theorem
    Imagine we know that the distribution of the population of individual scores in a manual dexterity test is normal with µ = 50 and σ = 8. Now imagine that we draw an infinite number of independent samples, each of 16 observations, from this population. We then record these means and plot their values. What would the distribution of these sample means look like? It turns out that the distribution of these sample means (also called the sampling distribution of the mean) is normal with a mean of 50 (i.e., equal to the mean of the distribution of the population of the individual scores; the mean of the sampling distribution of the mean is usually denoted μ_x̄) and a standard deviation of σ/√n = 8/√16 = 2 (note that the standard deviation of the sampling distribution of the mean is usually called the standard error of the mean and is denoted σ_x̄)
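The numbers in this excerpt (µ = 50, σ = 8, samples of n = 16, standard error 2) can be checked with a quick simulation. This sketch uses NumPy and is an illustration, not part of the book:

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma, n = 50, 8, 16  # population mean, population SD, and sample size from the excerpt

# draw many independent samples of size 16 and record each sample's mean
means = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)

print(means.mean())  # close to mu = 50
print(means.std())   # close to sigma / sqrt(n) = 8 / 4 = 2, the standard error
```

With enough samples, the empirical mean and standard deviation of the sample means converge on 50 and 2, matching the sampling-distribution result quoted above.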
  • Statistics For Dummies
    • Deborah J. Rumsey(Author)
    • 2016(Publication Date)
    • For Dummies
      (Publisher)
    t-distribution is typically used to study the mean of a population, rather than to study the individuals within a population. In particular, it is used in many cases when you use data to estimate the population mean — for example, to estimate the average price of all the new homes in California. Or when you use data to test someone’s claim about the population mean — for example, is it true that the mean price of all the new homes in California is $500,000?
    These procedures are called confidence intervals and hypothesis tests and are discussed in Chapters 13 and 14, respectively.
    The connection between the normal distribution and the t-distribution is that the t-distribution is often used for analyzing the mean of a population if the population has a normal distribution (or fairly close to it). Its role is especially important if your data set is small or if you don’t know the standard deviation of the population (which is often the case).
    When statisticians use the term t-distribution, they aren’t talking about just one individual distribution. There is an entire family of specific t-distributions, depending on what sample size is being used to study the population mean. Each t-distribution is distinguished by what statisticians call its degrees of freedom. In situations where you have one population and your sample size is n, the degrees of freedom for the corresponding t-distribution is n − 1. For example, a sample of size 10 uses a t-distribution with 10 − 1, or 9, degrees of freedom, denoted t₉ (pronounced tee sub-nine). Situations involving two populations use different degrees of freedom and are discussed in Chapter 15.

    Discovering the effect of variability on t-distributions

    t-
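The df = n − 1 rule above can be illustrated with SciPy (this example is not from the book): for a sample of size 10, probabilities come from the t-distribution with 9 degrees of freedom, whose tails are heavier than the standard normal's.

```python
from scipy import stats

n = 10
df = n - 1            # degrees of freedom for a one-sample problem
t9 = stats.t(df)      # the "t sub-nine" distribution from the excerpt

# 97.5th percentile of t9 vs. the standard normal: t needs a larger cutoff
print(t9.ppf(0.975))          # about 2.262
print(stats.norm.ppf(0.975))  # about 1.960
```

The larger t cutoff reflects the extra uncertainty from estimating the standard deviation with only 10 observations.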
  • New Statistical Procedures for the Social Sciences

    Modern Solutions To Basic Problems

    o )/s, and it is called Student's t distribution. (Gosset published his results under the name “Student.”) As will be seen, Student's t distribution plays an important role in many statistical procedures.
    Theorem 6.2.1. If X₁, …, Xₙ is a random sample from a normal distribution, and if X̄ and s² are the resulting sample mean and sample variance, then X̄ and s² are independent random variables.
    The validity of Theorem 6.2.1 is far from obvious. Indeed you would expect it to be false, since s² = Σ(Xᵢ − X̄)²/(n−1), which is an expression involving X̄. In fact, in general, s² and X̄ are dependent, but when normality is assumed they are independent.
    Definition 6.2.1. Let Z be a standard normal random variable, and let Y be a chi-square random variable with v degrees of freedom, where Z and Y are independent of one another. The distribution of
    T = Z / √(Y/v)
    is called a Student's t distribution with v degrees of freedom. (For convenience the subscript v will usually not be written.) The range of possible values of T is from −∞ to ∞. In addition, the distribution is symmetric about the origin, and its shape is very similar to the standard normal distribution. Percentage points of this distribution are given in Table A14. For instance, if v = 10, Pr(T₁₀ ≤ 1.812) = .95. Because the distribution is symmetric about the origin, Pr(T ≤ −t) = Pr(T ≥ t).
    If X₁, …, Xₙ is a random sample from a normal distribution, it has already been explained that √n(X̄ − µ)/σ has a standard normal distribution, and (n−1)s²/σ² has a chi-square distribution with v = n−1 degrees of freedom. Also, these two quantities are independent of one another, and so
    T = [√n(X̄ − µ)/σ] / √[((n−1)s²/σ²)/(n−1)] = √n(X̄ − µ)/s    (6.2.1a)
    has a Student's t distribution with v degrees of freedom. Note that the right side of (6.2.1a) is just √n (
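Definition 6.2.1 and the table value quoted above can be checked numerically. This SciPy/NumPy sketch is an illustration, not part of the text:

```python
import numpy as np
from scipy import stats

v = 10

# Table value quoted in the excerpt: Pr(T10 <= 1.812) = .95
print(stats.t.cdf(1.812, df=v))  # about 0.95

# Symmetry about the origin: Pr(T <= -t) = Pr(T >= t)
t_val = 1.5
print(stats.t.cdf(-t_val, v))
print(stats.t.sf(t_val, v))      # the two agree

# Build T = Z / sqrt(Y / v) directly: Z standard normal, Y chi-square(v), independent
rng = np.random.default_rng(1)
z = rng.standard_normal(200_000)
y = rng.chisquare(v, 200_000)
t_samples = z / np.sqrt(y / v)
print(np.mean(t_samples <= 1.812))  # also about 0.95
```

The Monte Carlo construction from Z and Y reproduces the same percentage point, confirming the definition.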
  • Calculus and Statistics
    t; n). Given that t is a variate with density function f(t; n) and the entire set of real numbers as admissible range, find the mean, mode, and median of t. Prove that the distribution of t is symmetric with respect to 0. Using Table 4 find the value of t asked for, and the values of b in (d) and (e).
    a)
    b)
    c)
    d) Find b such that P(−b ≤ t ≤ b) = 0.95 if t has density function f(t; 11).
    e) Find b such that P(0 ≤ t ≤ b) = 0.475, where the density function of t is f(t; 15).
    4 . Twelve randomly chosen tomatoes are weighed; it is found that the mean weight is 6 ounces with a variance of 9. Find the probability that the true mean weight of tomatoes lies between 5 and 7 ounces. Find b such that there is a 0.95 probability that the mean weight of tomatoes lies between 6 – b and 6 + b.
    5 . Twenty-five ball bearings are found to have an average radius of 5.001 inches with a variance of 0.0025. What is the probability that the mean radius is really less than 5? Compute this probability using both the t- and normal distributions. Compare your results.
    6 . A random sample of 10 elements from a certain population has a mean of 15 and a variance of 16.
    a) Find b such that there is a 0.95 probability that the population mean lies between 15 – b and 15 + b.
    b) Find the b asked for in (a), assuming that the sample contains 15 observations (and all other factors remain unaltered).
    c) Find the b asked for in (a), assuming that the sample contains 20 observations. As the number of elements in the sample increases, what happens to b ?
    10.3 MORE ABOUT THE t-DISTRIBUTION
    The t-distribution can also be used in testing whether or not two populations have the same mean, provided certain conditions are satisfied and the samples are small enough to warrant the use of the t
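Problem 4 above (n = 12 tomatoes, sample mean 6 ounces, variance 9) can be worked with SciPy. This sketch shows one way to do it and is not the book's own solution:

```python
import math

from scipy import stats

n, xbar, var = 12, 6.0, 9.0
se = math.sqrt(var / n)  # estimated standard error of the mean
df = n - 1

# P(5 < mu < 7): probability that T = (xbar - mu)/se lies within +/- (1 / se)
t_stat = (7 - xbar) / se
prob = stats.t.cdf(t_stat, df) - stats.t.cdf(-t_stat, df)
print(prob)

# b such that P(6 - b < mu < 6 + b) = 0.95: a 95% interval half-width
b = stats.t.ppf(0.975, df) * se
print(b)  # about 1.91
```

The half-width b multiplies the t critical value for 11 degrees of freedom by the estimated standard error, exactly the construction the exercises ask for.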
  • Essential Mathematics and Statistics for Forensic Science
    • Craig Adam(Author)
    • 2011(Publication Date)
    • Wiley
      (Publisher)
    t-test for the statistical comparison of experimental data, which will be discussed later in the chapter.
    9.1 The normal distribution
    We saw in Section 6.3 how frequency data, represented by a histogram, may be rescaled and interpreted as a probability density histogram. This continuous function evolves as the column widths in the histogram are reduced and the stepped appearance transforms into a smooth curve. This curve may often be described by some mathematical function of the x -axis variable, which is the probability density function. Where the variation around the mean value is due to random processes this function is given by an exact expression called the normal or Gaussian probability density function. The obvious characteristic of this distribution is its symmetric “bell-shaped” profile centred on the mean. Two examples of this function are given in Figure 9.1 .
    Interpreting any normal distribution in terms of probability reveals that measurements around the mean value have a high probability of occurrence while those further from the mean are less probable. The symmetry implies that results greater than the mean will occur with equal frequency to those smaller than the mean. On a more quantitative basis, the width of the distribution is directly linked to the standard deviation as illustrated by the examples in the figure. To explore the normal distribution in more detail it is necessary to work with its mathematical representation, though we shall see later that to apply the distribution to the calculation of probabilities does not normally require mathematics at this level of detail. This function is given by the following mathematical expression, which describes how it depends on both the mean value μ and the standard deviation σ:
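The excerpt is cut off before the expression itself appears; the Gaussian probability density function it refers to has the standard form (supplied here for completeness, not quoted from the book):

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
```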
  • Multivariate Statistical Modeling in Engineering and Management
    • Jhareswar Maiti(Author)
    • 2022(Publication Date)
    • CRC Press
      (Publisher)
  • Often, we encounter non-normal populations and/or populations with unknown standard deviation. In such situations, the central limit theorem provides a reasonable approximation to the distribution of large-sample statistics.
  • The estimation of population parameters comprises computation of point estimate followed by interval estimation with the help of appropriate sampling distribution. Depending on the availability of information about population distribution, population standard deviation and sample size, different scenarios for estimation will arise (see Section 2.6 for details).
  • Equality of two population means or of two population variances is also of interest in various situations. Depending on the availability of information about the population distribution, the population standard deviation, and the sample size, different scenarios will arise, and interval estimation is made using the appropriate sampling distribution (see Section 2.6 for details).
  • A hypothesis is a statement that is yet to be proven. In inferential statistics, a statement about population parameters is investigated through a hypothesis test. The procedure involves (i) setting up the null (H0) and alternative (H1) hypotheses, (ii) finding the appropriate test statistic and its sampling distribution, (iii) computing the test statistic value and comparing it with the threshold (theoretical) value for a given significance level (say α = 0.05), and (iv) making a decision about the null hypothesis (H0).
  • In hypothesis testing, there are three decision-making scenarios: (i) right decision: acceptance of H0 when it is true or rejection of H0 when it is false, (ii) type-I error (α): rejection of H0 when it is true, and (iii) type-II error (β): acceptance of H0
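The four-step procedure above can be sketched in Python. This is an illustrative one-sample t-test with made-up summary numbers, not an example from the book:

```python
import math

from scipy import stats

# (i) hypotheses: H0: mu = 100 vs. H1: mu != 100 (illustrative values)
mu0, alpha = 100.0, 0.05

# hypothetical sample summary: size, sample mean, sample standard deviation
n, xbar, s = 25, 104.0, 10.0

# (ii)-(iii) test statistic and its threshold from the t sampling distribution
t_stat = (xbar - mu0) / (s / math.sqrt(n))
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

# (iv) decision about H0: reject only if the statistic exceeds the threshold
reject = abs(t_stat) > t_crit
print(t_stat, t_crit, reject)
```

With these numbers the statistic (2.0) falls just short of the two-sided threshold for α = 0.05, so H0 is not rejected.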
  • Statistics from A to Z

    Confusing Concepts Clarified

    • Andrew A. Jawlik(Author)
    • 2016(Publication Date)
    • Wiley
      (Publisher)
    t – The Test Statistic and Its Distributions

    Summary of Keys to Understanding

    1. t is a Test Statistic used in tests involving the difference between two Means.
    2. t is a measure of how likely it is that a difference in Means is Statistically Significant.
    3. There is not one t-Distribution, but a different t-Distribution for each value of the Degrees of Freedom, df. As df grows larger, the t-Distribution approaches the z-(Standard Normal) Distribution.
    4. t has a number of similarities to z and some key differences.
    5. Use t instead of z when
      • – the Standard Deviation (Sigma, σ) of the Population or Process is unknown
      • – or the Sample size is small (n < 30)

    Explanation

    1. t is a Test Statistic used in tests involving the difference between two Means.
    t is also known as “Student's t” or the “t-statistic.”
    Test Statistic
    A Statistic is a property of a Sample. A Test Statistic is one that has an associated Probability Distribution (or associated family of Probability Distributions). So, for any value of the Test Statistic, we can determine the Probability of that value. More importantly, we know the Cumulative Probability of all values greater or less than that particular value. This is an essential part of Inferential Statistics, in which we estimate (infer) a Parameter (e.g., the Mean or Standard Deviation) of a Population or Process based on the corresponding Statistic of a Sample. Common Test Statistics are t, z, F, and χ2 (Chi-Square).
    the difference between two Means.
    One Mean is always the Mean of a Sample.
    The Second Mean can be either
    • – A specified Mean, such as a target Mean, a historical Mean or an estimate, or
    • – The Mean of a Sample from a different Population or Process than the first Mean, or
    • – A second Mean from the same test subjects (e.g., before and after some event).
    These three different types of the second Mean correspond to three different t-tests. See the articles t-tests – Part 1 and Part 2.
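The three kinds of "second Mean" above correspond to the one-sample, two-sample, and paired t-tests, and SciPy exposes all three. This sketch uses made-up data, not an example from the book:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a = rng.normal(10, 2, 30)  # sample from one process
b = rng.normal(11, 2, 30)  # sample from a second process

# second Mean is a specified target: one-sample t-test
res1 = stats.ttest_1samp(a, popmean=10)

# second Mean comes from a different Population or Process: two-sample t-test
res2 = stats.ttest_ind(a, b)

# second Mean comes from the same test subjects (before/after): paired t-test
res3 = stats.ttest_rel(a, b)

print(res1.pvalue, res2.pvalue, res3.pvalue)
```

Each call returns the t statistic together with its p-value under the matching t-distribution.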
  • U Can: Statistics For Dummies
    • Deborah J. Rumsey, David Unger(Authors)
    • 2015(Publication Date)
    • For Dummies
      (Publisher)
    t-table.)
    Remember:
    It doesn’t take a super-large sample size for the values on the t-distribution to get close to the values on a Z-distribution. For example, when n = 31 and df = 30, the values in the t-table are already quite close to the corresponding values on the Z-table.
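The claim that df = 30 is already close to the Z-table can be checked directly. This SciPy comparison is an illustration, not from the book:

```python
from scipy import stats

z975 = stats.norm.ppf(0.975)  # 1.96, the familiar Z-table value

# compare the t critical value with z as degrees of freedom grow
for df in (5, 30, 100):
    t975 = stats.t.ppf(0.975, df)
    print(df, round(t975, 3), round(t975 - z975, 3))
# by df = 30 the t value (about 2.042) is already close to z = 1.96
```

The gap shrinks steadily with df, which is why large-sample t procedures and z procedures give nearly identical answers.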
    Chapter 12

    Sampling Distributions and the Central Limit Theorem

    In This Chapter
    Understanding the concept of a sampling distribution
    Putting the Central Limit Theorem to work
    Determining the factors that affect precision
    When you take a sample of data, it’s important to realize the results will vary from sample to sample. Statistical results based on samples should include a measure of how much those results are expected to vary. When the media reports statistics like the average price of a gallon of gas in the U.S. or the percentage of homes on the market that were sold over the last month, you know they didn’t sample every possible gas station or every possible home sold. The question is, how much would their results change if another sample were selected?
    This chapter addresses this question by studying the behavior of means for all possible samples and the behavior of proportions from all possible samples. By studying the behavior of all possible samples, you can gauge where your sample results fall and understand what it means when your sample results fall outside of certain expectations.

    Defining a Sampling Distribution

    A random variable is a characteristic of interest that takes on certain values in a random manner. For example, the number of red lights you hit on the way to work or school is a random variable; the number of children a randomly selected family has is a random variable. You use capital letters such as X or Y to denote random variables and you use small case letters x or y to denote actual outcomes of random variables. A distribution is a listing, graph, or function of all possible outcomes of a random variable (such as X ) and how often each actual outcome (x ), or set of outcomes, occurs. (See Chapter 9
  • Introduction to Statistics for Forensic Scientists
    • David Lucy(Author)
    • 2013(Publication Date)
    • Wiley
      (Publisher)
    4 The normal distribution
    In Section 3.2 we saw how the binomial distribution could be used to calculate probabilities for specific outcomes for runs of events based upon either a known probability, or an observed probability, for a single event. We also saw how an empirical probability distribution can be treated in exactly the same way as a modelled distribution. Both these distributions were for discrete data types, or for continuous types made into discrete data. In this section we deal with the normal distribution, which is a probability distribution applied to continuous data.
    4.1 The normal distribution
    The normal distribution is possibly the most commonly used continuous distribution in statistical science. This is because it is a theoretically appealing model to explain many forms of natural continuous variation. Many of the discrete distributions may be approximated by the normal distribution for large samples. Most continuous variables, particularly from biological sciences, are distributed normally, or can be transformed to a normal distribution. Imagine a continuous random variable such as the length of the femur in adult humans. The mean length of this bone is about 400 mm, some are 450 mm and some are 350 mm, but there are not many in either of these categories. If the distribution is plotted then we expect to see a shape with its maximum height at about 400 mm tailing off to either side. These shapes have been plotted for both the adult human femur and adult human tibia in Figure 4.1. The tibia in any individual is usually shorter than the femur; however, Figure 4.1 tells us that some people have tibias which are longer than other people’s femurs. Notice how the mean of tibia measurements is shorter than the mean of the femur measurements.
  • Understanding Statistics
    • Bruce J. Chalmer(Author)
    • 2020(Publication Date)
    • CRC Press
      (Publisher)
    4

    Some Distributions Used in Statistical Inference

    4.1    Knowing the sampling distribution of a statistic allows us to draw inferences from sample data.

    Distributions in statistical inference

    We noted in Chapter 2 that knowing the sampling distribution of a statistic is the key to using the statistic to draw inferences about the parameter of interest. We noted also that the central limit theorem assures us that statistics calculated by summing or averaging have (at least approximately) normal sampling distributions.
    In this chapter we discuss the details of how to use the normal distribution to find the proportion of individual scores in any given interval. This will lay the groundwork for the inferential procedures of estimation and hypothesis testing which we cover in Chapter 5 . We also discuss the binomial distribution, which is another important distribution used in statistical inference. We conclude the chapter with a description of the relationship between the binomial and normal distributions.

    4.2    The standard normal distribution is used to find areas under any normal curve.

    Characteristics of normal distributions

    In Chapter 3 we noted that the mean and standard deviation completely specify a normal distribution. That is, once you know the mean and standard deviation of a distribution known to be normal in shape, you can say exactly what proportion of scores in the distribution are in any given range. Let’s consider how this is done.
    First, it is handy to consider some general characteristics. (In fact, you will find it convenient to memorize these characteristics of a normal distribution, since you will be using them very frequently.) Refer to Figure 4.1
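The idea that the mean and standard deviation pin down every such proportion can be shown with SciPy. The distribution below is hypothetical, chosen only for illustration:

```python
from scipy import stats

mu, sigma = 100, 15  # a hypothetical normal distribution
dist = stats.norm(mu, sigma)

# proportion of scores between 85 and 115 (within one SD of the mean)
print(dist.cdf(115) - dist.cdf(85))  # about 0.683
```

Any interval's proportion follows the same way: once µ and σ are known, two CDF evaluations give the exact area under the curve.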
  • Statistics

    The Essentials for Research

    11
    The t Distribution
    In previous chapters we have discussed methods for determining the probability of obtaining any particular proportion of events in a sample, given the sample size and the proportion of events in the population. We also discussed methods for determining the probability that two or more randomly drawn, or two correlated sample proportions came from a common population. In each case we defined two or more mutually exclusive categories, counted the observations classified in these categories, and then used tests of significance based on these enumerations.
    When true measurement is possible the statistician has considerably more powerful tests of significance available than those we previously discussed. These are more powerful tests in part because they make use of the greater amounts of information provided when we measure the characteristics of objects rather than count objects that possess certain characteristics. For example, we can count the number of children finishing a problem in less than 5 seconds and report that .40 of the group met this standard, or we can measure the time taken by each child and report a group mean of 8.17 seconds. We obviously receive more information if we know that a child took 8.17 seconds to solve a problem than if we know he required “more than 5 seconds.” This increase in information permits more powerful tests of significance than those we discussed in previous chapters.
    11.1 Hypotheses about the Population Mean
    Suppose we have an infinitely large population of measurements and we want to know if it is reasonable to assume that the population mean (μ_H) is 100. If the population is infinitely large, we cannot compute the population mean, but we can draw a sample of n cases and compute a sample mean X̄.
    If we continue to select samples and calculate sample means, we can construct a sampling distribution of sample means. This process was discussed briefly in Chapter 7, and Table 7.1