PART ONE
Introduction
The first three chapters contain a variety of important introductory material.
Chapter 1 begins with some typical multivariate problems, and discusses ways of tackling them. The chapter also includes a section revising the more important results in matrix algebra.
Chapter 2 introduces the reader to multivariate probability distributions.
Chapter 3 shows how to carry out a preliminary analysis of a set of data. In particular, the chapter discusses how to feed the data into a computer without making too many errors, and also discusses the calculation of summary statistics such as means, standard deviations and correlations.
CHAPTER ONE
Introduction
Multivariate data consist of observations on several different variables for a number of individuals or objects. Data of this type arise in all branches of science, ranging from psychology to biology, and methods of analysing multivariate data constitute an increasingly important area of statistics. Indeed, the vast majority of data are multivariate, although introductory statistics courses naturally concentrate on the simpler problems raised by observations on a single variable.
1.1 Examples
We begin with some examples of multivariate data.
(a) Exam results. When students take several exams, we obtain for each student a set of exam marks as illustrated in Table 1.1. Here the 'variables' are the different subjects and the 'individuals' are the students.
The analysis of data of this type is usually fairly simple. Averages are calculated for each variable and for each individual. The examiners look at the column averages in order to see if the results in different subjects are roughly comparable, and then examine the row averages in order to rank the individuals in order of merit. This ranking is usually achieved by simply ordering the average marks of the students, though some more complicated averaging procedure is sometimes used. If the results for one exam appear to be out of line with the remainder, then these marks may be adjusted by the examiners. For example, in Table 1.1 the mathematics average is somewhat low and it might be deemed fair to scale the mathematics marks up in some way.
Table 1.1 Some typical exam results
Although the above analysis is fairly trivial, it does illustrate the general point that multivariate analyses are often concerned with finding relationships, not only between variables, but also between individuals.
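The row and column averages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marks are invented and are not the values of Table 1.1, the function names are hypothetical, and the multiplicative rescaling is just one possible way of scaling a subject 'up in some way'.

```python
from statistics import mean

# Hypothetical marks (NOT the values of Table 1.1).
# Rows are students (individuals); columns are subjects (variables).
subjects = ["History", "Physics", "Mathematics"]
marks = {
    "Student A": [60, 70, 50],
    "Student B": [80, 66, 40],
    "Student C": [40, 50, 30],
}


def column_averages(marks):
    """Average mark for each subject (one value per column)."""
    return [mean(column) for column in zip(*marks.values())]


def row_averages(marks):
    """Average mark for each student (one value per row)."""
    return {name: mean(row) for name, row in marks.items()}


def rank_by_average(marks):
    """Students ordered from highest to lowest average mark."""
    averages = row_averages(marks)
    return sorted(averages, key=averages.get, reverse=True)


def scale_subject(marks, index, target_mean):
    """Rescale one subject so that its average equals target_mean.

    Assumption: a simple multiplicative adjustment; examiners may
    well prefer some other rule.
    """
    factor = target_mean / column_averages(marks)[index]
    return {
        name: [m * factor if i == index else m for i, m in enumerate(row)]
        for name, row in marks.items()
    }


subject_means = column_averages(marks)    # compare subjects
order_of_merit = rank_by_average(marks)   # rank individuals
adjusted = scale_subject(marks, subjects.index("Mathematics"), 60)
```

In this made-up data the mathematics average is the lowest of the three subjects, so `scale_subject` raises every mathematics mark by the same factor until the subject average reaches the chosen target of 60.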
A more sophisticated analysis might try to identify which students did particularly badly or particularly well, and also see whether results in different subjects are correlated. For example, do students who perform above average in science subjects tend to perform below average in arts subjects (i.e., is there negative correlation)?
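As a rough illustration of the correlation question, the sketch below computes the ordinary product-moment (Pearson) correlation coefficient between the marks in two subjects. The marks are invented for the purpose, and the function name `pearson` is hypothetical; a negative value would suggest that students who do well in one subject tend to do less well in the other.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation coefficient between two lists of marks."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical marks for five students in one science and one arts subject.
physics = [72, 55, 64, 40, 81]
english = [48, 66, 59, 75, 50]

r = pearson(physics, english)
print("negative correlation" if r < 0 else "no negative correlation")
```

With only five invented students this says nothing about real exam results; it simply shows how the sign of the coefficient answers the question posed above.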
(b) A nutritional study. A survey was recently carried out on children in Nepal to examine the use of three measured variables in assessing nutritional status. The variables were height, weight and middle-upper-arm circumference (MUAC for short). The data also recorded the sex of each child (coded as 1 for males and 2 for females), the age of each child, and the social caste of each child (coded from...