The science of statistics supports making better decisions in business, economics, and other disciplines. In addition, it provides necessary tools to summarize data, analyze data, and make meaningful conclusions to arrive at these better decisions. This book is a bit different from most statistics books in that it will focus on using data to help you do business: the focus is not on statistics for statistics' sake but rather on what you need to know to really use statistics.
To help in that endeavor, the examples use the R programming language, which was created specifically to give statisticians and technicians a tool for applying standard techniques as well as building custom solutions.
Statistics is a word originating from the Italian word stato, meaning "state," and a statista is an individual saddled with the tasks of the state. Thus, statistics is the collection of information useful to the statista. Its application began in Italy during the 16th century and spread to other countries of the world. At present, statistics covers a wide range of information in every aspect of human activity. It is not limited to the collection of numerical information but includes data summarization, analysis, and presentation in meaningful ways.
Statistical analysis is mainly concerned with how to make generalizations from data. Statistics is a science that deals with information. In order to perform statistical analysis on information (data) you have on hand or that you collect, you may need to transform the data or work with it to get it into a form that can be analyzed using statistical techniques. Information can be found in qualitative or quantitative form. To explain the difference between these two types of information, let's consider an example. Suppose an individual intends to start a business based on the information in Table 1.1. Which of the variables are quantitative and which are qualitative? The product price is a quantitative variable because it provides information based on quantity, namely the product price in dollars. The number of similar businesses and the rent for business premises are also quantitative variables. The location used in establishing the business is a qualitative variable since it provides information about a quality (in this case a location, Nigeria or South Korea). The presence of basic infrastructure requires a Yes or No response, so it is also a qualitative variable.
Table 1.1: Business feasibility data.
| Product price | Number of similar businesses | Rent for the business premises | Location | Presence of basic infrastructure |
| --- | --- | --- | --- | --- |
| US$150 | 6 | US$2000 | Nigeria | No |
| US$100 | 18 | US$3000 | South Korea | Yes |
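As a minimal sketch in R, the two rows of Table 1.1 can be stored in a data frame, with quantitative variables held as numeric columns and qualitative variables held as factors. The object name `feasibility` and the column names are assumptions chosen for illustration; the values are those shown in the table.

```r
# Table 1.1 as a data frame (object and column names are illustrative).
feasibility <- data.frame(
  product_price      = c(150, 100),    # US$
  similar_businesses = c(6, 18),       # count
  rent               = c(2000, 3000),  # US$
  location           = factor(c("Nigeria", "South Korea")),
  infrastructure     = factor(c("No", "Yes"))
)

# Numeric columns are treated as quantitative, factor columns as qualitative.
is_quantitative   <- sapply(feasibility, is.numeric)
quantitative_vars <- names(feasibility)[is_quantitative]
qualitative_vars  <- names(feasibility)[!is_quantitative]

print(quantitative_vars)
print(qualitative_vars)

# Averaging is meaningful for a quantitative variable such as price.
mean_price <- mean(feasibility$product_price)
print(mean_price)
```

Storing the qualitative columns as factors is one reasonable design choice: R then refuses to average them by accident and treats their values as category labels.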
A quantitative variable represents a number for which arithmetic operations such as averaging make sense. A qualitative (or categorical) variable places each observation into a category rather than measuring an amount. Where numbers are used to separate the categories of a qualitative variable, the assigned numbers are arbitrary, although they are usually chosen deliberately. Statistics is concerned with measurements, some quantitative and others qualitative. For a quantitative variable, a measurement provides the actual numerical value of the variable. Qualitative variables can be represented with numbers as well; such a representation is arbitrary but can be useful for the purposes at hand. For instance, you can assign numeric codes to the values of a qualitative variable, such as Nigeria = 1 and South Korea = 0.
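The coding described above (Nigeria = 1, South Korea = 0) can be sketched in R as follows. The vector names and the alternative coding (10 and 20) are assumptions introduced only for illustration. The sketch suggests why arithmetic on such codes should be read with care: the result depends on the arbitrary codes chosen, whereas the count of observations per category does not.

```r
# The two locations from Table 1.1.
location <- c("Nigeria", "South Korea")

# Coding from the text: Nigeria = 1, South Korea = 0.
codes_a <- ifelse(location == "Nigeria", 1, 0)

# An equally arbitrary alternative coding (assumed for illustration):
# Nigeria = 10, South Korea = 20.
codes_b <- ifelse(location == "Nigeria", 10, 20)

# The "average" changes with the coding, so it is not a property of the data.
mean_a <- mean(codes_a)
mean_b <- mean(codes_b)
print(c(mean_a, mean_b))

# Category counts are the same whichever codes are used.
counts <- table(location)
print(counts)
```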
1.1 Scales of Measurement
In order to use statistics effectively, it is helpful to view data in a slightly different way so that it can be analyzed successfully. In performing a statistical test, it is important that all data be expressed in the same units. For example, if some of your quantitative data is in meters and some in feet, you need to choose one unit and convert all of the data to it. In statistics, however, the term scales of measurement has another meaning. Scales of measurement are commonly classified into four categories, namely: nominal scale, ordinal scale, interval scale, and ratio scale.
- Nominal Scale: In this scale, numbers are simply applied to label groups or classes. For instance, if a dataset records each person as male or female, we may assign a number to each category, such as 1 for male and 2 for female. In this situation, the numbers 1 and 2 merely denote the category to which a data point belongs. The nominal scale of measurement is applied to qualitative data such as gender, geographic classification, race classification, and so on.
- Ordinal Scale: This scale allows data elements to be ordered based on their relative size or quality. For example, buyers can rank three products by assigning them 1, 2, and 3, where 3 is the best and 1 is the worst. The ordinal scale does not provide information on how much better one product is compared to others, only that it is better. This scaling is used for many purposes, such as grading, either stanine (1 to 9) or A to F (no E), where a 4.0 is all As and all Fs are 0.0; or rankings, such as on Amazon (1 to 5 stars, so that 5.0 would be the highest ranking), or on restaurants, hotels, and other data sources which may be ranked on a four- or five-star basis. It is therefore quite important to know the data range when using this data. The availability of ...