Working With Sample Data

  1. 162 pages
  2. English

About This Book

Managers and analysts routinely collect and examine key performance measures to better understand their operations and make good decisions. Being able to render the complexity of operations data into a coherent account of significant events requires an understanding of how to work well with raw data and to make appropriate inferences. Although some statistical techniques for analyzing data and making inferences are sophisticated and require specialized expertise, there are methods that can be understood and applied by anyone with basic algebra skills and the support of a spreadsheet package. By applying these fundamental methods themselves rather than turning over both the data and the responsibility for analysis and interpretation to an expert, managers will develop a richer understanding and potentially gain better control over their environment. This text is intended to describe these fundamental statistical techniques to managers, data analysts, and students. Statistical analysis of sample data is enhanced by the use of computers. Spreadsheet software is well suited for the methods discussed in this text. Examples in the text use Microsoft Excel. From the Business Expert Press website, readers will have access to the example workbooks and to Adobe Flash videos illustrating key steps in Microsoft Excel.

Working With Sample Data is by Priscilla Chaffe-Stengel and is available in PDF and/or ePUB format. It is catalogued under Business & Decision Making.

Information

Year
2011
ISBN
9781606492147
Chapter 1
Depicting Data in Telling Ways
Data flow from the pulse of business: your operations, your business, your market, your industry. Managers and analysts routinely collect and examine key performance measures to better understand their operations and make good decisions. Statistics is the study of principles and methods for exploring data and drawing sound conclusions.
Developing a full and articulated understanding of key performance measures often begins with a descriptive summary. Descriptive summaries can be developed graphically, which we address here, or numerically, which we develop in the next chapter.
Descriptive Narration
Graphic summaries provide a picture of key data that can be used to elicit important questions, fuel understanding, and facilitate communication. Important data tell stories worth listening to and worthy of retelling. Good graphics capture the detail in the data as well as the overview of their story. Data arise in complex environments, so their summaries should depict that complexity. Good graphics present the actual data and show causality, multiple comparisons, multiple perspectives, the effects of the processes that lead to the data's creation, or the effects of subsequent changes made to those processes. They should visually reinforce the reason the data are of significance and integrate number, word, and illustration.
Complexity is difficult to display, for obvious reasons. Still, complexity can be thoughtfully captured in sequenced layers that allow the content to unfold, inviting interpretation and developing its meaning in the process. A rich and insightful discussion of graphic excellence can be found in any of the works by Edward Tufte (http://www.edwardtufte.com).
Data are generated either by counts or by measurements. Count data arise in settings where sampled elements are classified by a key attribute and assigned to categories, such as defective versus acceptable products, commodities whose prices rose as opposed to those whose prices held steady or fell, or the number of sales made using credit rather than debit card, check, or cash. These are qualitative, or categorical, variables, and their summaries are handled differently from those of quantitative variables. Quantitative variables capture data arising from measurements that include the dimensions of time, distance, length, volume, rates, percent, and monetary value. Some qualitative sample data can, under certain circumstances, be converted to rates or percents and can then be treated as quantitative sample data. Conversely, quantitative data can be assigned to intervals of measurement values that serve as categories and can then be treated using methods appropriate to qualitative data.
Summarizing Qualitative Variables
Qualitative variables can answer the question: how many or how frequently? Qualitative sample data are discrete counts of elements in separate categories. Some qualitative variables have only two categories (for example, defective or acceptable); others have multiple categories (for example, numbers of sales by geographic region within the service area).
Summarizing sample data for qualitative variables by category can be achieved in a column chart, where the horizontal axis marks each category and a column rises over the category to a level marked on the vertical axis by the count of elements in each category. Columns in the graph do not touch across the categories to convey the fact that there is not an implied continuum or even necessarily an order in which the categories are listed across the horizontal axis. See Figure 1.1 for an example of a column chart for the number of nonfatal occupational injuries and illnesses involving day(s) away from work in 2008 as reported by the U.S. Bureau of Labor Statistics.
Figure 1.1. An Example of a Column Chart: Number of Injuries and Illnesses in the United States in 2008 (http://www.bls.gov/news.release/osh2.t01.htm)
While the data reported in Figure 1.1 are accurate, they are one dimensional. They do not invite insight, comparison, or perspective. The data would be more telling if presented as an incidence rate on a common basis. See Figure 1.2.
Figure 1.2. An Example of a Column Chart: Incidence Rate of Injuries and Illnesses in the United States per 10,000 Workers in 2008 (http://www.bls.gov/news.release/osh2.t01.htm)
Even though the data reported in Figure 1.2 are also one dimensional, they are reported as a rate applied to a common basis of 10,000 workers, which invites comparison and some insight.
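The conversion from a raw count to a rate on a common basis is simple arithmetic. The following is a minimal Python sketch of the general idea only; the function name `incidence_rate` and the figures are hypothetical and are not taken from the BLS table, and the BLS applies its own published method when computing its rates.

```python
def incidence_rate(cases, workers, basis=10_000):
    """Return the number of cases per `basis` workers."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    # Multiply before dividing so whole-number inputs stay exact.
    return cases * basis / workers

# Hypothetical figures for illustration only (not BLS data).
rate = incidence_rate(cases=1_200, workers=100_000)
print(rate)  # 120.0 cases per 10,000 workers
```

Two groups of very different sizes can then be compared directly, because both rates share the same basis.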
Sometimes data can be split on an additional dimension and summarized in a stacked or side-by-side column chart, as shown in Figures 1.3 and 1.4.
Figure 1.3. An Example of a Stacked Column Chart, Comparison of the Numbers of Injuries and Illnesses by Occupations and Industry in the United States, 2008 (http://www.bls.gov/news.release/osh2.t01.htm)
Figure 1.4. An Example of a Side-by-Side Column Chart, Comparative Incidence Rates per 10,000 Workers for Injuries and Illnesses in the United States, 2008 (http://www.bls.gov/news.release/osh2.t01.htm)
The introduction of an additional dimension leads us to recognize the low numbers of injuries and illnesses reported among workers producing goods in the state and local government sectors. Given that state and local government workers are largely in the services sector, however, these results are not terribly surprising.
With the conversion of data to a comparable basis per 10,000 workers shown in Figure 1.4, the introduction of the second dimension of occupation generates an interesting insight: workers in the goods producing occupations within local governments registered an injury/illness incidence rate
  1. 88% higher than workers in comparable occupations within state governments:
    Rate of injury/illness, goods producing, local versus state government
    = (local government rate) / (state government rate)
    = 1.876, or approximately 88% higher in local government than state government, and
  2. 160% higher than workers in comparable occupations within private industry:
    Rate of injury/illness, goods producing, local government versus private industry
    = (local government rate) / (private industry rate)
    = 2.604, or approximately 160% higher in local government than private industry.
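The step from a ratio of two rates to a "percent higher" statement can be checked with a short Python sketch. The two ratios are the ones quoted above; the helper name `percent_higher` is an assumption introduced here for illustration.

```python
def percent_higher(ratio):
    """Convert a ratio of two rates into 'percent higher than the baseline'."""
    return (ratio - 1.0) * 100.0

# Ratios as reported in the text for goods producing occupations.
comparisons = [
    ("local vs. state government", 1.876),
    ("local government vs. private industry", 2.604),
]
for label, ratio in comparisons:
    print(f"{label}: {percent_higher(ratio):.1f}% higher")
```

A ratio of 1.876 means the local government rate is 187.6% of the state government rate, which is 87.6% higher, not 187.6% higher.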
We can also see that workers in service-providing occupations within private industry were significantly less injury/illness prone than those in either governmental sector. The magnitudes of differences reported in the graphic give readers pause to consider a number of questions: possible differences in the nature of jobs among the different sectors, the possibility of different training for worker safety and health, and the impact of potentially differing benefit packages, among others. Underscored here is an important principle of data summary: When surprises occur, say so, and present the surprising results as clearly as possible. If the magnitudes of results are unexpected, as they are in this case, say so, and present the surprising magnitudes as clearly as possible. If no data occur in an expected category, say so, and discuss what the results might mean.
A special note of caution is due in working with the data shown in Figure 1.4. Because there are comparatively fewer workers in goods producing occupations within governmental sectors, the incidence rates across service providing and goods producing occupations are not additive within sectors. We would have to convert the rates to an average weighted by the number of workers in each occupation within each sector to achieve additive rates. The caution echoes a broader principle of graphical integrity: avoid distorting the data.
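A small Python sketch shows why the rates cannot simply be added or averaged, and what a worker-weighted average looks like. All rates and head counts below are invented for illustration, and the function name `combined_rate` is hypothetical.

```python
def combined_rate(rates, workers):
    """Worker-weighted average of per-occupation incidence rates."""
    if len(rates) != len(workers):
        raise ValueError("rates and workers must have the same length")
    total_workers = sum(workers)
    if total_workers <= 0:
        raise ValueError("total workers must be positive")
    return sum(r * w for r, w in zip(rates, workers)) / total_workers

# Hypothetical sector: rates per 10,000 workers and head counts are invented.
rates = [300.0, 100.0]      # goods producing, service providing
workers = [10_000, 90_000]  # few goods producing workers, many service workers

print(combined_rate(rates, workers))  # 120.0, the sector-wide rate
print(sum(rates))                     # 400.0, not a meaningful sector rate
print(sum(rates) / len(rates))        # 200.0, the unweighted mean overstates it
```

Because the small goods producing group carries little weight, the sector-wide rate sits much closer to the service providing rate than either the sum or the simple mean suggests.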
Summarizing Quantitative Variables
Where qualitative variables respond to the question of how many, quantitative variables can answer the question of how much. Sample data generated by measurements can take on any value along a number line and are considered continuous, in comparison to count data, which are discrete because they take on only the whole number counts of sampled elements.
Comparable to using the column chart for categorical data, summary of continuous variables can be accomplished with the creation of classes into which the various sample values can be sorted and counted. The resulting frequency distribution looks very much like a column chart, where the height of each column represents the counts of data in each class. In contrast to the column chart, however, columns in a frequency distribution are contiguous to convey the fact that there is a continuous scale on the horizontal axis. See Figure 1.5 for an example of a frequency distribution for the miles per gallon (MPG) ratings for city driving for subcompact cars, model year 2011, available for new car sales in the United States.
Figure 1.5. An Example of a Frequency Distribution, U.S. City Driving MPG, Model Year 2011 (http://www.fueleconomy.gov)
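Sorting measurements into contiguous classes and counting them can be sketched in a few lines of Python. The MPG values, class boundaries, and the function name `frequency_distribution` are hypothetical and are not the data behind Figure 1.5; the text itself works in Microsoft Excel.

```python
def frequency_distribution(values, lower, width, n_classes):
    """Count values into contiguous classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"value {v} falls outside the classes")
        counts[index] += 1
    return counts

# Hypothetical city MPG ratings, for illustration only.
mpg = [18, 21, 22, 22, 24, 25, 25, 26, 27, 29, 30, 33]
lower, width, n_classes = 15, 5, 4
counts = frequency_distribution(mpg, lower, width, n_classes)

# Print a simple text version of the distribution.
for i, count in enumerate(counts):
    start = lower + i * width
    print(f"{start} to under {start + width}: {'#' * count}")
```

Each class includes its lower boundary and excludes its upper boundary, so every value falls into exactly one class and the classes leave no gaps along the scale.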
Information contained in the frequency distribution can also be displayed in a line graph, where the data point for the frequency is located at the center of each class interval. This is also referred to as a frequency polygon. See Figure 1.6.
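The points of a frequency polygon pair each class midpoint with that class's frequency. A minimal Python sketch, using hypothetical class boundaries and frequencies and an assumed helper name `class_midpoints`:

```python
def class_midpoints(lower, width, n_classes):
    """Return the center of each class interval of equal width."""
    return [lower + width * (i + 0.5) for i in range(n_classes)]

# Hypothetical classes of width 5 starting at 15, with invented frequencies.
frequencies = [1, 4, 5, 2]
points = list(zip(class_midpoints(lower=15, width=5, n_classes=4), frequencies))
print(points)  # [(17.5, 1), (22.5, 4), (27.5, 5), (32.5, 2)]
```

Connecting these points in order with straight line segments gives the line graph form of the same distribution.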
Figure 1.6. An Example of a Line Graph, U.S. City ...

Table of contents

  1. Title Page
  2. Copyright Page
  3. Contents
  4. About the Authors
  5. Chapter 1: Depicting Data in Telling Ways
  6. Chapter 2: Summarizing Location, Scatter, and Relative Position
  7. Chapter 3: Understanding the Normal Distribution and the t-Distribution
  8. Chapter 4: Using Proof by Contradiction to Draw Conclusions
  9. Chapter 5: Testing Two Population Means and Proportions
  10. Chapter 6: Analysis of Variance From Two or More Populations
  11. Chapter 7: Testing Proportions From Two or More Populations
  12. Chapter 8: Analyzing Bivariate Data
  13. Appendix: z-Table, t-Table, F Table, and Chi-Square Table
  14. Announcing the Business Expert Press Digital Library