eBook - ePub

Data Analysis with IBM SPSS Statistics

Name: Data Analysis with IBM SPSS Statistics
Author: Kenneth Stehlik-Barry, Anthony J. Babinec

446 pages
English

About This Book

Master data management & analysis techniques with IBM SPSS Statistics 24

• Leverage the power of IBM SPSS Statistics to perform efficient statistical analysis of your data
• Choose the right statistical technique to analyze different types of data and build efficient models from your data with ease
• Overcome any hurdle that you might come across while learning the different SPSS Statistics concepts with clear instructions, tips and tricks

Who This Book Is For

This book is designed for analysts and researchers who need to work with data to discover meaningful patterns but do not have the time (or inclination) to become programmers. We assume a foundational understanding of statistics such as one would learn in a basic course or two on statistical techniques and methods.

What You Will Learn

• Install and set up SPSS to create a working environment for analytics
• Techniques for exploring data visually and statistically, assessing data quality and addressing issues related to missing data
• How to import different kinds of data and work with it
• Organize data for analytical purposes (create new data elements, sampling, weighting, subsetting, and restructure your data)
• Discover basic relationships among data elements (bivariate data patterns, differences in means, correlations)
• Explore multivariate relationships
• Leverage the offerings to draw accurate insights from your research, and benefit your decision-making

In Detail

SPSS Statistics is a software package used for logical batched and non-batched statistical analysis. Analytical tools such as SPSS can readily provide even a novice user with an overwhelming amount of information and a broad range of options for analyzing patterns in the data.

The journey starts with installing and configuring SPSS Statistics for first use and exploring the data to understand its potential (as well as its limitations). Use the right statistical analysis technique such as regression, classification and more, and analyze your data in the best possible manner. Work with graphs and charts to visualize your findings. With this information in hand, the discovery of patterns within the data can be undertaken. Finally, the high-level objective of developing predictive models that can be applied to other situations will be addressed.

By the end of this book, you will have a firm understanding of the various statistical analysis techniques offered by SPSS Statistics, and be able to master its use for data analysis with ease.

Style and approach

Provides a practical orientation to understanding a set of data and examining the key relationships among the data elements. Shows useful visualizations to enhance understanding and interpretation. Outlines a roadmap that focuses the process so decisions regarding how to proceed can be made easily.

Information

Publisher: Packt Publishing
Year: 2017
ISBN: 9781787280700
Topic: Computer Science
Subtopic: Data Processing
Index: Computer Science

Principal Components and Factor Analysis

The SPSS Statistics FACTOR procedure supports both principal components analysis and factor analysis. The underlying computations for these two techniques are similar, which is why SPSS Statistics bundles them in the same procedure. However, the techniques are sufficiently distinct that you should consider what your research goals are and choose the method that fits them.

Principal components analysis (PCA) finds weighted combinations of the original variables that account for the total variance in the original variables. The first principal component is the linear combination of variables that accounts for as much variance as possible. The second principal component is the linear combination that accounts for as much of the remaining variance as possible, subject to being orthogonal to (uncorrelated with) the first component, and so on for each later component.

PCA is employed as a dimension reduction technique. Your data might contain a large number of correlated variables, and it can be a challenge to understand the patterns and relationships among them. While there are as many components as there are original variables in the analysis, you can often account for a sufficient fraction of the total variance in the original variables using a smaller set of principal components.
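To make the variance accounting concrete, here is a minimal sketch in Python with NumPy (not SPSS syntax, and not the book's own example). The correlation matrix `R` is hypothetical, with values invented for illustration. The sketch shows that the eigenvalues of a correlation matrix sum to the number of variables, and that the leading components can account for a large share of that total.

```python
import numpy as np

# Hypothetical correlation matrix for four variables (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.2],
    [0.5, 0.4, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
])

# eigh is intended for symmetric matrices and returns eigenvalues in
# ascending order, so reorder them from largest to smallest.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Each eigenvalue is the variance accounted for by one component.
proportion = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(proportion)

for i, (ev, p, c) in enumerate(zip(eigenvalues, proportion, cumulative), start=1):
    print(f"Component {i}: eigenvalue={ev:.3f}  proportion={p:.3f}  cumulative={c:.3f}")
```

Reading down the cumulative column is one informal way to judge how many components might be enough; the eigenvectors are mutually orthogonal, which is the property the text describes for successive components.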

Factor analysis (FA) finds one or more common factors--that is, latent variables (variables that are not directly observed)--that account for the correlations between the observed variables. There are necessarily fewer factors than variables in the analysis. Typically, the researcher employs factor rotation to aid interpretation.

Both of these techniques are exploratory. The researcher is often unsure at the outset of the analysis what number of components or factors might be adequate. The SPSS Statistics FACTOR procedure offers statistics and plots both for assessing the suitability of the data for analysis and for assessing the quality of the tentative PCA or FA solution.

This chapter covers the following topics:

Choosing between PCA and FA
Description of PCA example data
SPSS Code for initial PCA analysis of example data
Assessing factorability of the data
Principal components analysis--two-component run
Description of factor analysis example data
The reduced correlation matrix and its eigenvalues
Factor analysis code
Factor analysis results

Choosing between principal components analysis and factor analysis

How does FA differ from PCA? Overall, as indicated in the chapter introduction, PCA accounts for the total variance of the variables in terms of the linear combinations of the original variables, while FA accounts for the correlations of the observed variables by positing latent factors. Here are some contrasts on how you would approach the respective analyses in SPSS Statistics FACTOR.

You can employ PCA on either covariances or correlations. Likewise, you can employ FA on either covariances (for extraction methods PAF or IMAGE) or correlations. The analysis in this chapter analyzes correlation matrices because correlations implicitly put variables on a common scale, and that is often needed for the data with which we work.
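The remark that correlations put variables on a common scale can be checked numerically. The following is a small Python/NumPy sketch on simulated data; the variable names `income` and `age` and all of the numbers are assumptions made up for illustration. It shows that the correlation matrix equals the covariance matrix of the standardized variables, and that an analysis of raw covariances can be dominated by whichever variable has the largest variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two related variables measured on very different scales.
income = rng.normal(50000, 12000, size=200)
age = 0.0005 * income + rng.normal(40, 8, size=200)
X = np.column_stack([income, age])

cov = np.cov(X, rowvar=False)
corr = np.corrcoef(X, rowvar=False)

# Standardize each column (z-scores), then take the covariance matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
cov_of_z = np.cov(Z, rowvar=False)

# Share of total variance carried by the first principal component
# under each choice of matrix (eigvalsh returns ascending eigenvalues).
cov_share = np.linalg.eigvalsh(cov)[-1] / np.trace(cov)
corr_share = np.linalg.eigvalsh(corr)[-1] / np.trace(corr)

print("Covariance of z-scores equals correlation matrix:", np.allclose(cov_of_z, corr))
print(f"First-component share, covariance matrix:  {cov_share:.4f}")
print(f"First-component share, correlation matrix: {corr_share:.4f}")
```

In this setup the covariance-based first component is essentially just `income`, because its variance dwarfs that of `age`; the correlation-based analysis treats the two variables on equal footing.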

Following are a few of the important parameters in the discussion of PCA and FA:

Regarding methods: If you wish to run PCA, there is one method--PCA. If you wish to run factor analysis, the most commonly used methods are principal axis factoring (PAF) and maximum likelihood (ML). Other methods are available in FACTOR, and you should consult a textbook or the following references for more information. Because the default method is PCA, you must explicitly specify a factor method if you intend to do factor analysis and not principal components analysis.
Regarding communality estimates: The communality, or common variance, of a variable is the portion of its variance that it shares with the other variables in the set and that a set of common factors can therefore explain. The goal of PCA is to explain the total variance among the set of variables, or at least some fraction of it, while the goal of FA is to explain the common variance among a set of variables.
As indicated in the following PCA example, the initial communality estimates for the variables are ones, while the final communality estimates depend on the order of the solution. If you specify as many components as there are variables, then the final communalities are one, while if you specify fewer components than there are variables, the final communalities are typically less than one.
In FA, SPSS Statistics FACTOR supplies initial communality estimates automatically. Typically, these are squared multiple correlations when the variable in question is regressed on the rest of the variables in the analysis. Final communalities are a byproduct of analysis elements, such as the extraction method used and the specified number of factors.
Regarding the number of components or factors: In PCA, you can extract as many components as there are variables, while in FA, the number of factors is necessarily less than the number of variables. In FA, but not PCA, the closeness of the reproduced correlations (off the main diagonal) to the observed correlations guides the choice of the number of factors to retain.
Regarding rotation: Principal components have the geometric interpretation of being uncorrelated directions of maximum variation in the data. This interpretation holds only for the unrotated component loadings, so if you do perform rotation on component loadings, you lose this interpretation.
In the case of factor analysis, rotation is often done to aid interpretability. Ideally, after rotation, you can identify sets of variables that go together, as indicated by high loadings on a given factor. Presumably, these sets represent variables that are correlated more with each other than with the rest of the variables.
SPSS Statistics FACTOR provides a number of popular orthogonal and oblique rotation methods. Orthogonal rotations lead to uncorrelated factors, but there is no reason to think a priori that the factors are uncorrelated, so in general, you should consider oblique rotation methods which allow factors to correlate.
Regarding scores: When using PCA, you can compute component scores, and when using FA, you can compute factor scores. When computing component scores, you are mathematically projecting the observations into the space of components. We demonstrate this in the following PCA example. When computing factor scores, technical literature draws attention to the problem of factor indeterminacy (see the Mulaik reference for a discussion). For this reason, some researchers caution against computing and using factor scores.
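The communality points in the list above can be sketched numerically. This Python/NumPy example uses a hypothetical correlation matrix (values invented for illustration) and is not output from SPSS Statistics FACTOR. It computes PCA communalities as row sums of squared component loadings, and computes squared multiple correlations with the standard identity SMC_i = 1 - 1 / (R^-1)_ii, which is the kind of initial communality estimate the text describes for FA.

```python
import numpy as np

# Hypothetical correlation matrix for four variables (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.2],
    [0.5, 0.4, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
])

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component loadings: each eigenvector scaled by the square root of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)


def pca_communalities(loadings, k):
    """Communality of each variable when the first k components are retained."""
    return (loadings[:, :k] ** 2).sum(axis=1)


full = pca_communalities(loadings, 4)  # as many components as variables
two = pca_communalities(loadings, 2)   # fewer components than variables

# Squared multiple correlation of each variable with all of the others.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

print("PCA communalities, 4 components:", np.round(full, 3))
print("PCA communalities, 2 components:", np.round(two, 3))
print("Squared multiple correlations:  ", np.round(smc, 3))
```

With all four components retained, every communality is one; with two, each falls below one, and the communalities together sum to the two retained eigenvalues.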

Often, the reason the analyst computes factor scores is to use the derived variable as either a predictor or a target in an analysis. In this case, you might avoid computing factor scores altogether, and instead consider the framework of Structural Equation Models (not covered in this book).
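Component scores, unlike factor scores, are a direct projection of the observations onto the components. The following Python/NumPy sketch illustrates that projection on simulated data; the data, the seed, and the unit-variance rescaling step are all assumptions chosen for illustration, not a description of how SPSS Statistics saves scores.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 100 observations on three variables that share a common source.
base = rng.normal(size=(100, 1))
X = np.hstack([base + 0.5 * rng.normal(size=(100, 1)) for _ in range(3)])

# Standardize, because the analysis is based on the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Project the standardized observations into the space of components.
scores = Z @ eigenvectors

# Optional rescaling so each score column has unit variance
# (an assumed convention for this sketch).
std_scores = scores / np.sqrt(eigenvalues)

# The scores are uncorrelated, with variances equal to the eigenvalues.
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 3))
```

The printed covariance matrix is diagonal, with the eigenvalues on the diagonal, which reflects the uncorrelated-directions interpretation of unrotated components.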

This chapter focuses on using SPSS Statistics FACTOR for PCA and FA. For background, and more information on these methods, here are two recommended books.

Here is a readable modern treatment: Fabrigar, Lea...

Title Page
Copyright
Credits
About the Authors
Acknowledgement
About the Reviewers
www.PacktPub.com
Customer Feedback
Preface
Installing and Configuring SPSS
Accessing and Organizing Data
Statistics for Individual Data Elements
Dealing with Missing Data and Outliers
Visually Exploring the Data
Sampling, Subsetting, and Weighting
Creating New Data Elements
Adding and Matching Files
Aggregating and Restructuring Data
Crosstabulation Patterns for Categorical Data
Comparing Means and ANOVA
Correlations
Linear Regression
Principal Components and Factor Analysis
Clustering
Discriminant Analysis