Multivariate Density Estimation

Theory, Practice, and Visualization

About This Book

Clarifies modern data analysis through nonparametric density estimation for a complete working knowledge of the theory and methods

Featuring a thoroughly revised presentation, Multivariate Density Estimation: Theory, Practice, and Visualization, Second Edition maintains an intuitive approach to the underlying methodology and supporting theory of density estimation. Including new material and updated research in each chapter, the Second Edition presents additional clarification of theoretical opportunities, new algorithms, and up-to-date coverage of the unique challenges presented in the field of data analysis.

The new edition focuses on the various density estimation techniques and methods that can be used in the field of big data. Defining optimal nonparametric estimators, the Second Edition demonstrates the density estimation tools to use when dealing with various multivariate structures in univariate, bivariate, trivariate, and quadrivariate data analysis. Continuing to illustrate the major concepts in the context of the classical histogram, Multivariate Density Estimation: Theory, Practice, and Visualization, Second Edition also features:

  • Over 150 updated figures to clarify theoretical results and to show analyses of real data sets
  • An updated presentation of graphic visualization using computer software such as R
  • A clear discussion of selections of important research during the past decade, including mixture estimation, robust parametric modeling algorithms, and clustering
  • More than 130 problems to help readers reinforce the main concepts and ideas presented
  • Boxed theorems and results allowing easy identification of crucial ideas
  • Figures in color in the digital versions of the book
  • A website with related data sets

Multivariate Density Estimation: Theory, Practice, and Visualization, Second Edition is an ideal reference for theoretical and applied statisticians, practicing engineers, as well as readers interested in the theoretical aspects of nonparametric estimation and the application of these methods to multivariate data. The Second Edition is also useful as a textbook for introductory courses in kernel statistics, smoothing, advanced computational statistics, and general forms of statistical distributions.

Information

Author
David W. Scott
Publisher
Wiley
Year
2015
ISBN
9781118575536
Edition
2

1
REPRESENTATION AND GEOMETRY OF MULTIVARIATE DATA

A complete analysis of multidimensional data requires the application of an array of statistical tools—parametric, nonparametric, and graphical. Parametric analysis is the most powerful. Nonparametric analysis is the most flexible. And graphical analysis provides the vehicle for discovering the unexpected.
This chapter introduces some graphical tools for visualizing structure in multidimensional data. One set of tools focuses on depicting the data points themselves, while another set relies on displaying functions estimated from those points. Visualization and contouring of functions in more than two dimensions are introduced. Some mathematical aspects of the geometry of higher dimensions are reviewed. These results have consequences for nonparametric data analysis.

1.1 INTRODUCTION

Classical linear multivariate statistical models rely primarily on analysis of the covariance matrix. So powerful are these techniques that analysis is almost routine for datasets with hundreds of variables. While the theoretical basis of parametric models lies with the multivariate normal density, these models are applied in practice to many kinds of data. Parametric studies provide neat inferential summaries and parsimonious representation of the data.
For many problems, second-order information is inadequate. Advanced modeling or simple variable transformations may provide a solution. When no simple parametric model is forthcoming, many researchers have opted for fully “unparametric” methods that may be loosely collected under the heading of exploratory data analysis. Such analyses are highly graphical, but in a complex non-normal setting a graph may provide a more concise representation than a parametric model, because a parametric model of adequate complexity may involve hundreds of parameters.
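The inadequacy of second-order information can be illustrated with a small simulation. The sketch below is not taken from the book; it is a univariate toy example under stated assumptions: a standard normal sample is compared with an equal mixture of two normals whose centers (±0.95) and spread are chosen so that the mixture also has mean 0 and variance 1. The function names, the mixture parameters, and the "fraction of points near zero" summary are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def normal_sample(n, rng):
    """Draw n points from N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def bimodal_sample(n, rng, m=0.95):
    """Draw n points from an equal mixture of N(-m, s^2) and N(m, s^2).

    s is chosen so that m^2 + s^2 = 1, which gives the mixture
    mean 0 and variance 1, matching N(0, 1) to second order.
    """
    s = math.sqrt(1.0 - m * m)
    return [rng.gauss(m if rng.random() < 0.5 else -m, s) for _ in range(n)]


def summarize(xs, half_width=0.25):
    """Return (mean, variance, fraction of points with |x| < half_width)."""
    mean = statistics.fmean(xs)
    var = statistics.pvariance(xs, mu=mean)
    frac_near_zero = sum(abs(x) < half_width for x in xs) / len(xs)
    return mean, var, frac_near_zero


rng = random.Random(0)  # fixed seed so the run is repeatable
normal_summary = summarize(normal_sample(20000, rng))
bimodal_summary = summarize(bimodal_sample(20000, rng))

for name, (mean, var, frac) in [("normal", normal_summary),
                                ("bimodal", bimodal_summary)]:
    print(f"{name:8s} mean={mean:+.3f} var={var:.3f} frac(|x|<0.25)={frac:.3f}")
```

The two samples have nearly identical means and variances, yet the bimodal sample places very little mass near zero. A summary based only on first and second moments cannot distinguish them, whereas a histogram or other density estimate would.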
There are some significant differences between parametric and nonparametric modeling. The focus on optimality in parametric modeling does not translate well to the nonparametric world. For example, the histogram might be proved to be an inadmissible estimator, but that theoretical fact should not be taken to suggest histograms should not be used. Quite to the contrary, some methods that are theoretically superior are almost never used in practice. The reason is that the ordering of algorithms is not absolute, but is dependent not only on the unknown density but also on the sample size. Thus the histogram is generally superior for small samples regardless of its asymptotic properties. The exploratory school is at the other extreme, rejecting probabilistic models, whose existence provides the framework for defining optimality.
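The point that the ordering of estimators depends on the sample size can be sketched numerically. The example below is an illustration only and does not reproduce any comparison from the book: it compares two histograms of N(0, 1) data that differ only in bin width (a coarse width of 1.0 and a fine width of 0.2, both arbitrary choices) and scores them by integrated squared error against the true density, a criterion assumed here for convenience. The function names, sample sizes, and replication count are hypothetical.

```python
import math
import random
from collections import Counter


def normal_pdf(x):
    """Density of N(0, 1), the assumed true density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def histogram_ise(sample, h, lo=-6.0, hi=6.0, step=0.01):
    """Integrated squared error of a histogram with bin width h.

    Bins are anchored at 0, so bin k covers [k*h, (k+1)*h).
    The integral is approximated by a midpoint rule on [lo, hi].
    """
    n = len(sample)
    counts = Counter(math.floor(x / h) for x in sample)
    total = 0.0
    for i in range(int(round((hi - lo) / step))):
        x = lo + (i + 0.5) * step
        f_hat = counts[math.floor(x / h)] / (n * h)  # histogram height at x
        total += (f_hat - normal_pdf(x)) ** 2 * step
    return total


def mean_ise(n, h, reps=20, seed=0):
    """Average the ISE over reps independent N(0, 1) samples of size n."""
    rng = random.Random(seed)
    errors = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        errors.append(histogram_ise(sample, h))
    return sum(errors) / reps


results = {(n, h): mean_ise(n, h) for n in (25, 5000) for h in (1.0, 0.2)}

for (n, h), err in sorted(results.items()):
    print(f"n={n:5d} h={h:.1f} mean ISE={err:.5f}")
```

Under these assumptions the coarse histogram has the smaller average error at n = 25, while the fine histogram wins at n = 5000, so neither bin width is better in an absolute sense. The same kind of reversal is what makes a fixed ranking of nonparametric estimators unreliable.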
In this book, an intermediate point of view is adopted regarding statistical efficacy. No nonparametric estimate is considered wrong; only different components of the solution are emphasized. Much effort will be devoted to the data-based calibration problem, but nonparametric estimates can be reasonably calibrated in practice without too much difficulty. The “curse of optimality” might suggest that this is an illogical point of view. However, if the notion that optimality is all important is adopted, then the focus becomes matching the theoretical properties of an estimator to the assumed properties of the density function. Is it a gross inefficiency to use a procedure that requires only two continuous derivatives when the curve in fact has six continuous derivatives? This attitude may have some formal basis but should be discouraged as too heavy-handed for nonparametric thinking. A more relaxed attitude is required. Furthermore, many “optimal” nonparametric procedures are unstable in a manner that slightly inefficient procedures are not. In practice, when faced with the application of a procedure that requires six derivatives, or some other assumption that cannot be proved in practice, it is more important to be able to recognize the signs of estimator failure than to worry too much about assumptions. Detecting failure at the level of a discontinuous fourth derivative is a bit extreme, but certainly the effects of simple discontinuities should be well understood. Thus only for the purposes of illustration are the best assumptions given.
The notions of efficiency and admissibility are related to the choice of a criterion, which can only imperfectly measure the...

Table of contents

  1. COVER
  2. TITLE PAGE
  3. TABLE OF CONTENTS
  4. PREFACE TO SECOND EDITION
  5. PREFACE TO FIRST EDITION
  6. 1 REPRESENTATION AND GEOMETRY OF MULTIVARIATE DATA
  7. 2 NONPARAMETRIC ESTIMATION CRITERIA
  8. 3 HISTOGRAMS: THEORY AND PRACTICE
  9. 4 FREQUENCY POLYGONS
  10. 5 AVERAGED SHIFTED HISTOGRAMS
  11. 6 KERNEL DENSITY ESTIMATORS
  12. 7 THE CURSE OF DIMENSIONALITY AND DIMENSION REDUCTION
  13. 8 NONPARAMETRIC REGRESSION AND ADDITIVE MODELS
  14. 9 OTHER APPLICATIONS
  15. APPENDIX A: COMPUTER GRAPHICS IN ℜ³
  16. APPENDIX B: DATASETS
  17. APPENDIX C: NOTATION AND ABBREVIATIONS
  18. REFERENCES
  19. AUTHOR INDEX
  20. SUBJECT INDEX
  21. WILEY SERIES IN PROBABILITY AND STATISTICS
  22. END USER LICENSE AGREEMENT