eBook - ePub

Introduction to High-Dimensional Statistics

Name: Introduction to High-Dimensional Statistics
Author: Christophe Giraud

346 pages
English

About This Book

Praise for the first edition:

"[This book] succeeds singularly at providing a structured introduction to this active field of research. … it is arguably the most accessible overview yet published of the mathematical ideas and principles that one needs to master to enter the field of high-dimensional statistics. … recommended to anyone interested in the main results of current research in high-dimensional statistics as well as anyone interested in acquiring the core mathematical skills to enter this area of research."
— Journal of the American Statistical Association

Introduction to High-Dimensional Statistics, Second Edition preserves the philosophy of the first edition: to be a concise guide for students and researchers discovering the area and interested in the mathematics involved. The main concepts and ideas are presented in simple settings, thereby avoiding inessential technicalities. High-dimensional statistics is a fast-evolving field, and much progress has been made on a large variety of topics, providing new insights and methods. Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this new edition:

Offers revised chapters from the previous edition, with additional material on several important topics, including compressed sensing, estimation with convex constraints, the slope estimator, simultaneously low-rank and row-sparse linear regression, and aggregation of a continuous set of estimators.
Introduces three new chapters on iterative algorithms, clustering, and minimax lower bounds.
Provides enhanced appendices, mainly with the addition of the Davis-Kahan perturbation bound and of two simple versions of the Hanson-Wright concentration inequality.
Covers cutting-edge statistical methods including model selection, sparsity and the Lasso, iterative hard thresholding, aggregation, support vector machines, and learning theory.
Provides detailed exercises at the end of every chapter with collaborative solutions on a wiki site.
Illustrates concepts with simple but clear practical examples.

Information

Publisher: Chapman and Hall/CRC
Year: 2021
ISBN: 9781000408355
Topic: Economics
Subtopic: Statistics for Business & Economics
Index: Economics

Chapter 1 Introduction

DOI: 10.1201/9781003158745-1

1.1 High-Dimensional Data

The sustained development of technologies, data storage resources, and computing resources gives rise to the production, storage, and processing of an exponentially growing volume of data. Data are ubiquitous and have a dramatic impact on almost every branch of human activity, including science, medicine, business, finance, and administration. For example, wide-scale data make it possible to better understand the regulation mechanisms of living organisms, to create new therapies, to monitor climate and biodiversity changes, to optimize resources in industry and in administrations, to personalize marketing for each individual consumer, etc.

A major characteristic of modern data is that they often simultaneously record thousands to millions of features on each object or individual. Such data are said to be high-dimensional. Let us illustrate this characteristic with a few examples. These examples are relevant at the time of writing and may become outdated in a few years, yet we emphasize that the mathematical ideas conveyed in this book are independent of these examples and will remain relevant.

Biotech data: Recent biotechnologies make it possible to acquire high-dimensional data on single individuals. For example, DNA microarrays measure the transcription level¹ of tens of thousands of genes simultaneously; see Figure 1.1. Next-generation sequencing (NGS) devices improve on these microarrays by making it possible to sense the “transcription level” of virtually any part of the genome. Similarly, in proteomics some technologies can gauge the abundance of thousands of proteins simultaneously. These data are crucial for investigating biological regulation mechanisms and creating new drugs. In such biotech data, the number p of “variables” that are sensed is in the thousands and is most of the time much larger than the number n of “individuals” involved in the experiment (the number of repetitions, which rarely exceeds a few hundred).

¹ The transcription level of a gene in a cell at a given time corresponds to the quantity of mRNA associated with this gene present at this time in the cell.

Figure 1.1 Whole human genome microarray covering more than 41,000 human genes and transcripts on a standard 1″ × 3″ glass slide format. © Agilent Technologies, Inc. 2004. Reproduced with permission, courtesy of Agilent Technologies, Inc.

Images (and v...

Cover
Half Title
Title Page
Copyright Page
Table of Contents
Preface, second edition
Preface
Acknowledgments
1 Introduction
2 Model Selection
3 Minimax Lower Bounds
4 Aggregation of Estimators
5 Convex Criteria
6 Iterative Algorithms
7 Estimator Selection
8 Multivariate Regression
9 Graphical Models
10 Multiple Testing
11 Supervised Classification
12 Clustering
Appendix A Gaussian Distribution
Appendix B Probabilistic Inequalities
Appendix C Linear Algebra
Appendix D Subdifferentials of Convex Functions
Appendix E Reproducing Kernel Hilbert Spaces
Notations
References
Index