A Gentle Introduction to Support Vector Machines in Biomedicine

Volume 2: Case Studies and Benchmarks

  1. 212 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

Support Vector Machines (SVMs) are among the most important recent developments in pattern recognition and statistical machine learning. They have found a wide range of applications in many fields, including biology and medicine. However, biomedical researchers often have difficulty grasping both the theory and the applications of these methods because they lack the necessary technical background. The purpose of this book is to introduce SVMs and their extensions so that biomedical researchers can understand them and apply them readily in real-life research. The book consists of two volumes: theory and methods (Volume 1) and case studies (Volume 2).

Contents:

  • Preliminaries:
    • Introduction and Book Overview
    • Methods Used in this Book
  • Case Studies and Comparative Evaluation in High-Throughput Genomic Data:
    • Application and Comparison of SVMs and Other Methods for Multicategory Microarray-Based Cancer Classification
    • Comparison of SVMs and Random Forests for Microarray-Based Cancer Classification
    • Comparison of SVMs and Kernel Ridge Regression for Microarray-Based Cancer Classification (Contributed by Zhiguo Li)
    • Application and Comparison of SVMs and Other Methods for Multicategory Classification in Microbiomics (Contributed by Mikael Henaff, Kranti Konganti, Varun Narendra, Alexander V Alekseyenko)
    • Application to Assessment of Plasma Proteome Stability
  • Case Studies and Comparative Evaluation in Text Data:
    • Application and Comparison of SVMs and Other Methods for Retrieving High-Quality Content-Specific Articles (Contributed by Yindalon Aphinyanaphongs)
    • Application and Comparison of SVMs and Other Methods for Identifying Unproven Cancer Treatments on the Web (Contributed by Yindalon Aphinyanaphongs)
    • Application to Predicting Future Article Citations (Contributed by Lawrence Fu)
    • Application to Classifying Instrumentality of Article Citations (Contributed by Lawrence Fu)
    • Application and Comparison of SVMs and Other Methods for Identifying Drug–Drug Interactions-Related Literature (Contributed by Stephany Duda)
  • Case Studies with Clinical Data:
    • Application to Predicting Clinical Laboratory Values
    • Application to Modeling Clinical Judgment and Guideline Compliance in the Diagnosis of Melanoma (Contributed by Andrea Sboner)
  • Other Comparative Evaluation Studies of Broad Applicability:
    • Using SVMs for Causal Variable Selection
    • Application and Comparison of SVM-RFE and GLL Methods


Readership: Biomedical researchers and healthcare professionals who would like to learn about SVMs and relevant bioinformatics tools but do not have the necessary technical background.

Authors: Alexander Statnikov, Constantin F Aliferis, Douglas P Hardin. Subject: Medicine & Biotechnology in Medicine.

Information

Publisher: WSPC
Year: 2013
ISBN: 9789814518505

PART I

Preliminaries

CHAPTER 1

Introduction and Book Overview


The first volume of the present two-volume book introduced the essential principles of support vector machines (SVMs) and machine learning in general. SVMs are a powerful modern machine learning methodology that has found great success in a variety of applications, including biomedicine. The emphasis of the first volume was to make SVM principles, which are often inaccessible to biomedical researchers due to being technically quite demanding, easy to grasp even for an audience that normally lacks substantial computational and mathematical training.
The first volume presented essential SVM principles, algorithms and protocols, but did not elaborate on all the necessary details of how the formal methods can be applied in practical settings. The first volume also did not present empirical comparisons of SVMs with other state-of-the-art methods that could reasonably be considered as alternatives in modern biomedical research. These two areas are the focus of the present, second, volume.
It is our intent that together the two volumes will provide sufficient theoretical and practical depth and guidance to data analysts and modelers so that they can bring the power of SVMs to bear successfully in their data analysis and modeling needs.
The present volume will also be useful to researchers who are quantitatively sophisticated and do not need the “gentle” introduction of Volume 1, but who can still benefit from guidance about effective ways to translate theoretical SVM methods into demanding, real-life data analytic practice.

Organization of the Second Volume

The second volume provides a summary of the main methods used in this book (Chapter 2), including essential SVM theory that was covered in depth in the first volume. This is to provide a refresher of the core concepts needed to understand the material in the present volume and also to make the second volume sufficiently self-contained, so that it can be read or taught independently of the first volume to appropriate audiences.
The remaining material comprises case studies and benchmarks.
  • Case studies aim to give the reader a wide enough range of application areas and a deep enough account of practical details on how to translate the theoretical methods of the first volume into successful applied modeling of academic and industry relevance.
  • Benchmarks are systematic comparisons of SVM-based methods to other state-of-the-art methods that can reasonably be considered as alternatives for the same types of analyses that SVMs are designed for.
The case studies and benchmarks are organized into four parts corresponding to genomic data, text data, clinical data, and broad (data-independent) categories.
We remain firm in our commitment, stated in the first volume, not to give readers the false impression that SVMs are a “one solution fits all problems” data analysis and modeling paradigm. In these benchmarks we therefore examine both the strengths and the limitations of SVMs, and report several important ones for the benefit of the practitioner analyst.
The SVM applications and benchmarking literature is vast and fairly rich. Instead of drawing from this general literature, we elected to present here case studies and benchmarks in which we were directly involved. This ensures a degree of familiarity with these applications and benchmarks that goes well beyond what can be obtained from reading published reports of work done by others. Such reports, by necessity, provide a limited view of what it takes to create successful applications, including choice of options/parameters, design of the analysis approach, and numerous other details that often do not make it into the peer-reviewed literature but are essential for the success of an applied project. We hope that our extensive collective experience with SVM applications and comparative testing gives the reader a sufficiently wide view of what SVMs can accomplish in practice without constraining the reader’s imagination about the possible opportunities, which are, literally, endless.
The format of the second volume is no longer that of “programmed text” employed in the first volume. This format was needed in the first volume to cover technically complex material in manageable chunks for the benefit of technically unsophisticated readers. Since the programmed text format is no longer needed here, it is replaced by the more appropriate traditional exposition format.
As is the case with most modern, high-performance machine learning and pattern recognition technologies, the methods and processes presented in Volumes 1 and 2 are, to the best of our knowledge, unconstrained for academic use. However, many of the presented technologies and applications are protected by copyrights, patents and pending patents, so commercial applications require licenses from the owners of the intellectual property. The number of patents covering SVMs is very large and constantly expanding, and it is outside the scope of our work to identify which SVM methods presented in this book (and outside it) are owned by whom and for which application domain. We leave it to readers interested in commercial applications to work with qualified technical and legal consultants to make sure that commercial applications of SVM methods are properly licensed. For the applications specifically presented in Volume 2, licensing inquiries can be made to: Discovery Holdings LLC, http://www.discoveryholdings.net.
Individual chapters in Volume 2 discuss essential information about the goals of the projects; which options/parameters were considered and how the data analysis and modeling plan was formulated; the broader context of the projects; practical decisions that were useful; and the main lessons learned.
In the remainder, we provide a synopsis of all case studies and benchmarks presented in the present volume.
Chapter 3 (“Application and Comparison of SVMs and Other Methods for Multicategory Microarray-Based Cancer Classification”) shows how SVMs can be used to build diagnostic classifiers for 41 cancer types and 12 normal tissue types using microarray gene expression data. This is a very important application area that is already creating the foundations for next-generation diagnostics and personalized medicine. Alongside the application details, the chapter shows that SVMs outperform other major classification methods; that relatively simple gene selection methods can improve classification accuracy; that some SVM multiclass methods are preferable to others; and that ensembling does not improve on the classification accuracy of the best non-ensemble models. The findings and ideas of the study have been used to create a robust automodeller, GEMS (http://www.gems-system.org), which has been tested very successfully under rigorous standards of validation (independent, prospective data validation) and was shown to match or exceed the predictive performance of human experts in all published models associated with the employed datasets.
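To make the idea of a multicategory SVM classifier concrete, the following is a minimal, dependency-free Python sketch of one common multiclass scheme, one-versus-rest, built from binary linear SVMs. It is an illustration only and not the method or software used in Chapter 3: the training routine is a simple Pegasos-style stochastic subgradient method chosen for brevity, the function names (`train_linear_svm`, `train_one_vs_rest`, `predict`, `make_toy_data`) are hypothetical, and the data are synthetic two-dimensional points rather than gene expression profiles.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Binary linear SVM via Pegasos-style stochastic subgradient descent.

    X is a list of feature lists and y holds labels in {-1, +1}.
    Returns a weight list whose last entry plays the role of the bias.
    """
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            xi = X[i] + [1.0]              # append constant feature for the bias
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            # Shrink the weights (regularisation term of the objective).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # If the example violates the margin, move the weights toward it.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w


def decision_value(w, x):
    """Signed score of x under weights w (bias stored as the last weight)."""
    return sum(wj * xj for wj, xj in zip(w, x + [1.0]))


def train_one_vs_rest(X, labels, **kwargs):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    models = {}
    for c in sorted(set(labels)):
        y = [1 if lab == c else -1 for lab in labels]
        models[c] = train_linear_svm(X, y, **kwargs)
    return models


def predict(models, x):
    """Assign x to the class whose binary SVM gives the largest score."""
    return max(models, key=lambda c: decision_value(models[c], x))


def make_toy_data(n_per_class, seed):
    """Synthetic, well-separated 2-D clusters (an assumption for illustration)."""
    rng = random.Random(seed)
    centers = {"A": (0.0, 4.0), "B": (4.0, -4.0), "C": (-4.0, -4.0)}
    X, labels = [], []
    for name, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 0.8), rng.gauss(cy, 0.8)])
            labels.append(name)
    return X, labels


# Train on one synthetic sample and evaluate on an independent one.
X_train, y_train = make_toy_data(30, seed=1)
X_test, y_test = make_toy_data(30, seed=2)
models = train_one_vs_rest(X_train, y_train)
accuracy = sum(predict(models, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Evaluating on a sample that was not used for training mirrors, in miniature, the independent validation emphasized above. Real microarray data have thousands of features and far fewer samples, so in practice one would rely on an established SVM implementation and a careful model selection protocol rather than this sketch.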
Chapter 4 (“Comparison of SVMs and Random Forests for Microarray-Based Cancer Classification”) performs a similar but more extended comparison of SVMs with the very popular method of Random Forests (RFs). RFs are used extensively in bioinformatics and have been popular primarily because they combine the intuitive and proven ideas of decision tree induction and bagging. The comparison involves 22 datasets with diagnostic and prognostic response variables. SVMs outperform RFs both in the settings when no gene selection is performed and when several popular gene selection methods are used.
Chapter 5 (“Comparison of SVMs and Kernel Ridge Regr...
