A Primer in Biological Data Analysis and Visualization Using R
eBook - ePub

Gregg Hartvigsen

About this book

R is the most widely used open-source statistical and programming environment for the analysis and visualization of biological data. Drawing on Gregg Hartvigsen's extensive experience teaching biostatistics and modeling biological systems, this text is an engaging, practical, and lab-oriented introduction to R for students in the life sciences.

Underscoring the importance of R and RStudio in organizing, computing, and visualizing biological statistics and data, Hartvigsen guides readers through the processes of entering data into R, working with data in R, and using R to visualize data using histograms, boxplots, barplots, scatterplots, and other common graph types. He covers testing data for normality, defining and identifying outliers, and working with non-normal data. Students are introduced to common one- and two-sample tests as well as one- and two-way analysis of variance (ANOVA), correlation, and linear and nonlinear regression analyses. This volume also includes a section on advanced procedures and a chapter introducing algorithms and the art of programming using R.

Information

Year
2014
ISBN
9780231537049
CHAPTER 1
INTRODUCING OUR SOFTWARE TEAM
In science we are interested in understanding systems that are complicated. Our use of quantitative approaches gives us the ability to not only understand these systems but also to predict how a system might behave in the future (or maybe even how it behaved in the past). As we work to understand and predict complex biological systems we need computational help. You probably have written lab reports using only a calculator. This should be avoided for a variety of important reasons:
1. Difficulty in verifying that you entered the data correctly. (I think the numbers are right.)
2. Difficulty in repeating the analysis. (I’m not doing it again because I might get a different answer!)
3. Inability to share your analytical approaches and results. (Sorry, I hit the all-clear button! You have to trust me.)
4. Inflexibility in how the data are analyzed. (You wanted me to do what?)
5. Inability to make and share appropriate graphs. (Can I take a picture of the graph on my calculator with my phone and incorporate that in my lab report?)
To solve these shortcomings we will use Excel and R.
You may be somewhat familiar with Excel but probably have little or no experience with R. Therefore, I welcome you to the world of R! I know this might be a scary place for you at first. I bet R is really different from all the programs you’ve used. Fortunately, this introduction is intended for newcomers. But as you proceed you will learn how to do some really amazing things with R. You’ll gain independence with practice. R is like playing an instrument, a sport, or learning a foreign language—they all require practice. I have confidence that you are capable of using R to solve interesting problems. And the more time you spend at it the better you will get.
1.1 SOLVING PROBLEMS WITH EXCEL AND R
For many analytical problems we will be able to use just R. However, in biology, we often test our ideas, or hypotheses, with large amounts of data. We, therefore, will try to use Excel for what it does well (allows us to enter and organize our data). But we will not use Excel to do what it doesn’t do well (statistical analyses, modeling, and visualizing data). Instead, these core scientific skills are best done with R. If you love Excel then you’ll be happy to know we’re not abandoning it—Excel has its place.
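As a minimal sketch of this division of labor, the code below assumes the data were entered in Excel and saved as a comma-separated (CSV) file. The file contents, the column names (plant_id, treatment, height_cm), and the numbers are invented for illustration and do not come from any real data set. A temporary file stands in for the exported spreadsheet so the example runs on its own.

```r
# Hypothetical data standing in for a worksheet saved from Excel as CSV.
# All names and values here are made up for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("plant_id,treatment,height_cm",
             "1,control,12.1",
             "2,control,11.4",
             "3,fertilized,15.3",
             "4,fertilized,14.8"),
           csv_path)

# Read the file into a data frame and inspect its structure.
plants <- read.csv(csv_path)
str(plants)

# Mean height for each treatment group.
mean_height <- tapply(plants$height_cm, plants$treatment, mean)
print(mean_height)
```

With a real project you would replace csv_path with the path to your own exported file. The steps after read.csv() stay the same.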
It is important to recognize that doing things well is rarely easy. Writing a good poem, playing tennis well, or doing ballet well are all hard. And conducting hypothesis tests correctly and making professional-quality graphs are not simple, one-click operations.
At first you will likely think that making graphs and performing statistical tests in R are absolute nightmares. (And when you become a skilled R programmer you’ll still be challenged at times!) But the days of skipping an analysis or accepting an ugly or incorrect graph because “that’s the best I can do with Excel” are over. You can do it in R! Therefore, in this introduction we will discuss Excel but focus mainly on R. It is the combination of using Excel to organize our data and R for analyses and visualizations that will allow you to ask and answer questions in biology.
You still may be wondering why you can’t just do this all in Excel. Here is a sampling of reasons why R is clearly better than Excel for problem solving in biology. With R you can:
1. create professional, publication-quality visualizations;
2. conduct quantitative analyses, both analytical and statistical (e.g., do a t-test, solve systems of differential equations, conduct non-linear regression, use matrix algebra, conduct signal processing, perform wavelet analysis, analyze fMRI data, do genome analyses, and create phylogenetic reconstructions, to name a few);
3. build statistical tests that can be repeated easily and shared with anyone. These tests might rely on their own data, data read from a file, or data acquired directly from a website;
4. do the same thing and work the same way on computers running Mac, Windows, and Linux;
5. write computer programs, such as modeling a population growing over time, using an object-oriented language;
6. access modern analytical tools for biologists that are being developed right now, right here, and nowhere else;
7. use and receive widely available help from the R open-source community;
8. use open-source software that provides solutions that are “auditable,” meaning you can understand and explain to others how you got your results (there are no black boxes; it’s open software!);
9. write a document like this. This environment allows one to compile together in one document words, mathematical equations, computer code, statistical tests and output, and professional-quality graphs, all within the free, open-source LaTeX typesetting environment;
10. carry a research project, the paper, all the data, and the entire software package for doing the analysis on a low-capacity flash drive;
11. rest assured that your investment in skill building will pay off well into the future. You don’t have to hope you’ll have access to the program when you move on to your next stage of life (which could be in a hospital in Ghana!);
12. enjoy these benefits because open-source means R is free!
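To make a few of the items above concrete, here is a short sketch, not taken from the book, that runs a t-test (item 2) and models a population growing over time (item 5). The two samples, the growth rate lambda, and the starting population size are invented for illustration. The functions t.test() and plot() are part of the standard R installation.

```r
# Invented example data (not from the book): heights in cm for two groups.
control    <- c(12.1, 11.4, 13.0, 12.6, 11.9)
fertilized <- c(15.3, 14.8, 16.1, 15.0, 15.7)

# Two-sample t-test (Welch's version is the default in t.test()).
result <- t.test(control, fertilized)
print(result$p.value)

# Discrete-time exponential growth: N[t + 1] = N[t] * lambda.
# lambda and the starting size are assumed values chosen for illustration.
lambda  <- 1.2
n_steps <- 10
N <- numeric(n_steps + 1)
N[1] <- 100
for (t in 1:n_steps) {
  N[t + 1] <- N[t] * lambda
}

# Plot population size against time.
plot(0:n_steps, N, type = "b",
     xlab = "Time step", ylab = "Population size")
```

Because the whole analysis lives in a script, anyone can rerun it and get the same answer, which is the repeatability that a calculator cannot offer.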
Your ability to use R to make informed, evidence-based conclusions likely will provide you the most valuable set of skills you’ll learn as an undergraduate science major. If you keep this skill set you will be highly marketable. R helps you speak the language of science, which is written in mathematics, statistics, and data evaluation and visualization. This ability to answer scientific questions and present your results professionally is finally in your hands.
Your ability to use R helps fulfill an important goal that was synthesized in the report Scientific Foundations for Future Physicians, produced by the Association of American Medical Colleges and the Howard Hughes Medical Institute (2009). The authors of this report downplay the importance of memorizing facts and, instead, encourage students to learn to
apply quantitative reasoning and appropriate mathematics to describe or explain phenomena in the natural world.
Additionally, in the report Vision and Change in Undergraduate Biology: A Call to Action, produced jointly by the American Association for the Advancement of Science and the National Science Foundation (2009), six “core competencies” are advocated for undergraduates in biology. Below are four of th...