- 512 pages
- English
About This Book
This is the third, newly revised and extended edition of this successful book, which has already been translated into three languages. Like the previous editions, it is based entirely on the programming language and environment R and remains thoroughly hands-on, with thousands of lines of heavily annotated code for all computations and plots. This edition has been updated on the basis of many workshops and bootcamps the author has taught all over the world in the past few years. Its exposition has been didactically streamlined, and it adds two new chapters: one on mixed-effects modeling and one on classification and regression trees as well as random forests. It also features new discussion of curvature, orthogonal and other contrasts, interactions, collinearity, the effects and emmeans packages, autocorrelation/runs, further material on programming, writing statistical functions, and simulations, along with many practical tips based on 10 years of teaching with these materials.
1 Some fundamentals of empirical research
1.1 Introduction
- it has been written especially for linguists: there are many, many introductions to statistics for psychologists, economists, biologists etc., but far fewer which, like this one, explain statistical concepts and methods on the basis of linguistic questions and for linguists, and it does so starting from scratch;
- it explains how to do many of the statistical methods both ‘by hand’ and with statistical functions (and sometimes simulations), but it requires neither mathematical expertise nor hours of trying to understand complex equations – many introductions devote much time to mathematical foundations (and while knowing some of the math doesn’t hurt, it makes everything more difficult for the novice), others do not explain any foundations and immediately dive into some nicely designed software which often hides the logic of statistical tests behind a nice GUI;
- it not only explains statistical concepts, tests, and graphs, but also the design of tables to store and analyze data and some very basic aspects of experimental design;
- it only uses open source software: many introductions use in particular SPSS or MATLAB (although in linguistics, those days seem nearly over, thankfully), which come with many disadvantages: (i) users must buy expensive licenses that might be restricted in how many functions they offer, how many data points they can handle, how long they can be used, and/or how quickly bugs are fixed; (ii) students and professors may be able to use the software only on campus; (iii) they are at the mercy of the software company with regard to often really slow bugfixes and updates etc. – with R, I have written quite a few emails with bug reports and they were often fixed within a day!
- while it provides a great deal of information – much of it resulting from years of teaching this kind of material, reviewing, and fighting recalcitrant data and reviewers – it does that in an accessible and (occasionally pretty) informal way: I try to avoid jargon wherever possible and some of what you read below is probably too close to how I say things during a workshop – this book is not exactly an exercise in formal writing and may reflect more of my style than you care to know. But, as a certain political figure once said in 2020, “it is what it is …” On the more unambiguously positive side of things, the use of software will be illustrated in great detail (both in terms of the amount of code you’re getting and the amount of very detailed commentary it comes with) and the book has grown so much in part because the text is now answering many questions and anticipating many errors in thinking I’ve encountered in bootcamps/classes over the last 10 years; the RMarkdown document I wrote this book in returned a ≈560 page PDF. In addition and as before, there are ‘think breaks’ (and the occasional warning), exercises (with answer keys on the companion website; over time, I am planning on adding to the exercises), and of course recommendations for further reading to dive into more details than I can provide here.
- ask questions about statistics relevant to this edition (and hopefully also get an answer from someone);
- send suggestions for extensions and/or improvements or data for additional exercises;
- inform me and other readers of the book about bugs you find (and of course receive such information from other readers). This also means that if R commands, or code, provided in the book differ from information provided on the website, then the latter is most likely going to be correct.
Table of contents
- Title Page
- Copyright
- Contents
- Statistics for Linguistics with R – Endorsements of the 3rd Edition
- Introduction
- 1 Some fundamentals of empirical research
- 2 Fundamentals of R
- 3 Descriptive statistics
- 4 Monofactorial tests
- 5 Fixed-effects regression modeling
- 6 Mixed-effects regression modeling
- 7 Tree-based approaches
- About the Author