Writings on statistics and R
Quantitative methods are what we need to move the brain and behavioral sciences (including empirical linguistics) from “science” to SCIENCE. For better or for worse, that means we need to learn statistics. The links below lead to notes on topics that I think are important yet poorly understood or blissfully ignored.
A culture without numbers
The Pirahã language and culture seem to lack not only the words but also the concepts for numbers, using instead less precise terms like “small size”, “large size” and “collection”. And the Pirahã people themselves seem to be surprisingly uninterested in learning about numbers, and even actively resistant to doing so, despite the fact that in their frequent dealings with traders they have a practical need to evaluate and compare numerical expressions. A similar situation seems to obtain among some other groups in Amazonia, and a lack of indigenous words for numbers has been reported elsewhere in the world.
Many people find this hard to believe. These are simple and natural concepts, of great practical importance: how could rational people resist learning to understand and use them? I don’t know the answer. But I do know that we can investigate a strictly comparable case, equally puzzling to me, right here in the U.S. of A.
Until about a hundred years ago, our language and culture lacked the words and ideas needed to deal with the evaluation and comparison of sampled properties of groups. Even today, only a minuscule proportion of the U.S. population understands even the simplest form of these concepts and terms. Out of the roughly 300 million Americans, I doubt that as many as 500 thousand grasp these ideas to any practical extent, and 50,000 might be a better estimate. The rest of the population is surprisingly uninterested in learning, and even actively resists the intermittent attempts to teach them, despite the fact that in their frequent dealings with social and biomedical scientists they have a practical need to evaluate and compare the numerical properties of representative samples.
If we project this state of affairs onto the scale of the Pirahã society, with roughly 300 members, we arrive at something like 0.05 to 0.5 people out of 300 who understand how to count and compare quantities in ways that have become essential to the culture. In this respect, I submit, we are exactly like them.
from The Language Log
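The scaling in the passage above is a simple proportion: keep the share of people who understand sampling statistics fixed and shrink the population from roughly 300 million to roughly 300. A minimal sketch in Python makes the arithmetic explicit. The function name `scale_to_population` is hypothetical, and the inputs are the passage's own rough estimates, not measured data.

```python
def scale_to_population(count, source_population, target_population):
    """Rescale a head count to a different population size.

    The proportion count / source_population is held fixed, so the result is
    the number of people that the same share would represent in a population
    of size target_population.
    """
    # Multiply before dividing so the integer product is exact.
    return count * target_population / source_population


US_POPULATION = 300_000_000   # "roughly 300 million Americans" (from the passage)
PIRAHA_POPULATION = 300       # "roughly 300 members" (from the passage)

# Upper estimate in the passage: 500 thousand people.
upper = scale_to_population(500_000, US_POPULATION, PIRAHA_POPULATION)

# Lower estimate in the passage: 50,000 people.
lower = scale_to_population(50_000, US_POPULATION, PIRAHA_POPULATION)

print(lower, upper)  # 0.05 0.5
```

Both results are below one person, which is what the comparison turns on: at the scale of a 300-member society, not even a single individual would have the skill.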
Statistics are hard, but they are useful. It’s time we started trying to make them as accessible as we make arithmetic for children!
Notes
Title | Word Count | Reading Time | Description |
---|---|---|---|
Coding Schemes for Categorical Variables | 4,014 words | 21 min | The numerical coding of categorical variables plays a major role in their interpretation. Yet, the existence of different coding schemes is rarely discussed in introductory… |
The difference between Type-I, Type-II, and Type-III tests | 197 words | 1 min | Although often not explicitly stated, there are several “types” of tests or methods of calculating sums of squares in statistics. The different types of tests test different… |
A Brief Introduction to Mixed-Effects Models | 3,012 words | 16 min | In mixed-effects models, additional variance components are introduced into the fixed-effects (traditional regression) structure. Broadly speaking, these can be thought of… |