Handy statistical lexicon
By Andrew Gelman on May 24, 2009 10:29 PM
These are all important methods and concepts related to statistics that are not as well known as they should be. I hope that by giving them names, we will make the ideas more accessible to people:
Mister P: Multilevel regression and poststratification.
The Secret Weapon: Fitting a statistical model repeatedly on several different datasets and then displaying all these estimates together.
The Superplot: Line plot of estimates in an interaction, with circles showing group sizes and a line showing the regression of the aggregate averages.
The Folk Theorem: When you have computational problems, often there's a problem with your model.
The Pinch-Hitter Syndrome: People whose job it is to do just one thing are not always so good at that one thing.
Weakly Informative Priors: What you should be doing when you think you want to use noninformative priors.
P-values and U-values: They're different.
Conservatism: In statistics, the desire to use methods that have been used before.
WWJD: What I think of when I'm stuck on an applied statistics problem.
Theoretical and Applied Statisticians, how to tell them apart: A theoretical statistician calls the data x, an applied statistician says y.
The Fallacy of the One-Sided Bet: Pascal's wager, lottery tickets, and the rest.
Alabama First: Howard Wainer's term for the common error of plotting in alphabetical order rather than based on some more informative variable.
The USA Today Fallacy: Counting all states (or countries) equally, forgetting that many more people live in larger jurisdictions, and so you're ignoring millions and millions of Californians if you give their state the same space you give Montana and Delaware.
Second-Order Availability Bias: Generalizing from correlations you see in your personal experience to correlations in the population.
The "All Else Equal" Fallacy: Assuming that everything else is held constant, even when it's not gonna be.
The Self-Cleaning Oven: A good package should contain the means of its own testing.
The Taxonomy of Confusion: What to do when you're stuck.
The Blessing of Dimensionality: It's good to have more data, even if you label this additional information as "dimensions" rather than "data points."
Scaffolding: Understanding your model by comparing it to related models.
Ockhamite Tendencies: The irritating habit of trying to get other people to use oversimplified models.
Bayesian: A statistician who uses Bayesian inference for all problems even when it is inappropriate. I am a Bayesian statistician myself.
Multiple Comparisons: Generally not an issue if you're doing things right but can be a big problem if you sloppily model hierarchical structures non-hierarchically.
Taking a model too seriously: Really just another way of not taking it seriously at all.
God is in every leaf of every tree: No problem is too small or too trivial if we really do something about it.
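To make the USA Today Fallacy concrete, here is a minimal sketch comparing an average that counts states equally with one that counts people equally. The populations and rates are made-up numbers for illustration only, and the function names are hypothetical, not taken from any package.

```python
# Hypothetical numbers, for illustration only:
# state -> (population in millions, some outcome rate)
states = {
    "California": (37.0, 0.60),
    "Montana": (1.0, 0.40),
    "Delaware": (0.9, 0.50),
}


def unweighted_mean(data):
    """Average that gives every state the same weight (the fallacy)."""
    rates = [rate for _, rate in data.values()]
    return sum(rates) / len(rates)


def population_weighted_mean(data):
    """Average that gives every person the same weight."""
    total_pop = sum(pop for pop, _ in data.values())
    return sum(pop * rate for pop, rate in data.values()) / total_pop


equal_states = unweighted_mean(states)
equal_people = population_weighted_mean(states)

print(f"states counted equally: {equal_states:.3f}")
print(f"people counted equally: {equal_people:.3f}")
```

With these invented inputs the two summaries differ noticeably, because the unweighted version treats California's tens of millions of residents as one data point alongside Montana and Delaware.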
I know there are a bunch I'm forgetting; can you all refresh my memory, please? Thanks.
P.S. No, I don't think I can ever match Stephen Senn in the definitions game.
marcel May 25, 2009 4:54 PM
In WWJD, you say, "My quick answer is, Yeah, I think it would be excellent for an econometrics class if the students have applied interests. Probably I'd just go through chapter 10 (regression, logistic regression, glm, causal inference), with the later parts being optimal."
So just skip the earlier parts?
Andrew Gelman May 25, 2009 7:23 PM
Marcel: When I say "through chapter 10," I mean, "from chapters 1 through 10." And in the last sentence above, I meant "optional," not "optimal." I'll fix that.
jonathan May 26, 2009 11:22 AM
Mister P, huh? Isn't that reflective of the old male dominant paradigm?