Back in July, Julia Belluz, Brad Plumer, and Brian Resnick wrote up this beautiful thing, which you need to go read right now. I said at the time that I would reply, and then I got lazy. So this is me doing that now, in 7 parts, structured as their big beautiful article is.
Rough Draft as of Nov 06. Will be updating with links and more careful thought on an irregular basis.
Part II: The Incentives Problem
The problem is parallel to what Bill Barnwell railed against in football for years at Grantland – that incentives are set up to favor good outcomes, rather than good processes, which perversely affects real outcomes.
P-hacking is the most egregious example. I think the key here is “they’re not always doing it consciously.” Most researchers are not operating in bad faith. But they are responding to incentives.
I think this quote is also key: “the scientist is in charge of evaluating the hypothesis, but the scientist also desperately wants the hypothesis to be true.” But of course the objective of science is to chase down objective truth (I’m leaving philosophical issues off the table here; there’s a reason I dropped off the philosophy track as an undergrad, thank you Rorty et al.).
So this is pretty awesome: “An estimated $200 billion … is routinely wasted on poorly designed and redundant studies.” Well, how much of this is an incentives issue? Because redundancy should actually get punished by the current system – that sounds to me more like an information-overload problem (i.e. editors can’t be expected to catch every redundant study, especially in broadly defined fields). This ties into another conversation – how many PhDs is too many? – which I cover in another post.
Before reading the authors’ fixes, I would identify 3 culprits. First, funders – we want to see results! We want to know that our money was well spent. How can we do that if the results turn out negative? Boards don’t like seeing that you spent millions on research just to see that the policy advice is… do nothing. Second, journals – at core, you submit what you hope will get published, so this is kind of on the editors. But this requires a deeper dive into why the editors are biased toward positive results – doubt it’s about bad faith, so what’s up here? Finally, tenure – this is such a huge one. I don’t know how you change this but the tenure reward process is so supremely fucked up. It doesn’t reward anything outside of high level journal publications, basically, and we’ve seen how those are shit. God forbid you want to contribute to policy, or practice.
Process Over Outcome – Per Simine Vazire: “Grants, publications, jobs, awards, and even media coverage should be based more on how good the study design and methods were, rather than whether the result was significant or surprising.” Well, so, yes… This is my process-over-outcomes point. But let’s be honest about media/communications efforts here. It is a complete non-story (maybe even the definition of one!) that a study had no significant results but was conducted in a responsible way. It’s just completely unexciting. But researchers (tail excepted) don’t get rewarded for media coverage, so I don’t know that we actually care that much about it. Science can do the slow, difficult grind without reporting – it is not critically important to research’s success. As for the rest of the system needing to reward good process, more below…
I love Tim Gowers’ perspective — you only get credit for what you publish, not the informal sharing of ideas or anything like that. I mean, this is a longer conversation, but the current system was developed in a time of high transportation costs and high costs of information flows. Now we have cheap flights and the internet. We’re basically still completely organized around the conference & journal system, and collaborations within your own university (and frequently within your own department). It’s totally insane that we have retained journal articles as (basically) the only measurable outcome. But saying it’s broke is not enough. What is the fix here? [Note to self: Cue another article…]
Transparency! — Yes, transparency is great too: having to register your hypothesis ex ante, so that if it turns out to be false you’re accountable and can’t p-hack your way around it. Implementation ought to be pretty simple, we’d think: just make preregistration a funding requirement.
But this ties into the broader point I’ve been making about science since State of the Field, which despite being a horrific failure, was still I think correct in the fundamental idea (if not the execution). (Example page of what we were trying to do, content all by academics in the field). We need a live-updated forum of scientific ideas, which gets you the transparency above, and also hopefully cuts off some of the “race to be first.” It also helps with a) junior researchers figuring out where best to contribute, b) journalists having a readily available consensus of what’s going on, c) funders having a clear idea of the landscape and where the next marginal dollar will be most effective, and d) interdisciplinary researchers having an easier time crossing boundaries (which we know yields big, novel advances). I think this also helps with the tenure problem.
Gowers is completely right, but seriously, what else should departments be looking at? Well, you need new metrics. If everyone is on one “scientific platform” that is an actual physical (well, web) presence, you can pretty easily set up metrics around the rate at which a researcher comes up with and tests new ideas, and maybe even a peer-evaluated sense of how valuable that work is (and hey, compare against the average or median in that field – you can borrow ideas from sports here, maybe).
Because this is sort of key. A researcher’s value is a function of the speed of the work multiplied by its relative importance in getting done – REGARDLESS of whether the results end up being significant, we have to say ex ante that “yes Dr. Researcher, we as a scientific community think that testing that idea is valuable, and all we care about is whether you test it well, not which direction the result lands.” Let me finish with that sports metaphor, because it’s actually pretty funny. It’s like we’re still paying/rewarding based on home runs or batting average instead of WAR or on-base percentage or something. Only problem is, who’s going to be Billy Beane and upset the system? This isn’t a competitive landscape in that kind of way. I don’t know, but it’s hilarious and terrible that baseball is ahead of science on metrics.*
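To make the metaphor concrete, here’s a toy sketch of what a WAR-style “value above field” metric could look like. Everything here is hypothetical illustration – the field names, the scoring rule (speed × peer-rated importance, summed over studies), and the median baseline are all my assumptions, not a real scoring system; the only point is that significance of results appears nowhere in the formula.

```python
from statistics import median

def researcher_value(studies):
    """Toy score: how fast an idea was tested, times its peer-rated
    importance (judged ex ante), summed over a researcher's studies.
    Deliberately no term for whether the result was significant."""
    return sum(s["speed"] * s["importance"] for s in studies)

def value_above_field(researcher, field):
    """WAR-style comparison: the researcher's score minus the field
    median. 'field' is a list of researchers' study lists."""
    return researcher_value(researcher) - median(
        researcher_value(r) for r in field)

# Hypothetical example: Alice tests ideas quickly that peers rated
# as important, so she scores well above the field median -- even if
# every one of her results came back null.
alice = [{"speed": 3.0, "importance": 2.0}]   # score 6.0
bob   = [{"speed": 1.0, "importance": 1.0}]   # score 1.0
carol = [{"speed": 2.0, "importance": 1.0}]   # score 2.0
print(value_above_field(alice, [alice, bob, carol]))  # → 4.0
```

The design choice worth noticing: because the importance rating is fixed ex ante by peers, the score can’t be gamed after the fact by which way the data happened to land.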
Anyway I think I owe myself another article on what this kind of revamped system would look like, from whom you need institutional buy-in, who pays for it, etc. That’s a big beast of a problem, but I’m not sure there’s any single lever you can pull to fix it — kind of need a bunch of disparate actors to act at once, collectively.
*Yadda yadda, it’s far easier to do in baseball, since it’s all contained in a well-defined field and everything is recorded. But we put in the effort to set up 3D-tracking cameras in every stadium, and I’m going to go out on a limb and say long-run GDP is more a function of scientific efficiency than, uh, how fast Kris Bryant rounds the bases.**
**I don’t even really watch baseball, so I am pushing this metaphor pretty thin.