38 Comments
Dr Sam Illingworth's avatar

Thanks Scott for this excellent take on the continued proliferation of the publish-or-perish dilemma and the extent to which AI is exacerbating it.

As chief executive editor of a couple of journals, and as an associate editor or peer reviewer for another dozen or so, I have watched submissions explode over the past two and a half years. The quality is also getting better, as you would expect as these models develop; however, something needs to be done at an ideological level to stop the deluge.

I know it seems extreme, but I wonder what would happen if researchers were limited to one publication per year and instead spent the rest of their time focusing on impact and the extent to which their research could genuinely benefit society. I'm being purposely provocative here but would welcome your thoughts.

scott cunningham's avatar

I doubt that the solution is a quantity restriction on researcher output. That isn't advisable from a scientific perspective, nor is it enforceable, nor does it benefit anyone involved. I'm not sure a solution is absolutely necessary yet. These dynamics will need to run their course.

CPR-Film's avatar

This is one of the clearest takes I've seen on AI-driven paper production. Thanks for the thoughtful write-up.

One more shift I think is coming is that our institutional and individual quality markers are mostly proxy metrics built for scarcity. In econ, tenure and hiring at top departments load heavily on “Top Five” placements (Heckman & Moktan document how dominant that marker became, although I admittedly don't have on-the-ground experience in an econ department). More broadly, academia leans on counts/citations/journal prestige because committees can’t read everything.

If agentic tools make submission-quality output much cheaper, those proxies inflate and get less informative (classic metric-targeting / Goodhart dynamics). So the bottleneck you describe stresses promotion and hiring evaluation too. That's basically your “more polished papers → harder to rank” point, but at the level of career evaluation. When lots of people can plausibly present a thick CV or a fatter pipeline, the signal shifts to things that are harder to mass-produce or fake. The legacy markers were built for a world where writing a competent paper was itself the costly signal. Once that cost collapses, institutions either (a) drown in inflated markers, or (b) re-anchor evaluation on scarcer signals.
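A toy way to see that inflation (all numbers are made up, purely to illustrate the Goodhart dynamic): treat "polish" as a signal of quality whose noise share grows as the cost of producing a polished paper falls.

```python
import random

def correlation(xs, ys):
    # Pearson correlation, computed by hand to keep the sketch dependency-free.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    return cov / (sx * sy)

def polish(quality, writing_cost):
    # When writing is costly, polish mostly reflects quality; when tools
    # make polish cheap, anyone can buy it, so it is mostly noise.
    noise_share = min(1.0, 1.0 / writing_cost)
    return (1 - noise_share) * quality + noise_share * random.random()

random.seed(0)
quality = [random.random() for _ in range(10_000)]
for cost in (10.0, 2.0, 1.0):  # falling cost of a submission-ready paper
    signal = [polish(q, cost) for q in quality]
    print(f"writing cost {cost:>4}: corr(polish, quality) = "
          f"{correlation(quality, signal):.2f}")
```

As the cost term falls, the correlation between polish and underlying quality drops toward zero, which is exactly the "proxies inflate and get less informative" point.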

Interesting times.

scott cunningham's avatar

Thank you for saying that! I polished this essay a lot more than my usual posts, and sat on it for several days.

The impact on hiring, tenure and promotion is for sure a big thing coming too. Knowledge of and interest in AI is already tremendously uneven across faculty, and I suspect even more so with Claude Code and agents in general. Those individuals make policy and govern their departments, though there is a hierarchy too (eg deans). It's hard to see how norms will evolve when combined with what I bet are very sticky preferences. And some who I would think would at least attempt to move ahead are encouraged to do the sorts of things I described, even though the gains are unclear, and the externalities on others are unclear too, through downward pressure on acceptance rates and the taxing of referee pools.

I frame all this as fan fiction, but that's mainly a hedge. I don't think this is a crazy scenario given what we've seen with book volume, the global effort to hack algorithms like the h-index through sheer volume, the various gray-market efforts to pump up citation numbers through various channels, not to mention scandals involving data fabrication and p-hacking. The "professionalization of science", call it, is what I bet is the limiting factor. The more this is a business, the more elastic the response.

I do wonder what this does to traditional markers of quality like impact factor and journal brand (eg top 5). I note that declining acceptance rates alone could make publishing more valued, if value is a function of scarcity. But that's a simplistic partial equilibrium analysis. I am not sure what to think at that level longer term, as you noted.

CPR-Film's avatar

In a world where these tools are being marketed in biblical terms ("1,000 PhDs in your pocket"), the hedged fan-fiction approach is a breath of fresh air.

Your observation that pattern-matching shortcuts (pedigree, affiliation, recognition) will become an even more important index as today's very sticky metrics slowly lose their utility likely applies here as well. Under the volume-plus-polish dynamics you're describing, those shortcuts are the lowest-friction heuristic left when everything looks submission-ready at first blush.

The irony is that the existing indices are meant to reward rigor regardless of origin.

Either way, I loved reading this, and I’m definitely sharing it with a few early-career friends who are trying to min-max their way to an academic career.

Richard Devine's avatar

Many prestigious authors already seem to publish more regardless of quality, so it seems reasonable that AI's advantages will enable them to benefit further.

I get your point that some papers will slip through, but how good is the paper that you spent an hour or so developing? I imagine to make it quite good you'd need to spend some time rewriting it and polishing it over the course of weeks and months, right?

Lastly, I'd be curious to see a video of your process, along with the prompts you used, for automating the creation of a paper in an hour or so.

scott cunningham's avatar

Good idea. I'll do that. My prompting is barely reproducible, if at all, since I just ramble and overwrite my request and trust that Claude understands me; LLMs extract the right signal and ignore the noise from reasonably large texts. But I'll do a video of me doing it.

Kirian Mischke-Reeds's avatar

This was a fascinating (if bleak) read. Scott, with AI output flooding the zone in terms of quantity (short-term) and quality (medium to long-term), at what point is the constraint on progress in the field our ability to understand it?

scott cunningham's avatar

I honestly can't figure out a coherent narrative on that question. It's the speed with which everything keeps changing that makes near-term predictions hard to form. I would never have thought Claude Code, or AI agents generally, would mean what they mean. When I first heard about agents, this wasn't even a possibility I considered. And so now it keeps feeling like we're circling a drain faster and faster, and I'm not sure where it'll spit any of us out.

Alexander Kustov's avatar

Good post. But is it really the case that refine.ink is, or always will be, superior to an elaborate prompt/skill in Claude Code / Codex? At least in my experience so far, I've been able to get rather similar results from both systems.

scott cunningham's avatar

I would never bet against the big companies. One day they come up with some seemingly inconsequential update and refine.ink is gone. And their economies of scale allow them to compete on price, so to your question I suspect -- no, I don't think any secondary downstream AI boutique service is going to make it.

Joseph Francis's avatar

You can get a very similar service for $7 at isitcredible.com.

Pawel Jozefiak's avatar

The prisoner's dilemma framing is the honest version of the Claude Code productivity story that rarely gets told. Individual benefits are real; collective adoption collapses the system everyone is optimizing for.
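For concreteness, here is the game in miniature; the payoff numbers are invented, and only their ordering matters:

```python
# Each researcher chooses whether to adopt agentic tools for mass
# submission. Payoffs are (you, rival); higher is better.
payoffs = {
    ("restrain", "restrain"): (3, 3),  # status quo publication odds
    ("adopt",    "restrain"): (5, 1),  # you flood the pipeline first
    ("restrain", "adopt"):    (1, 5),  # rival floods it instead
    ("adopt",    "adopt"):    (2, 2),  # everyone floods; the system degrades
}

for rival in ("restrain", "adopt"):
    best = max(("restrain", "adopt"), key=lambda me: payoffs[(me, rival)][0])
    print(f"if rival plays {rival!r}, your best reply is {best!r}")
# Both lines print 'adopt': adoption dominates individually, yet
# (adopt, adopt) leaves both worse off than mutual restraint.
```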

What is striking about the academic publishing case is that the arms-race dynamic plays out faster there than elsewhere because the feedback loop is so clear -- you can measure submission volumes and acceptance rates. The same pattern runs through any workflow Claude Code touches: it changes what you attempt, not just how fast you attempt it.

That shift in ambition is where the real instruction design challenge lives. After 1,000+ sessions of daily use, what actually survives in CLAUDE.md looks very different from what most guides recommend: https://thoughts.jock.pl/p/how-i-structure-claude-md-after-1000-sessions

John Severini's avatar

Very much agree, but do you not see the journal model itself falling away? Presumably at some point we will come up with some sort of automated system to separate the wheat from the chaff. And one could imagine that such a system might lead us into an era of a sort of "research factory" model. What that might look like in practice is hard to say, but it certainly wouldn't be the journal model.

scott cunningham's avatar

I honestly don't think so. It probably does go away if human researchers are no longer in possession of the comparative advantage in doing research, but if we stay involved, I bet journals do too. Though honestly, what do I know. Six months ago I didn't think Claude Code was a possibility.

John Severini's avatar

Fair point. Path dependency in academia has historically been really difficult to overcome.

Vlad Tarko's avatar

If the revenues of journals increase, they will be able to start paying the human reviewers. (European Economic Review is already doing it, not sure when they started.)

scott cunningham's avatar

Yeah, but it's not like that resolves it either -- what's to stop human reviewers from pocketing that money and paying refine.ink for the review?

Vlad Tarko's avatar

Honor system? :) Presumably, the authors and editors will use refine before sending the papers to reviewers, so the reviewers will have to deliver some extra value.

Owen Lewis's avatar

What does this look like when the whole process of scientific discovery gets automated? I'm thinking of the "hard" sciences, where for some things at least the entire process can be run almost without human intervention.

scott cunningham's avatar

If the entire enterprise can be done more accurately at lower cost, then the only obvious reason it wouldn't be automated is Luddite-style "machine breaking" behavior by those who stand to lose the most from pure automation. And that's a bit of a more challenging situation because science is allegedly for all of humanity for all of time, and not just for those workers who produce it.

Andreas Ortmann's avatar

Great piece, Scott. Just forwarded it to my co-editors at EE. It's not clear to me how much experimental economics will be affected, but we are -- this year so far -- way up in terms of submissions. (Those could be the result of a couple of SIs that we have in the works, not sure.)

Kamran Soomro's avatar

If LLMs are writing the papers and LLMs are reading the papers, what even is the point of publishing? Who are the publications for?

scott cunningham's avatar

LLMs are submitting papers, not necessarily publishing them. That's the point of the post -- producing papers and publishing papers are two distinct things. They always were, but the wedge is potentially larger now, since at minimum you previously could not publish something you had not first written, but now in principle you can -- if not today, then probably very soon. Even if not at the AER.

The point of all this, though, is the gains of science come through some production function, which involves verification for accuracy. That's always been there and will always be there. And I suspect we will always need as humans some markers of signaled quality (now more than ever) to just trust that the results are accurate. It's why we have meta-analyses for instance. There's just so much there that needs summarizing.

The question is who has the comparative advantage in all parts of the pipeline, not who needs the output of that pipeline. Humans need the scientific output of the pipeline. That will always be the case. But my post is saying that the pipeline *might* be strained by the ability to produce fully automated papers at scale. Published work, or some equivalent of it, still gives humans the ability to discern truth, but that isn't the same thing as saying humans are required for it. It's no longer the craziest thing, for me anyway, to imagine a near future where humans no longer have a comparative advantage in research tasks they have always had it in. In which case, the way that work will be done and verified -- honestly, who knows. Keep in mind this is fan fiction, though.

Kamran Soomro's avatar

I get all that but one of the possible scenarios is that LLMs review the submissions to help scale up. In that scenario what even is the point of publishing anymore?

Perhaps this is my delusion, but given the extreme looming uncertainty, my coping mechanism has been to delve even more deeply into my field and try to become even more of an expert in it.

scott cunningham's avatar

Yeah, I mean, you control what you can control. It's not like any of us have ever had a massive amount of control over getting our stuff published in the first place. You focus on what you want to focus on, what you can control, what you can influence. I do the things I do because I love how they make me feel. And who knows -- this could remain dystopian fiction. There are bottlenecks I note that make me think 3-d printing papers and submitting them could get identified (where is someone hiding those 75 papers?), in which case who knows -- maybe it'll be discouraged. It does impose costs on others, if only via the noise it injects into the system, plus the papers are likely low quality, and that too is not cool. Be a researcher. I think continuing to invest in skills and our own selves is always the right answer, and frankly, for me anyway, that means investing in the things that bring me joy, which is my own knowledge, growth and pursuits.

Alejandro Lopez-Lira's avatar

Your post made me rush to post the draft haha

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6337880

scott cunningham's avatar

lol -- great minds think alike! I'll be sure to post about this paper!

Tjeerd Boonstra's avatar

The same applies to grant applications.

JP's avatar

That $3,200 annual cost figure is wild when you flip it. I did the maths on Pro and Max plans versus what Anthropic charges through the API and the gap is almost comical. You're looking at 10-13x the value of what you actually pay. Wrote it up here: https://reading.sh/why-your-expensive-claude-subscription-is-actually-a-steal-02f10893940c?sk=65a39127cbd10532ba642181ba41fb8a
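For anyone who wants to redo that maths, a back-of-envelope sketch; every number below is a placeholder assumption, not Anthropic's actual pricing, so plug in current rates and your own usage:

```python
# Compare a flat subscription against what the same token volume would
# cost at per-token API rates. All figures are illustrative placeholders.

MONTHLY_SUBSCRIPTION = 100.0   # assumed flat subscription price, $/month
API_COST_PER_MTOK_IN = 3.0     # assumed API price, $ per million input tokens
API_COST_PER_MTOK_OUT = 15.0   # assumed API price, $ per million output tokens
TOKENS_IN_PER_MONTH = 300e6    # assumed heavy agentic usage, input tokens
TOKENS_OUT_PER_MONTH = 30e6    # assumed output tokens

api_equivalent = (TOKENS_IN_PER_MONTH / 1e6 * API_COST_PER_MTOK_IN
                  + TOKENS_OUT_PER_MONTH / 1e6 * API_COST_PER_MTOK_OUT)
print(f"API-equivalent spend: ${api_equivalent:,.0f}/month")
print(f"value multiple vs subscription: {api_equivalent / MONTHLY_SUBSCRIPTION:.1f}x")
```

With these placeholder figures the multiple lands around 13x, which is the shape of the gap described above; the real number depends entirely on actual prices and usage.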

scott cunningham's avatar

whoa. I'm going to read that closely, but that was not what I was expecting you to find.

JP's avatar

yeah, even a 20x max sub is a tasting plate lol

Fulvio Castellacci's avatar

Great post, thanks Scott! Among other things, another possible consequence of what you describe is that oral presentation skills might progressively become a more important way to assess a researcher's capabilities than his or her written research production. For instance, imagine a peer-review process in which the submitting author has to present the work to the editors and reviewers, just like in a seminar.

William's avatar

I'm not an academic, but I'm curious whether the paper-mill slop problem could be managed if submission fees were not fixed. For example, the cost of each additional submission could be double the previous one (with the fee resetting to a baseline amount at the start of the year).
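A minimal sketch of that schedule; the $50 baseline is a made-up figure, purely for illustration:

```python
def submission_fee(nth_submission_this_year: int, base: float = 50.0) -> float:
    """Fee for a researcher's nth submission in the current year (1-indexed);
    doubles with each submission and resets to the baseline every January."""
    return base * 2 ** (nth_submission_this_year - 1)

total = 0.0
for n in range(1, 9):
    fee = submission_fee(n)
    total += fee
    print(f"submission {n}: ${fee:>8,.0f}  (cumulative ${total:>9,.0f})")
# An eighth submission alone costs $6,400 ($12,750 cumulative): cheap
# for occasional authors, prohibitive for a paper mill.
```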

scott cunningham's avatar

That's a pretty straightforward and not-bad idea, actually. I bet rising prices like that could be justified because the externality creates rising marginal costs -- the 5th paper is less costly to editors and referees than the 6th, and so on. So yeah, I could easily see that, and it would likely produce at least some queuing.