You may be interested in this paper in Political Analysis by some RAND researchers: https://www.cambridge.org/core/journals/political-analysis/article/stay-tuned-improving-sentiment-analysis-and-stance-detection-using-large-language-models/2D8F121012D3D1CB2259B6DD5EE32D0D
They find that combining in-target tuning with some additional prompting strategies improves classification by a good margin.
Thanks for sharing this so publicly, Scott. I'm really enjoying following along and learning how to use this approach in my own research as well.
What processes are you using to record the steps you've taken for quality assurance as a human? I imagine that would be really interesting, and in some instances necessary, to report if this were ever published as peer-reviewed research.
Hi Scott. Super interesting experiment and blog!
I'm currently engaged in a similar endeavor with forum posts, but I keep running into token restrictions when trying to use the cheaper batch processing, in my case on Google Cloud. OpenAI is supposedly even more restricted in this regard. To understand your workflow and data: when you say 300k documents, are these chunks of speeches? What is the overall token count of the speech data?
Thanks, and looking forward to reading your continued blog tomorrow.