Great experiment, especially as we move into a world where covariate selection is more and more likely to be delegated to an LLM.
The next step is to get 15 traditional econ labs to run this experiment in isolation and compare heterogeneity on SD, SE, package, and covariate selection. My intuition is that they would not fare better than the agents.
Given that labs can charge upward of 100k for a QED, this will only cost a smooth 1.5m... on second thought, maybe the next step is to test with more agents.
True. That’s an interesting counterfactual I hadn’t thought about, but comparing human vs. AI variation at scale is probably a great next step.