Northwestern University, Evanston, USA
The Replicability of Scientific Findings Using Human and Machine Intelligence
In top journals, more papers fail than pass replication tests, and papers that fail to replicate spread as widely as those that do. This dynamic raises research costs by an estimated more than $20 billion annually, jeopardizes the literature, and underscores the need for new methods of predicting replicability. Using 96 studies that underwent rigorous manual replication, we developed an artificial intelligence (AI) model that predicts a paper’s replicability. We then tested the model on 317 diverse out-of-sample studies spanning disciplines, methods, and topics. We find that AI predicts replicability better than reported statistics and individual reviewers, and as accurately as prediction markets, the gold standard among replicability methods. Further, AI generalizes to out-of-sample data, reaching AUC values up to 0.78. Finally, tests indicate that the AI model does not exhibit biases common to human reviewers. We discuss how AI can address replication problems at a scale that current methods cannot and can advance research by combining human and machine intelligence.