How Many Replicators Does It Take to Achieve Reliability? Investigating Researcher Variability in a Crowdsourced Replication
- Author
Nate Breznau, Eike Mark Rinke, Alexander Wuttke, Hung Hoang Viet Nguyen, Muna Adem, Jule Adriaans, Esra Akdeniz, Amalia Alvarez-Benjumea, Henrik Kenneth Andersen, Daniel Auer, Flavio Azevedo, Oke Bahnsen, Ling Bai, Dave Balzer, Gerrit Bauer, Paul Bauer, Markus Baumann, Sharon Baute, Verena Benoit, Julian Bernauer, Carl Berning, Anna Berthold, Felix S. Bethke, Thomas Biegert, Katharina Blinzler, Johannes Blumenberg, Licia Bobzien, Andrea Bohman, Thijs Bol, Amie Bostic, Zuzanna Brzozowska, Katharina Burgdorf, Kaspar Burger, Kathrin Busch, Juan Carlos Castillo, Nathan Chan, Pablo Christmann, Roxanne Connelly, Christian S. Czymara, Elena Damian, Eline Adriane de Rooij, Alejandro Ecker, Achim Edelmann, Christina Eder, Maureen A. Eger, Simon Ellerbrock, Anna Forke, Andrea Gabriele Forster, Danilo Freire, Chris Gaasendam, Konstantin Gavras, Vernon Gayle, Theresa Gessler, Timo Gnambs, Amélie Godefroidt, Max Grömping, Martin Groß, Stefan Gruber, Tobias Gummer, Andreas Hadjar, Verena Halbherr, Jan Paul Heisig, Sebastian Hellmeier, Stefanie Heyne, Magdalena Hirsch, Mikael Hjerm, Oshrat Hochman, Jan H. Höffler, Andreas Hövermann, Sophia Hunger, Christian Hunkler, Nora Huth, Zsofia Ignacz, Sabine Israel, Laura Jacobs, Jannes Jacobsen, Bastian Jaeger, Sebastian Jungkunz, Nils Jungmann, Jennifer Kanjana, Mathias Kauff, Salman Khan, Sayak Khatua, Manuel Kleinert, Julia Klinger, Jan-Philipp Kolb, Marta Kolczynska, John Seungmin Kuk, Katharina Kunißen, Dafina Kurti Sinatra, Alexander Greinert, Robin C. Lee, Philipp M. Lersch, David Liu, Lea-Maria Löbel, Philipp Lutscher, Matthias Mader, Joan Eliel Madia, Natalia Malancu, Luis Maldonado, Helge Marahrens, Nicole Martin, Paul Martinez, Jochen Mayerl, Oscar Jose Mayorga, Robert Myles McDonnell, Patricia A. McManus, Kyle Wagner, Cecil Meeusen, Daniel Meierrieks, Jonathan Mellon, Friedolin Merhout, Samuel Merk, Daniel Meyer, Leticia Micheli, Jonathan J.B. Mijs, Cristóbal Moya, Marcel Neunhoeffer, Daniel Nüst, Olav Nygård, Fabian Ochsenfeld, Gunnar Otte, Anna Pechenkina, Mark Pickup, Christopher Prosser, Louis Raes, Kevin Ralston, Miguel Ramos, Frank Reichert, Arne Roets, Jonathan Rogers, Guido Ropers, Robin Samuel, Gregor Sand, Constanza Sanhueza Petrarca, Ariela Schachter, Merlin Schaeffer, David Schieferdecker, Elmar Schlueter, Katja Schmidt, Regine Schmidt, Alexander Schmidt-Catran, Claudia Schmiedeberg, Jürgen Schneider, Martijn Schoonvelde, Julia Schulte-Cloos, Sandy Schumann, Reinhard Schunck, Juergen Schupp, Julian Seuring, Henning Silber, Willem W. A. Sleegers, Nico Sonntag, Alexander Staudt, Nadia Steiber, Nils Steiner, Sebastian Sternberg, Dieter Stiers, Dragana Stojmenovska, Nora Storz, Erich Striessnig, Anne-Kathrin Stroppe, Jordan Suchow, Janna Teltemann, Andrey Tibajev, Brian B. Tung, Giacomo Vagni, Jasper Van Assche, Meta van der Linden, Jolanda van der Noll, Arno Van Hootegem, Stefan Vogtenhuber, Bogdan Voicu, Fieke Wagemans, Nadja Wehl, Hannah Werner, Brenton M. Wiernik, Fabian Winter, Christof Wolf, Cary Wu, Yuki Yamada, Björn Zakula, Nan Zhang, Conrad Ziller, Stefan Zins, and Tomasz Żółtak
- Subjects
Ecological validity, Behavioural sciences, Data science, Workflow, Replication (statistics), Reliability (statistics), Sociology, SocArXiv|Social and Behavioral Sciences, SocArXiv|Social and Behavioral Sciences|Sociology, SocArXiv|Social and Behavioral Sciences|Sociology|Methodology, SocArXiv|Social and Behavioral Sciences|Sociology|Political Sociology, SocArXiv|Social and Behavioral Sciences|Political Science, bepress|Social and Behavioral Sciences, bepress|Social and Behavioral Sciences|Sociology, bepress|Social and Behavioral Sciences|Sociology|Civic and Community Engagement, bepress|Social and Behavioral Sciences|Sociology|Quantitative, Qualitative, Comparative, and Historical Methodologies, bepress|Social and Behavioral Sciences|Political Science
- Abstract
This paper reports findings from a crowdsourced replication. Eighty-five independent teams attempted a computational replication of results reported in an original study of policy preferences and immigration by fitting the same statistical models to the same data. The replication involved an experimental condition: random assignment placed participating teams into either a transparent group, which received the original study and its code, or an opaque group, which received only a methods section and a rough description of the results, without code. The transparent group mostly verified the numerical results of the original study with the same sign and p-value threshold (95.7%), while the opaque group had less success (89.3%). Exact numerical reproductions to the second decimal place were far less common (76.9% and 48.1%, respectively), and the share of teams that verified at least 95% of all effects in all models they ran was 79.5% and 65.2%, respectively. The reliability we quantify therefore depends on how reliability is defined, but most definitions suggest it would take a minimum of three independent replications to achieve reliability. Qualitative investigation of the teams’ workflows reveals many causes of error, including mistakes and procedural variations. Although minor error across researchers is not surprising, we show that it occurs where it is least expected: in computational reproduction. Even when we curate the results to boost ecological validity, the error remains large enough to undermine reliability between researchers to some extent. The presence of inter-researcher variability may explain some of the current “reliability crisis” in the social sciences, because it may go undetected in all forms of research involving data analysis. The obvious implication of our study is more transparency. The broader implications are that researcher variability adds a meta-source of error that may not derive from conscious measurement or modeling decisions, and that replications alone cannot resolve this type of uncertainty.
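One way to make the "minimum of three independent replications" claim concrete is to treat each replication as an independent Bernoulli trial and ask when a majority of replications verifies the original result. The sketch below is purely illustrative and is not the paper's method: the majority-vote reading of "reliability", the function name, and the choice of k values are assumptions; only the verification rates (95.7% and 89.3%) come from the abstract above.

```python
# Illustrative sketch (not the paper's method): probability that a majority
# of k independent replications verifies a result, given a per-replication
# verification rate p, modeled as a binomial tail.
from math import comb

def majority_agreement(p: float, k: int) -> float:
    """P(more than k/2 of k independent replications verify the result)."""
    need = k // 2 + 1  # smallest majority of k
    return sum(comb(k, m) * p**m * (1 - p)**(k - m) for m in range(need, k + 1))

# Verification rates reported in the abstract (same sign and p-value threshold):
# transparent group 0.957, opaque group 0.893.
for p in (0.957, 0.893):
    for k in (1, 3, 5):
        print(f"p={p}, k={k}: majority agrees with prob. {majority_agreement(p, k):.3f}")
```

Under this (assumed) reading, a single opaque-group replication verifies with probability 0.893, while a majority of three does so with probability about 0.968, which is loosely consistent with the claim that at least three replications are needed before agreement becomes dependable.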
- Published
- 2021