1. Incorrect results in software engineering experiments: How to improve research practices
- Author
- Magne Jørgensen, Tore Dybå, Knut Liestøl, and Dag I.K. Sjøberg
- Subjects
Computer science, business.industry, 05 social sciences, 020207 software engineering, 02 engineering and technology, Publication bias, 050105 experimental psychology, Statistical power, Hardware and Architecture, Statistics, 0202 electrical engineering, electronic engineering, information engineering, 0501 psychology and cognitive sciences, Software engineering, business, Software, Information Systems, Statistical hypothesis testing
- Abstract
Publication and researcher bias are common in software engineering experiments. Our model shows how these biases lead to a high proportion of incorrect results. Increased statistical power is a key factor in improving the trustworthiness of results.
Context: The trustworthiness of research results is a growing concern in many empirical disciplines.
Aim: The goals of this paper are to assess how much the trustworthiness of results reported in software engineering experiments is affected by researcher and publication bias, given typical statistical power and significance levels, and to suggest improved research practices.
Method: First, we conducted a small-scale survey to document the presence of researcher and publication biases in software engineering experiments. Then, we built a model that estimates the proportion of correct results for different levels of researcher and publication bias. A review of 150 randomly selected software engineering experiments published in the period 2002–2013 was conducted to provide input to the model.
Results: The survey indicates that researcher and publication bias are quite common. This finding is supported by the observation that the actual proportion of statistically significant results reported in the reviewed papers was about twice as high as the proportion expected assuming no researcher and publication bias. Our models suggest a high proportion of incorrect results even with quite conservative assumptions.
Conclusion: Research practices must improve to increase the trustworthiness of software engineering experiments. A key to this improvement is to avoid conducting studies with unsatisfactorily low statistical power.
- Published
- 2016
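
The abstract's link between statistical power and the proportion of correct results can be illustrated with a standard textbook calculation. The sketch below is not the authors' model: it ignores researcher and publication bias entirely and only computes the share of statistically significant results that reflect a true effect. The function name and the `prior_true` value (the assumed share of tested hypotheses that are actually true) are hypothetical choices made for illustration.

```python
def proportion_correct_significant(power: float, alpha: float, prior_true: float) -> float:
    """Share of statistically significant results that reflect a true effect.

    Assumes no researcher or publication bias (a simplification).
    power      -- probability of detecting a true effect
    alpha      -- significance level (false-positive rate under the null)
    prior_true -- assumed share of tested hypotheses that are true (hypothetical)
    """
    true_positives = power * prior_true
    false_positives = alpha * (1.0 - prior_true)
    return true_positives / (true_positives + false_positives)


# Compare low, medium, and high power at a fixed significance level.
for power in (0.3, 0.5, 0.8):
    share = proportion_correct_significant(power, alpha=0.05, prior_true=0.3)
    print(f"power={power:.1f}: share of correct significant results={share:.2f}")
```

Under these assumptions the share of correct significant results rises with power, which is consistent with the abstract's recommendation to avoid low-powered studies. Adding bias terms would lower the share further.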