Global Strategy - blog
This is the blog section of Glostra website
Jan 25, 2012
Quantitative management research: torture or interrogation?
Published in research, institutions, academia by Jukka Luoma
In recent times, there has been a lot of discussion about the questionable practices of scholars, universities and publishers. In a related fashion, the economist Ronald Coase once said that "if you torture the data long enough, Nature will confess." He referred to a common research practice of flexibly changing one's model, collecting more data and using different measures until you find interesting and publishable results; it is likely that at some point you will find statistically significant results purely by chance. For a qualitative researcher, flexibility is a good thing. In fact, going back and forth between data and theory is the primary mode of doing qualitative research. However, in quantitative research, flexibility is somewhat counter-intuitively considered a bad thing. Let me explain.
It is a common convention in research that when identifying relationships between variables (e.g., employee turnover and profitability), one has to report the likelihood that the relationship found in a single study is due to chance. When there is only a small probability (typically, at most five percent) that the relationship is the product of chance, it is called statistically significant. If your results are statistically significant, according to conventions, you have then provided empirical evidence that supports your theoretical argument. The problem is that the reliability of statistical significance itself depends on the analysis process. As shown in a recent article in Psychological Science, and quoted by the strategyprofs.net blog:
(You can see the original article for a more thorough discussion, but basically the reason is this: if there is a 5 percent chance that your results are wrong, and you test two independent models and pick the one which supports your claims better, then there is a 9.75 percent chance that you are wrong.)
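The arithmetic behind the 9.75 percent figure can be sketched in a few lines of Python. This is only a minimal illustration, assuming the tests are independent and each carries the same five percent false-positive rate; the function name `chance_of_false_positive` is made up for this example and does not come from the original article.

```python
def chance_of_false_positive(alpha, n_tests):
    """Probability that at least one of n_tests independent tests
    comes out significant purely by chance, given a per-test rate alpha."""
    return 1 - (1 - alpha) ** n_tests


# Two models, each with a 5 percent chance of a spurious result
# (independence between the two tests is an assumption of this sketch).
two_models = chance_of_false_positive(0.05, 2)
print(f"{two_models:.4f}")  # prints 0.0975

# The risk keeps growing as more model specifications are tried.
for n_tests in (1, 2, 5, 10, 20):
    print(n_tests, round(chance_of_false_positive(0.05, n_tests), 4))
```

In practice, models fitted to the same data are rarely independent, so the real inflation is usually smaller than this formula suggests, but the direction of the bias is the same.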
Being "flexible" in the data collection and analysis process biases the statistical significance of the results. This makes it difficult for the reader to assess the plausibility of the empirical evidence provided by the researcher to support his or her argument. This is why flexibility is bad in quantitative research. The aforementioned article in Psychological Science suggests that researchers should aim to remove flexibility from the data collection and analysis and disclose whatever flexibility is left in the process. This allows the reader to better assess how reliable the claims of statistical significance truly are. Similar concerns about the validity of management research were raised by William Starbuck in his entertaining book The Production of Knowledge.
However, there is another twist which often goes unnoticed. The foregoing discussion illustrates what I think is quite common: Researchers usually worry about statistical significance (i.e., the probability of finding a correlation between two variables when there really is no correlation) rather than what is called statistical power (i.e., the probability of detecting a correlation when there actually is one). A lack of power, that is, failing to detect a correlation that really exists, is also often quite likely and very serious. Consider for example the fact that management and strategy researchers typically measure things with error rather than perfectly. Even when doing simple correlations, measurement error increases the chances of finding non-significant relationships between variables which actually correlate.
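To make the measurement-error point concrete, here is a rough sketch using only the Python standard library. It combines Spearman's attenuation formula (the expected observed correlation equals the true correlation times the square root of the product of the two reliabilities) with a normal approximation to power based on the Fisher z transformation. The true correlation, the reliabilities and the sample size are hypothetical values chosen for illustration, not figures from any study.

```python
from math import atanh, sqrt
from statistics import NormalDist


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Spearman's attenuation formula: expected observed correlation
    when both variables are measured with random error."""
    return true_r * sqrt(reliability_x * reliability_y)


def approximate_power(r, n, alpha=0.05):
    """Approximate power of a two-sided test of zero correlation,
    using the Fisher z transformation (a normal approximation)."""
    normal = NormalDist()
    critical = normal.inv_cdf(1 - alpha / 2)
    shift = atanh(r) * sqrt(n - 3)
    return normal.cdf(shift - critical) + normal.cdf(-shift - critical)


true_r = 0.30   # hypothetical true correlation
n = 100         # hypothetical sample size
observed_r = attenuated_correlation(true_r, 0.7, 0.7)  # hypothetical reliabilities

power_without_error = approximate_power(true_r, n)
power_with_error = approximate_power(observed_r, n)
print(round(observed_r, 2), round(power_without_error, 2), round(power_with_error, 2))
```

Under these assumed values the correlation the researcher can expect to observe shrinks, and the chance of detecting it at the five percent level drops with it, even though the underlying relationship has not changed.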
Many real-life situations are such that the error of omitting an effect may be just as bad as erroneously attributing one. In technical terms, statistical power may be as important as statistical significance. Suppose that we want to find out whether carbon dioxide emissions induce global warming. Surely, it is important to have some degree of certainty that the two variables are related (statistical significance). However, failure to recognize that they are related (lack of statistical power) may be an even greater risk. Likewise, we may assume that managers of business organizations are as worried about the risk of not knowing the potential consequences of their actions as they are about the risk of doing something which will not have an effect.
Therefore, I do not think there is an easy answer to the problem raised by Professor Coase. While I agree that transparency and ethical conduct in research are good things, the scientific community and individual researchers face a trade-off between the risk of not finding results (false negatives) and the risk of finding results that are products of randomness (false positives). A glimpse of hope is offered by the fact that even if individual papers report false positives, replication should correct this over time. Of course, this depends on the editorial policies of journals. Unfortunately, the top management journals are not big fans of replication, encouraging researchers to seek novel insights rather than replicate earlier findings.
To conclude, I think that the validity of management research depends not only on individual-level practices but also on journals, universities, publishers, scholarly search and citation indexing services and other institutions.
Comments (2)
... written by Mikko Ketokivi, January 27, 2012, 11:05
Hi -- you do a pretty good job of describing bad quantitative research. For balance, one might discuss good quantitative research -- plenty of that out there, too. In my experience, rigorous peer-review processes do a pretty good job of quality control these days.
It's important to separate the method from its misuse -- statistical inference is one of the most important inferential tools in management research. For the record, I am a quantitative researcher and do not consider going back and forth between data and theory a bad thing; indeed, I often recommend it. To clarify, it is equally important not to equate statistical inference with the hypothetico-deductive method. The former is an inferential tool; the latter, a research design. The latter, by definition, emphasizes theorizing before empirical analysis. Similarly, the grounded theory approach is not synonymous with qualitative research.
Your lamentations over statistical power are familiar, but can you think of an authentic example from management research where lack of statistical power led to dismissal of a theoretically (or practically) relevant statistical association? I cannot. My experience is that with the typical sample sizes management researchers use, statistical insignificance is indicative of the association being so weak it does not warrant our attention. This of course does not mean we should dismiss statistical power, only that we should be more realistic about exactly how big a problem the lack of it really is. I'm not a big fan of basing arguments on hypotheticals. Unfortunately, it seems to me that the bulk of the critique regarding lack of power is based on such hypotheticals.
But all this is highly context dependent. In other fields of inquiry, lack of statistical power may be of much greater concern. Whenever people's lives are at stake, both type I and type II errors must be taken much more seriously.
Bad quantitative (or qualitative) research is not going to disappear any time soon. The best course of action, I think, is to pay no attention to it. I don't think bad research warrants critical appraisal.
Regards, Mikko Ketokivi
I had been reading some comments--in both psychology and management--that all the research out there is highly unreliable because of the flexible practices employed in statistical analyses (references are in the original post). I was a bit frustrated about the one-sidedness of the discussion. Statistical power, I think, is just one counter-argument for why flexibility might not be a bad thing. I think our conclusions are not that different. Perhaps my overly negative tone masked the point.
I do not know of any examples of the kind of false negatives you refer to, although I think it would be fairly difficult to identify them. In fact, I think they are eliminated partly because researchers use statistical analyses iteratively.
Finally, I also think that peer review is important. However, replicating, building on, refuting and extending other people's work is another mechanism affecting the quality of research. I think that the overall process, and the practices and institutions within it, in addition to the practices employed in single studies, are important factors affecting where research is going and how valid it is.