Data dredging is ok

2/1/2024

What is data dredging? How does it affect the p-value? What is its impact on the world around us? The following discussion will attempt to define data dredging and provide an answer to such questions.

Impact of data dredging on epidemiology

Data dredging is defined as “cherry-picking of promising findings leading to a spurious excess of statistically significant results in published or unpublished literature”. It is known by several names, such as ‘fishing trip’, ‘data snooping’ and ‘p-hacking’. We may use these terms interchangeably in the discussion below. The predominant reason for this practice is the widespread notion among academics that “statistically significant data is noteworthy, and one that is not statistically significant is not”. Part of the reason is also that the difference between inferential and descriptive use of statistics is often blurred and can be missed by novice epidemiologists. A significant part of any statistical estimate rests on the assumption that the correct statistical model has been fitted. Data dredging can sharply increase the risk of including large numbers of false-positive results, thereby corrupting the findings that were originally meant to be reported. One solution to the problem of data dredging is to use the Bonferroni correction.
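To make the Bonferroni correction mentioned above concrete, here is a minimal sketch. It assumes the usual form of the correction, in which each of m tests is compared against alpha / m instead of alpha. The function name and the p-values are hypothetical and chosen only for illustration.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p-values stay significant after a Bonferroni correction.

    Each p-value is compared against alpha / m, where m is the number
    of tests performed (not just the number of tests reported).
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Hypothetical p-values from five tests run on the same data set.
p_values = [0.001, 0.012, 0.03, 0.04, 0.20]

uncorrected = [p <= 0.05 for p in p_values]
corrected = bonferroni_significant(p_values)

print("significant without correction:", sum(uncorrected))
print("significant with Bonferroni:   ", sum(corrected))
```

With these made-up numbers, four results pass the uncorrected 0.05 cut-off but only one passes the corrected threshold of 0.01. The correction is conservative, so it only helps if every test that was run is counted in m.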

If you run many repeated statistical tests (multiple comparisons) on a data set, some will be statistically significant by chance. Such a finding may not reflect a true relationship: it is spurious, and any correlation found is due to chance. Data dredging is also referred to as fishing, p-hacking, significance chasing or data snooping. To avoid this bias, it is now common practice to register clinical trials and to specify the primary endpoints and hypotheses in advance.
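The effect of multiple comparisons can be put into numbers. The sketch below assumes the tests are independent and that every null hypothesis is true, in which case the chance of at least one false positive among m tests at level alpha is 1 - (1 - alpha)^m. The function name is hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests
    when no real effect exists in any of them."""
    return 1.0 - (1.0 - alpha) ** num_tests


for m in (1, 5, 20, 100):
    rate = familywise_error_rate(m)
    print(f"{m:>3} tests: {rate:.1%} chance of at least one false positive")
```

Under these assumptions, 20 tests at the 0.05 level already give roughly a 64% chance of at least one "significant" result that is pure noise. Real tests on one data set are often correlated, so the exact figure will differ, but the direction of the effect is the same.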

Data dredging is the cherry-picking of results from multiple statistical tests on a data set to demonstrate a promising or attractive finding. This typically happens when a data set is examined too many times with many statistical tests, and only the results that come back statistically significant are reported or given attention. This leads to a spurious excess of false-positive, statistically significant results.
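A small simulation can illustrate the selective reporting described above. It is only a sketch under stated assumptions: every null hypothesis is true, so each p-value is drawn uniformly from [0, 1), and each simulated "study" runs 20 tests but reports only those with p < 0.05. All names and numbers are hypothetical.

```python
import random


def simulate_dredging(num_studies=2000, tests_per_study=20, alpha=0.05, seed=0):
    """Return the fraction of pure-noise studies with something to 'report'.

    Under a true null hypothesis a p-value is uniformly distributed,
    so random() stands in for the p-value of one test on noise.
    """
    rng = random.Random(seed)
    studies_with_finding = 0
    for _ in range(num_studies):
        p_values = [rng.random() for _ in range(tests_per_study)]
        reported = [p for p in p_values if p < alpha]  # the cherry-picked results
        if reported:
            studies_with_finding += 1
    return studies_with_finding / num_studies


fraction = simulate_dredging()
print(f"studies with at least one 'significant' result: {fraction:.1%}")
```

Although no real effect exists anywhere in the simulated data, well over half of the studies end up with at least one result that looks publishable if the non-significant tests are left out.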
