Talk:Data dredging



Easing the verbiage of the Introduction

I think that the intro defines the most egregious case of data dredging, and not data dredging in general.

"The process of data dredging involves automatically testing huge numbers of hypotheses about a single data set by exhaustively searching"

This is not necessarily true. If I test 4 hypotheses about a single data set, only one turns up significant and I report only that one hypothesis, then I have committed p-hacking/data-dredging. This is an important distinction because this involved neither huge numbers of hypotheses nor an exhaustive search.

I'm going to let this sit for a day, and if nobody has objections, I will implement the changes. Ihearthonduras (talk) 18:45, 24 January 2018 (UTC)

I agree with Ihearthonduras - it's more than just automated testing. This happens just as much by hand. Lionfish0 (talk) 08:44, 4 February 2019 (UTC)

I tried changing this but it was reverted. The reason given was 'according to whom', but the whole section is unreferenced, so it's better for it to be correct and unreferenced than wrong and unreferenced. Here's a reference that might do: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002106 - Preceding unsigned comment added by Lionfish0 (talk • contribs) 09:24, 5 February 2019 (UTC)
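
For what it's worth, the four-hypothesis case raised at the top of this thread can be checked numerically. The following is only a rough sketch in Python, assuming that all four null hypotheses are true, that the tests are independent, and that a 0.05 threshold is used. The function names are made up for illustration and are not taken from the article or from any source.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests of a
    true null hypothesis comes out significant at level alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_selective_reporting(num_tests: int = 4, alpha: float = 0.05,
                                 trials: int = 100_000, seed: int = 0) -> float:
    """Estimate how often a researcher who runs num_tests tests and reports
    only the best one ends up with a 'significant' finding by chance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: under a true null, each p-value is uniform on [0, 1].
        p_values = [rng.random() for _ in range(num_tests)]
        if min(p_values) < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("analytic: ", familywise_error_rate(4))
    print("simulated:", simulate_selective_reporting())
```

Under those assumptions the chance of having at least one "significant" result to report is about 18.5%, not 5%, which supports the point that neither a huge number of hypotheses nor an exhaustive search is needed.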

Merge

I support the merge of this page with the page on Data Dredging. These are essentially the same concept by two different names. They should be on the same page. Maybe a disambiguation entry can be posted to differentiate these concepts. AjeetKhurana (talk) - Preceding undated comment added 13:58, 20 May 2009 (UTC).

Support merge - Data dredging is probably the best title (comment by John Quiggin, forgot to sign).

Do not merge - Bias through incorrect data-snooping is essentially different from the problem created by testing a hypothesis with the same data-set. For example, data-snooping bias may occur when dealing with a highly fluctuating set of data where every removal of a datapoint results in a new extreme, and so on. (Pc100935 11:59, 18 December 2006 (UTC))

Splitting a data-set into parts A and B and then using part B to test a hypothesis formulated using part A is not recommended, since these datasets can be highly correlated. Best practice is to formulate a hypothesis before looking at the data and use the data to test the hypothesis. If a hypothesis is based on existing data, it should only be tested by collecting new independent data. (Pc100935 11:59, 18 December 2006 (UTC))

Support. Why hasn't the merge been made already? SweetNightmares (talk) 04:45, 21 January 2010 (UTC)

Support merge - while slightly different concepts are introduced by the two articles, there is no reason why they can't be written into one more coherent article that covers both. 94.195.129.125 (talk) 20:58, 5 April 2010 (UTC)

Merge done. 16:13, 30 November 2010 (UTC)

Global tag justified?

I don't think the global tag is justified: it's applied to an example of illegitimate hypothesis formation, but the example doesn't need to be universal! Richard Pinch (talk) 18:57, 11 June 2008 (UTC)

Well, it could certainly be rephrased in a more international manner, but I think the true problem is that the sentence is too long, not very clear, and doesn't reach a clear conclusion (why would it be wrong?). Calimo (talk) 15:11, 13 January 2009 (UTC)

Circumventing the scientific approach?

"Circumventing the traditional scientific approach of conducting an experiment without a hypothesis can lead to premature conclusions."

I believe the traditional scientific approach is to form a hypothesis before conducting an experiment, so the sentence should be rewritten to say, "Circumventing the traditional scientific approach by conducting an experiment without a hypothesis can lead to premature conclusions."

Unless someone knows better and objects, I will make the change. Blanchette (talk) 06:57, 9 May 2011 (UTC)

Done. Blanchette (talk) 21:19, 20 May 2011 (UTC)

Topics for new articles?

p-hacking, data peeking, and the replication crisis are related topics that probably deserve articles of their own. -- The Anome (talk) 10:26, 20 May 2014 (UTC)

Apparently p-hacking was written again in 2014, and in fact I would be in favour of that. Data-dredging is a more sophisticated approach, whereas P-hacking is much easier and mindlessly done. Viguarda (talk) 11:30, 20 January 2015 (UTC)

My first thought on reading this is that it doesn't seem to mention, or distinguish between, the intentional and accidental cases. One could search through a data set for any statistically significant event, or for specific conclusions. Note that the latter is different from cherry picking; maybe cherry tree picking would be a better analogy. One selects the tree with the best fruit, picks all the fruit (so as not to be accused of cherry picking), and then presents the results. Gah4 (talk) 00:03, 15 October 2016 (UTC)

Here's a current example of how p-hacking is used in common discourse:

The Inside Story Of How An Ivy League Food Scientist Turned Shoddy Data Into Viral Studies — 25 February 2018; syndicated at aldaily.com

One reason for the discrepancy is "p-hacking," the taboo practice of slicing and dicing a dataset for an impressive-looking pattern. It can take various forms, from tweaking variables to show a desired result, to pretending that a finding proves an original hypothesis — in other words, uncovering an answer to a question that was only asked after the fact.

I'm not thrilled with p-hacking redirecting to data dredging, a term I have never yet seen used in a mainstream, general-audience publication. p-hacking is a form of data dredging, with the specific end result of gaming an ethical bright line to gain a prominent office or pedestal of trust, which is ultimately more destructive to the scientific venture than mere academic dishonesty. MaxEnt 08:43, 3 March 2018 (UTC)

Vague

This article has been watered down since I last read it, apparently in an effort to cast the topic in a more "neutral" light. Just the opening paragraph, for instance, now says: "Data dredging ... is the use of data mining to uncover relationships in data." That does not seem sufficient to me at all. All data mining is used for uncovering relationships in data; the paragraph is almost tautological. The opening paragraph should instead succinctly define data dredging as it differs from other ways of using data. If I can find reasonable sources, I may go ahead and rewrite some of it.--Anders Feder (talk) 10:03, 4 September 2014 (UTC)

Spurious Correlations

I was thinking of using an image from this site as a headline image since it explains the idea really well (it's Creative Commons Attribution-licensed). Any thoughts on this, or suggestions for a particularly ridiculous one? Blythwood (talk) 22:48, 3 June 2015 (UTC)

Second Vague, also second no merge

The article appears too vague to me as well, and it combines multiple problems into one that should be separated. More thorough mathematical derivations would be helpful in my opinion. Some core statements are known to be wrong, although they are still frequently repeated as urban legends in the social sciences (e.g., testing multiple stochastically independent hypotheses on the same data set is no problem; that distinction is not made in the page so far). I suggest a rewrite. - Preceding unsigned comment added by Timo von Oertzen (talk • contribs) 15:05, 30 January 2017 (UTC)

Drawing Conclusions from data section

I find the drawing conclusions from data section to be ridiculous and unsourced.

If it had been sourced we could maybe get to the bottom of the problems.

For example, read Deming's "Red Bead experiment" from his The New Economics and then read this section. A p-chart will tell you immediately that there is nothing there: no conclusions whatsoever can be drawn from a sample of 5 coin flips.

2601:14F:8005:B810:C823:C430:769F:E823 (talk) 16:43, 21 October 2017 (UTC)
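
The point about 5 coin flips can also be illustrated without a p-chart. This is a minimal sketch that swaps in an exact two-sided binomial test against a fair coin, which is an assumption made here for illustration and not the method from Deming or from the article section. The helper name is hypothetical, and the 0.05 level is just the conventional threshold.

```python
from math import comb


def two_sided_binomial_p(heads: int, flips: int) -> float:
    """Exact two-sided p-value for a fair coin: add up the probabilities of
    every outcome that is no more likely than the one observed."""
    probs = [comb(flips, k) / 2 ** flips for k in range(flips + 1)]
    observed = probs[heads]
    # Small relative tolerance so that ties are counted despite float rounding.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


if __name__ == "__main__":
    # Most extreme results possible with 5 flips: 0 heads or 5 heads.
    smallest = min(two_sided_binomial_p(h, 5) for h in range(6))
    print("smallest achievable p-value with 5 flips:", smallest)
```

With 5 flips the smallest achievable two-sided p-value is 2/32 = 0.0625, so under these assumptions no outcome, not even five heads in a row, can reach the 0.05 level.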

p-hacking

(Edited) The following paper had a considerable impact in experimental psychology and might be worth mentioning in the article. One of its co-authors supposedly coined the term "p-hacking", though the term itself doesn't appear in the paper. I'll try to add it sometime if nobody else does, but if someone else does it first, that's great. Citation:

Simmons, Joseph P. and Nelson, Leif D. and Simonsohn, Uri, False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant (May 23, 2011). Psychological Science, 2011. Available at SSRN: https://ssrn.com/abstract=1850704

The authors wrote an invited retrospective 5 years later:

Simmons, Joseph P. and Nelson, Leif D. and Simonsohn, Uri, False-Positive Citations (March 27, 2017). Perspectives on Psychological Science, Forthcoming. Available at SSRN: https://ssrn.com/abstract=2916240

173.228.123.121 (talk) 02:38, 5 March 2018 (UTC)

1 in 4 statisticians say they were asked to commit scientific fraud

[1] Not sure where else to park this. In most of the cases the fraud was lighter stuff like underreporting non-significant results, but sometimes they were asked to actually falsify data. 173.228.123.166 (talk) 07:07, 3 November 2018 (UTC)

thus dramatically increasing and understating the risk of false positives

This is regarding the sentence that has seen much editing and discussion (in edit summaries). It seems to me that the important point is getting results with an indication of statistical significance while (usually intentionally) disregarding the truth, which in a legal sense might be called wanton disregard for the truth. If you do it long enough, you might (un)luckily find something actually true. Consider a politician at a campaign rally energizing his base with unsubstantiated claims, and without any interest in the truth. Very rarely, he might accidentally say something true, but that is just random luck. Gah4 (talk) 01:49, 13 November 2019 (UTC)