this post was submitted on 09 Jul 2023
Science
Studies, research findings, and interesting tidbits from the ever-expanding scientific world.
This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
It's sad that they keep using flawed statistical methods in these studies...
Correction: as @Gaywallet@beehaw.org points out, they also use other statistical methods within the paper!
While taking issue with p-values is a valid stance, the paper uses confidence intervals and Bayesian methods (cubic splines) in addition to p-values, both of which are the alternatives proposed in the ASA's statement that you mentioned below.
While p-values are listed, there are stats that fall in line with the recommendations in this very paper. If you take issue with either of these methods, could you help explain to me why you're upset? Or is it just the fact that p-values are stated rather than focusing on the CI and Bayesian results? I personally think there's still value in showing a p-value because it makes the results slightly more approachable to the non-scientific or non-statistical crowd, so long as it's not used to distract from the poor fit of other models.
No, that's my bad, thank you for correcting me! I only read the abstract, and they don't mention Bayesian methods there. Confidence intervals suffer from similar flaws as p-values and statistical significance.
It's great indeed that they do analyses with other methods too. Not, from my point of view, because p-values are more approachable – quite the opposite: people think in terms of probabilities-of-the-hypotheses, and p-values are not that (that's one source of their misuse). But because it helps the transition to other methods. It would have been nice if they had stated the results from all methods in the abstract. But that'll be for next time maybe!
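To make the "p-values are not probabilities-of-the-hypotheses" point concrete, here is a minimal sketch with a toy coin example. It is not taken from the paper or from the references. The data (9 heads in 12 tosses), the 50/50 prior on "the coin is fair", and the uniform prior on the heads probability under the alternative are all assumptions chosen only for illustration.

```python
from math import comb

def one_sided_p_value(heads, tosses):
    """P(at least `heads` heads in `tosses` tosses of a fair coin)."""
    return sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2**tosses

def posterior_prob_fair(heads, tosses, prior_fair=0.5):
    """P(coin is fair | data), assuming a uniform prior on the heads
    probability under the alternative (an illustrative choice)."""
    like_h0 = comb(tosses, heads) / 2**tosses
    # Integral of C(n,k) q^k (1-q)^(n-k) over q in [0, 1] equals 1 / (n + 1).
    like_h1 = 1 / (tosses + 1)
    numerator = prior_fair * like_h0
    return numerator / (numerator + (1 - prior_fair) * like_h1)

# Toy data, assumed for illustration: 9 heads in 12 tosses.
p = one_sided_p_value(9, 12)
post = posterior_prob_fair(9, 12)
print(f"p-value = {p:.3f}, P(fair coin | data) = {post:.3f}")
# prints: p-value = 0.073, P(fair coin | data) = 0.411
```

The two numbers answer different questions: the first is the probability of data at least this extreme if the coin is fair, the second is the probability that the coin is fair given the data and the assumed priors. Reading the first as if it were the second is the misuse described above.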
Cool, thanks for clarifying! While I am a data scientist, I am not a stats expert, so I'm always looking to understand proper critiques from those more knowledgeable than me 😄
Thank you for the correction. Don't trust me, though: check out the proofs and discussions in the references here, see for yourself :)
The statistical method is not flawed. Many scientific communities are misinterpreting or abusing it - that's the problem.
P-value-based methods and statistical significance are flawed: even when used correctly (e.g.: stopping rule decided beforehand, various "corrections" of all kinds for number of datapoints, non-gaussianity, and so on), one can get results that are "statistically non-significant" but clearly significant in all common-sense meanings of this word; and vice-versa. There's a constant literature – with mathematical and logical proofs – dating back to the 1940s pointing out the in-principle flaws of "statistical significance" and null-hypothesis testing. The editorial from the American Statistical Association gives an extensive list.
I'd like to add: I'm saying this not because I read it somewhere (I don't like unscientific "my football team is better than yours"-like discussions), but because I personally sat down and patiently went through the proofs and counterexamples, and the (almost non-existent) counter-proofs. That's what made me change methodology. This is something that many researchers using "statistical significance" have not done.
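Two standard textbook-style illustrations of this, as a rough sketch (the numbers are assumptions for illustration, not taken from the study or from the ASA editorial). First, the same data, 9 heads and 3 tails, gives a different p-value depending on whether the plan was "toss 12 times" or "toss until the 3rd tail". Second, with a large enough sample, a practically negligible effect crosses the usual 0.05 threshold.

```python
from math import comb, erfc, sqrt

def p_fixed_tosses(heads, tosses):
    """Plan A: toss exactly `tosses` times.
    p = P(at least `heads` heads) under a fair coin."""
    return sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2**tosses

def p_stop_at_tails(tails, tosses):
    """Plan B: toss until `tails` tails have appeared.
    p = P(needing at least `tosses` tosses) under a fair coin, i.e. fewer
    than `tails` tails among the first `tosses - 1` tosses."""
    n = tosses - 1
    return sum(comb(n, k) for k in range(tails)) / 2**n

def p_two_sided_z(effect_in_sd, n):
    """Two-sided z-test p-value for a sample mean that sits
    `effect_in_sd` standard deviations away from the null value."""
    z = abs(effect_in_sd) * sqrt(n)
    return erfc(z / sqrt(2))  # equals 2 * (1 - Phi(z))

# Same observed sequence (9 heads, 3 tails, 12 tosses), two stopping rules.
print(f"fixed 12 tosses:    p = {p_fixed_tosses(9, 12):.4f}")   # 0.0730
print(f"stop at 3rd tail:   p = {p_stop_at_tails(3, 12):.4f}")  # 0.0327

# Effect of 0.003 standard deviations with one million datapoints.
print(f"tiny effect, big n: p = {p_two_sided_z(0.003, 10**6):.4f}")  # 0.0027
```

At the conventional 0.05 threshold the identical coin data is "non-significant" under one plan and "significant" under the other, and the 0.003-standard-deviation effect is "significant" even though few people would call it meaningful.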
This is interesting and something I've not heard of - can you recommend a starter link for someone with a basic stats background? I had some in undergrad, but this sounds like a topic that could get very tinfoil-hat-y if not searched correctly and with good context.
There's still a lot of debate around this topic. It's obviously difficult for people who have used these methods for the past 60 years to simply say "I've been using a flawed method for 60 years" – although in the end that's how science works. The problem moreover is double: the method has built-in flaws, and on top of that it's often misused.
Some starters:
The official statement by the American Statistical Association
A follow-up editorial
Signatories for the dismissal of the method
Many papers explaining the built-in flaws, from this old 1935 paper and this old 1965 discussion, to more recent ones; for example this, or this, or this, or this, or this tutorial
This paper gives a good summary
Journals that don't accept "statistical significance" methods anymore: this or this
Several books, for example this one. I agree with the factual content of this book, but I don't like the authors' braggart way of writing. In their defence, though: it's the same braggart way of writing that R. A. Fisher, the father of "statistical significance", often had.
What's sad is that these discussions easily end in political or "football-team"-like debates. But the mathematical and logical proofs are there, for those who care to go and read them.
Thanks, I appreciate it - looks like I've got some bedtime reading for a while :)
My pleasure!
Ah, I thought you were talking about p-values, which are just a simple metric and get a bad rep from being used for statistical significance. Statistical significance certainly is trash.
Yes I'm talking about p-values. Statistical "significance" is based on p-values.