
“A small p-value indicates it’s improbable that the results are due to chance alone”–fallacious or not? (more on the ASA p-value doc)

[Image: a metal Monopoly top hat game piece]

There’s something about “Principle 2” in the ASA document on p-values that I couldn’t address in my brief commentary, but which is worth examining more closely.

2. P-values do not measure (a) the probability that the studied hypothesis is true, or (b) the probability that the data were produced by random chance alone.

(a) is true, but what about (b)? That’s what I’m going to focus on, because I think it is often misunderstood. It was discussed earlier on this blog in relation to the Higgs experiments and deconstructing “the probability the results are ‘statistical flukes'”. So let’s examine:

2(b) P-values do not measure the probability that the data were produced by random chance alone.

We assume here that the p-value is not invalidated by either biasing selection effects or violated statistical model assumptions.

The basis for 2(b) is the denial of a claim we may call claim (1):

Claim (1): A small p-value indicates it’s improbable that the results are due to chance alone as described in H0.

Principle 2(b) asserts that claim (1) is false. Let’s look more closely at the different things that might be meant in teaching or asserting (1). How can we explain the common assertion of claim (1)? Say there is a one-sided test: H0: μ = 0 vs. H1: μ > 0 (or, we could have H0: μ ≤ 0).
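To fix ideas, here is a minimal numerical sketch of such a one-sided test under a Normal model with known σ. The sample size, σ, and simulated data are my own illustrative choices, not anything prescribed by the ASA document.

```python
# Illustrative sketch (assumptions mine): one-sided test of H0: mu = 0 vs. H1: mu > 0
# under a Normal model with known sigma.
import numpy as np
from scipy import stats

sigma, n = 1.0, 25                                  # assumed known sigma and sample size
x = np.random.default_rng(0).normal(0.2, sigma, n)  # illustrative sample (true mu = 0.2)

d_obs = np.sqrt(n) * x.mean() / sigma               # test statistic d(x) = sqrt(n) * xbar / sigma
p_value = stats.norm.sf(d_obs)                      # Pr(d(X) >= d(x); H0): the one-sided p-value
print(f"d(x) = {d_obs:.2f}, p-value = {p_value:.3f}")
```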

Explanation #1: A person asserting claim (1) is using an informal notion of probability that is common in English. They mean a small p-value gives grounds (or is evidence) that H1: μ > 0. Under this reading there is no fallacy.

Comment: If H1 has passed a stringent test, a standard principle of inference is to infer H1 is warranted. An informal notion of:

“So probably” H1

is merely qualifying the grounds upon which we assert evidence for H1. When a method’s error probabilities are used to qualify the grounds on which we assert the result of using the method, it is not to assign a posterior probability to a hypothesis. It is important not to confuse informal notions of probability and likelihood in English with technical, formal ones.

 

Explanation #2: A person asserting claim (1) is interpreting the p-value as a posterior probability of the null hypothesis H0 based on a prior probability distribution: p = Pr(H0|x). Under this reading there is a fallacy.
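To see why this reading is a different quantity altogether, here is a small contrast I’ve added (the 50/50 prior, the point alternative, and the observed mean are all arbitrary illustrative choices): a one-sided p-value and a posterior Pr(H0|x) computed from an explicit prior need not agree.

```python
# Illustrative contrast (assumptions mine): a one-sided p-value vs. a posterior Pr(H0 | x)
# computed under an arbitrary 50/50 prior and a point alternative mu = 0.4.
import numpy as np
from scipy import stats

sigma, n = 1.0, 25
se = sigma / np.sqrt(n)                       # standard error of the sample mean
xbar = 0.4                                    # illustrative observed mean

p_value = stats.norm.sf(xbar / se)            # Pr(d(X) >= d(x); H0), roughly 0.023

prior_H0, mu_alt = 0.5, 0.4                   # arbitrary prior weight and point alternative
like_H0 = stats.norm.pdf(xbar, loc=0.0, scale=se)
like_H1 = stats.norm.pdf(xbar, loc=mu_alt, scale=se)
post_H0 = like_H0 * prior_H0 / (like_H0 * prior_H0 + like_H1 * (1 - prior_H0))

print(f"p-value = {p_value:.3f}, Pr(H0|x) = {post_H0:.3f}")  # ~0.023 vs. ~0.12
```

The two numbers answer different questions, which is why reading p as Pr(H0|x), with no prior in hand, is fallacious.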

Comment: Unless the p-value tester has explicitly introduced a prior, this would be a most ungenerous interpretation of what is meant. Given that significance testing is part of a methodology directed at providing statistical inference methods whose validity does not depend on a prior probability distribution, it would be implausible to think a teacher of significance tests means that a Bayesian posterior is warranted. Moreover, since a formal posterior probability assigned to a hypothesis doesn’t signal that H1 has been well tested (as opposed to, say, strongly believed), it seems an odd construal of what a tester means in asserting (1). The informal construal in explanation #1 is far more plausible.

A third explanation further illuminates why some assume this fallacious reading is intended.

 

Explanation #3: A person asserting claim (1) intends an ordinary error probability. Letting d(X) be the test statistic:

Pr(Test T produces d(X) > d(x); H0) ≤ p.

(Note the definition of the p-value in my comment on the ASA statement.)

Notice: H0 does not say the observed results are due to chance. It is just H0: μ = 0. H0 entails that the observed results are due to chance, but that is different. Under this reading there is no fallacy.

Comment: R.A. Fisher was clear that we need, not isolated significant results, but “a reliable method of procedure” (see my commentary). We may suppose the tester follows Fisher and that test T consists of a pattern of statistically significant results indicating the effect. The probability that we’d be able to generate {d(X) > d(x)} in these experiments, in a world described by H0, is very low (p). Equivalently:

Pr(Test T produces P-value < p; H0) = p

The probability that test T generates such impressively small p-values, under the assumption they are due to chance alone, is very small: p. Equivalently, a universe adequately described by H0 would produce such impressively small p-values only 100p% of the time. Or yet another way:

Pr(Test T would not regularly produce such statistically significant results; were we in a world where H0 holds) = 1 − p
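A quick simulation (my own sketch; the sample size and threshold are arbitrary) illustrates this error-probability reading: in a world adequately described by H0, p-values at or below p turn up only about 100p% of the time.

```python
# Illustrative simulation (assumptions mine): under H0 (mu = 0), one-sided p-values
# at or below a threshold p occur in roughly 100p% of repetitions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sigma, n, reps = 1.0, 25, 100_000
p_threshold = 0.01

xbars = rng.normal(0.0, sigma / np.sqrt(n), reps)       # sampling distribution of xbar under H0
p_values = stats.norm.sf(np.sqrt(n) * xbars / sigma)    # one-sided p-values under H0
print(f"Fraction of p-values <= {p_threshold}: {(p_values <= p_threshold).mean():.4f}")  # close to 0.01
```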

Severity and the detachment of inferences

Admittedly, the move to inferring evidence of a non-chance discrepancy requires an additional principle of evidence that I have been calling the severity principle (SEV). Perhaps the weakest form applies to a statistical rejection or falsification of the null:

Data x from a test T provide evidence for rejecting H0 (just) to the extent that H0 would (very probably) have survived, were it a reasonably adequate description of the process generating the data (with respect to the question).

It is also captured by a general frequentist principle of evidence (FEV) (Mayo and Cox 2006), a variant on the general idea of severity (SEV) (EGEK 1996, Mayo and Spanos 2006, etc.).

The severity principle, put more generally:

Data from a test  T (generally understood as a group of individual tests) provide good evidence for inferring H (just) to the extent that H passes severely with x0, i.e., to the extent that H would (very probably) not have survived the test so well were H false.

Here H would be the rather weak claim of some discrepancy, but specific discrepancy sizes can (and should) be evaluated by the same means.
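For readers who want to see how specific discrepancy sizes would be evaluated, here is a sketch of a severity calculation under the same known-σ Normal setup as above; the observed mean and the discrepancies probed are purely illustrative.

```python
# Sketch of a severity assessment (illustrative numbers, my own): after a statistically
# significant result in the test of H0: mu = 0 vs. H1: mu > 0, evaluate SEV(mu > mu1).
import numpy as np
from scipy import stats

sigma, n = 1.0, 25
xbar_obs = 0.4                                 # illustrative observed mean (d(x) = 2.0, p ~ 0.023)

def severity(mu1: float) -> float:
    """SEV(mu > mu1) = Pr(X-bar <= observed x-bar; mu = mu1)."""
    return stats.norm.cdf((xbar_obs - mu1) * np.sqrt(n) / sigma)

for mu1 in (0.0, 0.1, 0.2, 0.3):
    print(f"SEV(mu > {mu1:.1f}) = {severity(mu1):.3f}")
```

On this sketch, the weak claim of some discrepancy (μ > 0) passes with high severity (about 0.98), while a larger discrepancy such as μ > 0.3 does not (about 0.69).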

Conclusion. The only explanation under which claim (1) is a fallacy is the ungenerous explanation #2. Thus, I would restrict Principle 2 to 2(a). That said, I’m not claiming (1) is the ideal way to construe p-values. In fact, without being explicit about the additional principle that permits linking to the inference (the principle I call severity), it is open to equivocation. I’m just saying it’s typically meant as an ordinary error probability [2].

Souvenir: Don’t merely repeat what you hear about statistical methods (from any side) but, rather, think it through yourself.

Comments are welcome.[1]

 

Mayo, D. G. and Cox, D. R. (2006), “Frequentist Statistics as a Theory of Inductive Inference,” in Optimality: The Second Erich L. Lehmann Symposium, ed. J. Rojo, Lecture Notes–Monograph Series, Institute of Mathematical Statistics (IMS), Vol. 49: 77–97.

Mayo, D. G. and Spanos, A. (2006), “Severe Testing as a Basic Concept in a Neyman–Pearson Philosophy of Induction,” British Journal for the Philosophy of Science 57(2): 323–57.

My comment, “Don’t throw out the error control baby with the bad statistics bathwater,” is #17 under the supplementary materials.

[1] I have this old Monopoly game from my father that contains metal pieces like this top hat. There’s also a racing car, a thimble and more.

[2] The error probabilities come from the sampling distribution and are often said to be “hypothetical.” I see no need to repeat “hypothetical” in alluding to error probabilities.

