Matches in SemOpenAlex for { <https://semopenalex.org/work/W2079784016> ?p ?o ?g. }
Showing items 1 to 79 of 79, with 100 items per page.
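The header above shows the quad pattern behind this listing. As a minimal sketch, the same rows can be retrieved programmatically; the public SPARQL endpoint URL (https://semopenalex.org/sparql) and the GRAPH rewrite of the ?g term are assumptions, so adjust them to the deployment you are querying:

```python
# Sketch: fetch all predicate/object pairs for W2079784016 from SemOpenAlex.
# The endpoint URL is an assumption; the ?g quad term is expressed as GRAPH ?g.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://semopenalex.org/sparql"  # assumed public endpoint

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery("""
    SELECT ?p ?o ?g
    WHERE {
      GRAPH ?g { <https://semopenalex.org/work/W2079784016> ?p ?o . }
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["p"]["value"], row["o"]["value"])
```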
- W2079784016 endingPage "297" @default.
- W2079784016 startingPage "295" @default.
- W2079784016 abstract "I am delighted to be invited to comment on the use of P-values, but at the same time, it depresses me. Why? So much brainpower, ink, and passion have been expended on this subject for so long, yet plus ça change, plus c’est la même chose – the more things change, the more they stay the same. The references on this topic encompass innumerable disciplines, going back almost to the moment that P-values were introduced (by R.A. Fisher in the 1920s). The introduction of hypothesis testing in 1933 precipitated more intense engagement, caused by the subsuming of Fisher’s “significance test” into the hypothesis test machinery. 1–9 The discussion has continued ever since. I have been foolish enough to think I could whistle into this hurricane and be heard. 10–12 But we (and I) still use P-values. And when a journal like Epidemiology takes a principled stand against them, 13 epidemiologists who may recognize the limitations of P-values still feel as if they are being forced to walk on one leg. 14 So why do those of us who criticize the use of P-values bother to continue doing so? Isn’t the “real world” telling us something – that we are wrong, that the effort is quixotic, or that this is too trivial an issue for epidemiologists to spend time on? Admittedly, this is not the most pressing methodologic issue facing epidemiologists. Still, I will try to argue that the topic is worthy of serious consideration. Let me begin with an observation. When epidemiologists informally communicate their results (in talks, meeting presentations, or policy discussions), the balance between biology, methodology, data, and context is often appropriate. There is an emphasis on presenting a coherent epidemiologic or pathophysiologic “story,” with comparatively little talk of statistical “rejection” or other related tomfoolery. But this same sensibility is often not reflected in published papers. Here, the structure of presentation is more rigid, and statistical summaries seem to have more power. Within these confines, the narrative flow becomes secondary to the distillation of complex data, and inferences seem to flow from the data almost automatically. It is this automaticity of inference that is most distressing, and for which the elimination of P-values has been attempted as a curative. Although I applaud the motivation behind attempts to eliminate P-values, they have failed in the past and I predict that they will continue to fail. This is because they treat the symptoms and not the underlying mindset, which must be our target. We must change how we think about science itself. I and others have discussed the connections between statistics and scientific philosophy elsewhere, 11,12,15–22 so I will cut to the chase here. The root cause of our problem is a philosophy of scientific inference that is supported by the statistical methodology in dominant use. This philosophy might best be described as a form of “naïve inductivism,” 23 a belief that all scientists seeing the same data should come to the same conclusions. By implication, anyone who draws a different conclusion must be doing so for nonscientific reasons. It takes as given the statistical models we impose on data, and treats the estimated parameters of such models as direct mirrors of reality rather than as highly filtered and potentially distorted views. It is a belief that scientific reasoning requires little more than statistical model fitting, or in our case, reporting odds ratios, P-values and the like, to arrive at the truth. 
How is this philosophy manifest in research reports? One merely has to look at their organization. Traditionally, the findings of a paper are stated at the beginning of the discussion section. It is as if the finding is something derived directly from the results section. Reasoning and external facts come afterward, if at all. That is, in essence, naïve inductivism. This view of the scientific enterprise is aided and abetted by the P-value in a variety of ways, some obvious, some subtle. The obvious way is in its role in the reject/accept hypothesis test machinery. The more subtle way is in the fact that the P-value is a probability – something absolute, with nothing external needed for its interpretation. Now let us imagine another world – a world in which we use an inferential index that does not tell us where we stand, but how much distance we have covered. Imagine a number that does not tell us what we know, but how much we have learned. Such a number could lead us to think very differently about the role of data in making inferences, and in turn lead us to write about our data in a profoundly different manner. This is not an imaginary world; such a number exists. It is called the Bayes factor. 15,17,25 It is the data component of Bayes’ Theorem. The odds we put on the null hypothesis (relative to others) using data external to a study are called the “prior odds,” and the odds after seeing the data are the “posterior odds.” The Bayes factor tells us how far apart those odds are, ie, the degree to which the data from a study move us from our initial position. It is quite literally an epistemic odds ratio, the ratio of posterior to prior odds, although it is calculable from the data alone, without those odds. It is the ratio of the data’s probability under two competing hypotheses. 15,17 If we have a Bayes factor equal to 1/10 for the null hypothesis relative to the alternative hypothesis, it means that these study results have decreased the relative odds of the null hypothesis by 10-fold. For example, if the initial odds of the null were 1 (ie, a probability of 50%), then the odds after the study would be 1/10 (a probability of 9%). Suppose instead that the probability of the null hypothesis is high to begin with (as it typically is in data-dredging settings), say an odds of 9 (90%). Then a 10-fold decrease would change the odds of the null hypothesis to 9/10 (a probability of 47%), still quite probable. (This odds arithmetic is worked through in the first sketch following this record.) The Bayes factor is a measure of evidence in the same way evidence is viewed in a legal setting, or informally by scientists. Evidence moves us in the direction of greater or lesser doubt, but except in extreme cases it does not dictate guilt or innocence, truth or falsity. I should warn readers knowledgeable in Bayesian methods to stop here. They may be severely disappointed (or even horrified) by the proposal I am about to make. I suggest that the Bayes factor does not necessarily have to be derived from a standard Bayesian analysis, although I would prefer that it were. As a simple alternative, it is possible instead to use the minimum Bayes factor (for the null hypothesis). 26 The appeal of the minimum Bayes factor is that it is calculated from the same information that goes into the P-value, and can easily be derived from standard analytic results, as described below. Quantitatively, it is only a small step from the P-value (and shares the liability of confounding the effect size with its precision). But conceptually, it is a huge leap. 
I recommend it not as a cure-all, but as a practical first step toward methodologic sanity. The calculation goes like this. If a statistical test is based on a Gaussian approximation (as they are in many epidemiologic analyses), the strongest Bayes factor against the null hypothesis is exp(−Z²/2), where Z is the number of standard errors from the null value. Thus it can be applied to most regression coefficients (whose significance is typically based on some form of normal approximation) and contingency tables. (When the t-statistic is used, it can substitute for Z.) If the log-likelihood of a model is reported, the minimum Bayes factor is simply the exponential of the difference between the log-likelihoods of two competing models (ie, the ratio of their maximum likelihoods). This likelihood ratio (the minimum Bayes factor) is the basis for most frequentist analyses. While it is invariably converted into a P-value, it has inferential meaning without such conversion. The minimum Bayes factor described above does not involve a prior probability distribution over non-null hypotheses; it is a global minimum for all prior distributions. However, there is also a simple formula for the minimum Bayes factor in the situation where the prior probability distribution is symmetric and descending around the null value. This is −e p ln(p), 27,28 where e is the base of the natural logarithm and p is the fixed-sample-size P-value. The table shows the correspondence between P-values, Z- (or t-) scores, and the two forms of minimum Bayes factors described above (the second sketch following this record computes both forms). Note that even the strongest evidence against the null hypothesis does not lower its odds as much as the P-value magnitude might lead people to believe. More importantly, the minimum Bayes factor makes it clear that we cannot estimate the credibility of the null hypothesis without considering evidence outside the study. This translation from P-value to minimum Bayes factor is not merely a recalibration of our evidential measure, like converting from Fahrenheit to Celsius. By assessing the result with a minimum Bayes factor, we bring into play a different conceptual framework, which requires us to separate statistical results from inductive inferences. Reading from Table 1, a P-value of 0.01 represents a “weight of evidence” for the null hypothesis of somewhere between 1/25 (0.04) and 1/8 (0.13). In other words, the study results lower the relative odds of the null hypothesis vs any alternative by, at most, 8- to 25-fold. If I am going to make a claim that a null effect is highly unlikely (eg, less than 5%), it follows that I should have evidence outside the study that the prior probability of the null was no greater than 60%. If the relationship being studied is far-fetched (eg, the probability of the null was greater than 60%), the evidence may still be too weak to make a strong knowledge claim. Conversely, even weak evidence in support of a highly plausible relationship may be enough for an author to make a convincing case. 15,17 [Table 1: Bayesian Interpretations of P-Values] The use of the Bayes factor could give us a different view of results and discussion sections. In the results section, both the data and model-based data summaries are presented. (The choice of a mathematical model can be regarded as an inferential step, but I will not explore that here.) This can be followed by an index like the Bayes factor if two hypotheses are to be contrasted. The discussion section should then serve as a bridge between these indices and the conclusions. 
The components of this bridge are the plausibility of the proposed mechanisms (drawing on laboratory and other experimental evidence, and on patterns within the data), other empirical results related to the hypothesis, and the qualitative strength of the current study’s design and execution. P-values need not be banned, although I would be happy to see them go. (When I see them, I translate them into approximate Bayes factors.) But we should certainly ban inferential reasoning based on the naïve use of P-values and hypothesis tests, and their various partners in crime, eg, stepwise regression (which chooses regression terms based exclusively on statistical significance, widely recognized as egregiously biased and misleading). 29,30 Even without formal Bayesian analysis, the use of minimum Bayes factors (along with, or in lieu of, P-values) might provide an antidote for the worst inferential misdeeds. More broadly, we should incorporate a Bayesian framework into our writing, and not just our speaking. We should describe our data as one source of information among many that make a relationship either plausible or unlikely. The use of summaries such as the Bayes factor encourages that, while use of the P-value makes it nearly impossible. Changing the P-value culture is just a beginning. We use powerful tools to organize data and to guess at the reality that gave rise to them. We need to remember that these tools can create their own virtual reality. 17,30,31 The object of our study must be nature itself, not artifacts of the tools we use to probe its secrets. If we approach our data with respect for their complexity, with humility about our ability to sort it out, and with detailed knowledge of the phenomena under study, we will serve our science and the public health well. From that perspective, whether or not we use P-values seems, well, insignificant." @default.
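The odds arithmetic behind the abstract's Bayes-factor examples is easy to verify. Below is a minimal Python sketch (the function names are mine, not from the paper) that reproduces the two worked cases: a Bayes factor of 1/10 moves prior odds of 1 (probability 50%) to posterior odds of 1/10 (about 9%), and prior odds of 9 (90%) to 9/10 (about 47%).

```python
# Sketch of the Bayes-factor update described in the abstract:
# posterior odds = Bayes factor x prior odds. Function names are illustrative.

def posterior_odds(prior_odds: float, bayes_factor: float) -> float:
    """Update the odds on the null hypothesis by the Bayes factor."""
    return bayes_factor * prior_odds

def odds_to_prob(odds: float) -> float:
    """Convert odds to a probability."""
    return odds / (1.0 + odds)

bf = 1 / 10  # study evidence: a 10-fold decrease in the odds of the null

for prior in (1.0, 9.0):  # prior odds of 1 (50%) and 9 (90%)
    post = posterior_odds(prior, bf)
    print(f"prior odds {prior:g} ({odds_to_prob(prior):.0%}) -> "
          f"posterior odds {post:g} ({odds_to_prob(post):.0%})")
# prior odds 1 (50%) -> posterior odds 0.1 (9%)
# prior odds 9 (90%) -> posterior odds 0.9 (47%)
```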
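The two minimum Bayes factor formulas quoted above can likewise be computed directly from a P-value. A sketch, assuming two-sided P-values under the Gaussian approximation the text describes (scipy is used only for the P-to-Z conversion):

```python
# Sketch of the two minimum Bayes factors described in the abstract:
#   global minimum over all alternatives:     exp(-Z^2 / 2)
#   symmetric priors descending from the null: -e * p * ln(p)  (valid for p < 1/e)
# Two-sided P-values and the Gaussian approximation are assumed.
import math
from scipy.stats import norm  # P-value <-> Z-score conversion

def min_bf_global(z: float) -> float:
    """Global minimum Bayes factor, exp(-Z^2/2)."""
    return math.exp(-z * z / 2.0)

def min_bf_symmetric(p: float) -> float:
    """Minimum Bayes factor for symmetric, descending priors, -e*p*ln(p)."""
    return -math.e * p * math.log(p)

for p in (0.05, 0.01, 0.001):
    z = norm.isf(p / 2.0)  # two-sided P-value -> Z-score
    print(f"P = {p:<6} Z = {z:.2f}  "
          f"exp(-Z^2/2) = {min_bf_global(z):.3f}  "
          f"-e*p*ln(p) = {min_bf_symmetric(p):.3f}")
```

For P = 0.01 this yields roughly 0.036 and 0.125, the 1/25-to-1/8 range the text reads from Table 1; and applying the 0.036 factor to prior odds of 1.5 (a 60% prior probability of the null) gives posterior odds of about 0.054, ie, a probability of roughly 5%, which is the arithmetic behind the 60% threshold mentioned in the abstract.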
- W2079784016 created "2016-06-24" @default.
- W2079784016 creator A5059008211 @default.
- W2079784016 date "2001-05-01" @default.
- W2079784016 modified "2023-10-16" @default.
- W2079784016 title "Of P-Values and Bayes: A Modest Proposal" @default.
- W2079784016 cites W1917216281 @default.
- W2079784016 cites W1967485420 @default.
- W2079784016 cites W1971063067 @default.
- W2079784016 cites W1975468890 @default.
- W2079784016 cites W1997013992 @default.
- W2079784016 cites W2005332989 @default.
- W2079784016 cites W2030346622 @default.
- W2079784016 cites W2030360178 @default.
- W2079784016 cites W2044826543 @default.
- W2079784016 cites W2059424544 @default.
- W2079784016 cites W2062381998 @default.
- W2079784016 cites W2079297380 @default.
- W2079784016 cites W2081855756 @default.
- W2079784016 cites W2086600609 @default.
- W2079784016 cites W2107932039 @default.
- W2079784016 cites W2129925362 @default.
- W2079784016 cites W2152938409 @default.
- W2079784016 cites W2254057084 @default.
- W2079784016 cites W4211177544 @default.
- W2079784016 cites W4243759473 @default.
- W2079784016 cites W4250996842 @default.
- W2079784016 doi "https://doi.org/10.1097/00001648-200105000-00006" @default.
- W2079784016 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/11337600" @default.
- W2079784016 hasPublicationYear "2001" @default.
- W2079784016 type Work @default.
- W2079784016 sameAs 2079784016 @default.
- W2079784016 citedByCount "133" @default.
- W2079784016 countsByYear W20797840162012 @default.
- W2079784016 countsByYear W20797840162013 @default.
- W2079784016 countsByYear W20797840162014 @default.
- W2079784016 countsByYear W20797840162015 @default.
- W2079784016 countsByYear W20797840162016 @default.
- W2079784016 countsByYear W20797840162017 @default.
- W2079784016 countsByYear W20797840162018 @default.
- W2079784016 countsByYear W20797840162019 @default.
- W2079784016 countsByYear W20797840162020 @default.
- W2079784016 countsByYear W20797840162021 @default.
- W2079784016 countsByYear W20797840162022 @default.
- W2079784016 countsByYear W20797840162023 @default.
- W2079784016 crossrefType "journal-article" @default.
- W2079784016 hasAuthorship W2079784016A5059008211 @default.
- W2079784016 hasBestOaLocation W20797840161 @default.
- W2079784016 hasConcept C105795698 @default.
- W2079784016 hasConcept C107673813 @default.
- W2079784016 hasConcept C207201462 @default.
- W2079784016 hasConcept C33923547 @default.
- W2079784016 hasConceptScore W2079784016C105795698 @default.
- W2079784016 hasConceptScore W2079784016C107673813 @default.
- W2079784016 hasConceptScore W2079784016C207201462 @default.
- W2079784016 hasConceptScore W2079784016C33923547 @default.
- W2079784016 hasIssue "3" @default.
- W2079784016 hasLocation W20797840161 @default.
- W2079784016 hasLocation W20797840162 @default.
- W2079784016 hasLocation W20797840163 @default.
- W2079784016 hasOpenAccess W2079784016 @default.
- W2079784016 hasPrimaryLocation W20797840161 @default.
- W2079784016 hasRelatedWork W1587224694 @default.
- W2079784016 hasRelatedWork W1979597421 @default.
- W2079784016 hasRelatedWork W2007980826 @default.
- W2079784016 hasRelatedWork W2061531152 @default.
- W2079784016 hasRelatedWork W2077600819 @default.
- W2079784016 hasRelatedWork W2142036596 @default.
- W2079784016 hasRelatedWork W2911598644 @default.
- W2079784016 hasRelatedWork W3002753104 @default.
- W2079784016 hasRelatedWork W4225152035 @default.
- W2079784016 hasRelatedWork W4245490552 @default.
- W2079784016 hasVolume "12" @default.
- W2079784016 isParatext "false" @default.
- W2079784016 isRetracted "false" @default.
- W2079784016 magId "2079784016" @default.
- W2079784016 workType "article" @default.