When planning a clinical trial, investigators can generally choose between two related analysis strategies: significance testing (requiring p-values) or measuring (requiring confidence intervals).
In our opinion, significance testing is the wrong option:
Confidence intervals supply the reader with more information than p-values. Consider the following example:
Example: Biased coin toss.
A coin is bent with some pliers. Investigators are interested in the properties of this modified coin. Is it biased? If so, in which direction?
A simple experiment is devised: the coin is tossed 100 times and the outcome recorded (66 heads).
Two different investigators analyse the data: One chooses to Test, the other to Measure…
Investigator 1: “The bent coin was significantly biased (p = 0.0037).” Under duress to provide more information, they concede: “The bent coin was significantly biased (Heads 66%; p = 0.0037).”
Investigator 2: “The bent coin favoured Heads 66% (95% CI: 55% to 74%)”
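The two analyses can be reproduced in a few lines of Python. This is a minimal sketch assuming an exact two-sided binomial test and a Wilson score interval; the figures quoted above may have been computed with a different method, so the results need not match to the last digit.

```python
from math import comb, sqrt

def binom_two_sided_p(k, n):
    """Exact two-sided binomial test p-value against a fair coin (p0 = 0.5).
    Because the null distribution is symmetric, this is twice the
    probability of the more extreme tail."""
    tail = max(k, n - k)
    upper = sum(comb(n, i) for i in range(tail, n + 1)) / 2 ** n
    return min(1.0, 2 * upper)

def wilson_ci(k, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion
    (z = 1.96 gives an approximate 95% interval)."""
    phat = k / n
    denom = 1 + z ** 2 / n
    centre = (phat + z ** 2 / (2 * n)) / denom
    half = z * sqrt(phat * (1 - phat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

p = binom_two_sided_p(66, 100)
lo, hi = wilson_ci(66, 100)
print(f"p = {p:.4f}; observed 66% (95% CI {lo:.0%} to {hi:.0%})")
```

Note that both quantities come from the same data: the p-value compresses the result to a single verdict, while the interval reports the observed proportion together with its precision.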
In our opinion, Investigator 1 has taken the wrong turn by opting for significance testing: while the information supplied is useful, we want deeper insights. We want information about the size of the observed treatment effects and their precision. Psychiatric research seems to be stuck in the old-fashioned rut of significance testing, a form of statistical ritual.
Don’t take the wrong turn with your analysis plan. Avoid mindless, wholesale testing. Measure (or estimate) the treatment effect and report its associated confidence interval. In our opinion, the confidence interval should not be used to infer whether a treatment effect is statistically significant; instead, it should be interpreted as indicating a plausible range of treatment effects suggested by the trial’s results.
Suggested reading: There is a literary canon on this topic…
Gigerenzer, Mindless Statistics: “Statistical rituals largely eliminate statistical thinking…”