Education: The RCT safety fallacy

Safety analysis is often the RCT’s Achilles’ heel. RCTs are designed with adequate power to detect useful treatment effects, usually by enrolling a sufficiently large sample size N. However, the analysis of adverse events gets its power from the number of observed adverse events E, which is typically far smaller than N. In most cases a “fully powered” RCT therefore has very little power to detect systematic differences in adverse event rates. This means safety data must be reported with skill and caution: investigators should either resist running significance tests on safety data or take extreme care when interpreting them.
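
The gap between N and E can be illustrated with a rough power calculation. The sketch below is a minimal illustration, not a trial-design tool: it uses the normal approximation for a two-sided two-proportion test (which is itself unreliable for rare events), and the trial size and all event rates are hypothetical numbers chosen only for illustration.

```python
from statistics import NormalDist


def approx_power(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Uses the normal approximation with an unpooled standard error.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    se = ((p_control * (1 - p_control) + p_treated * (1 - p_treated)) / n_per_arm) ** 0.5
    z_effect = abs(p_treated - p_control) / se
    # Probability that the test statistic lands in either rejection region.
    return nd.cdf(z_effect - z_crit) + nd.cdf(-z_effect - z_crit)


N_PER_ARM = 200  # hypothetical trial size per arm

# Hypothetical efficacy endpoint: response rate 30% vs 50%.
efficacy_power = approx_power(0.30, 0.50, N_PER_ARM)

# Hypothetical safety endpoint: adverse event rate 1% vs 3%.
safety_power = approx_power(0.01, 0.03, N_PER_ARM)

# Expected total number of adverse events E across both arms.
expected_events = N_PER_ARM * (0.01 + 0.03)

print(f"efficacy power: {efficacy_power:.2f}")
print(f"safety power:   {safety_power:.2f}")
print(f"expected adverse events E: {expected_events:.0f} of {2 * N_PER_ARM} patients")
```

Under these assumed numbers the trial has well over 95% power for the efficacy comparison but only roughly 30% power to detect a tripling of the adverse event rate, because only about 8 events are expected in total.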

Example: To see the absurdity of performing and strictly interpreting significance tests on low-count safety data, consider the zoo-keeper’s dilemma:

After recent fatalities in a small zoo (which has only 2 animals), workers have become reluctant to work in the lion enclosure. A concerned zoo-keeper decides to look into the problem and perform some safety analyses. After carefully recording data from 20 summer interns, she performs a statistical test to see whether the results indicate a systematically increased mortality risk from working in the lion enclosure (see calculations below). This analysis was based on reading highly accessed, peer-reviewed journal articles in psychiatry.
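
To make the dilemma concrete, here is a minimal sketch of the kind of test the zoo-keeper might run. The counts are hypothetical and are not the zoo-keeper's data: it assumes 10 interns were assigned to each enclosure, with 3 fatalities in the lion enclosure and none in the other. The choice of a two-sided Fisher exact test is also an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are (event, no event). The p-value sums the
    probabilities of all tables that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    events = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, events - x) / comb(total, events)

    p_observed = prob(a)
    lowest = max(0, events - row2)
    highest = min(row1, events)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: (deaths, survivors) per enclosure.
lion_deaths, lion_survivors = 3, 7
other_deaths, other_survivors = 0, 10

p_value = fisher_exact_two_sided(lion_deaths, lion_survivors, other_deaths, other_survivors)
print(f"p = {p_value:.3f}")
```

With these assumed counts the p-value is about 0.21, so a strict reading at the 5% level would conclude there is "no significant increase" in mortality from working with lions, even though three interns died there and none died elsewhere.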


The RCT safety fallacy is a pernicious variant of a futile low-powered comparison. A thoughtful discussion of the raw data with minimal statistical analysis is the correct treatment for most safety data.

Suggested reading:
An Extension of the CONSORT Statement
What the courts say…
Here’s what a fully powered safety RCT looks like