"from the FDA all publicly releasable information about the clinical trials for efficacy conducted for marketing approval of fluoxetine, venlafaxine, nefazodone, paroxetine, sertraline, and citalopram, the six most widely prescribed antidepressants approved between 1987 and 1999".So it looks at the evidence available to the FDA at the time it licenced these SSRIs (they aren't actually all SSRIs), but not necessarily all the evidence that is in fact available on these SSRIs, in particular most studies were only for six weeks and despite their conclusions about mild depression only one study actually looked at mild depression, the authors reach their conclusions primarily by extrapolating a regression line.
What they find, in summary, is that there is in fact a statistically significant benefit of the SSRIs over placebo, but that this difference falls below the criteria that NICE use to determine clinical significance*. They also find that efficacy (relative to placebo) increases as the severity of depression increases, reaching NICE's criteria in severe depression (see their Figure 2, or here).
They make something of the fact that the greater difference in severe depression is driven by a reduction in the efficacy of placebo, but that seems neither here nor there really - in fact it suggests that the very high placebo response rate for less severe forms of depression is masking the response to SSRIs (a problem well known in other areas such as low back pain).
It is worth noticing that NICE already recommends that:
"In mild and moderate depression, consider psychological treatment specifically focused on depression (problem-solving therapy, brief CBT and counselling) of 6 to 8 sessions over 10 to 12 weeks...Antidepressants are not recommended for the initial treatment of mild depression, because the risk–benefit ratio is poor."Although they also recommend:
"In moderate depression, offer antidepressant medication to all patients routinely, before psychological interventions...CBT is the psychological treatment of choice. Consider interpersonal psychotherapy (IPT) if the patient expresses a preference for it or if you think the patient may benefit from it...For patients who have not made an adequate response to other treatments for depression (for example, antidepressants and brief psychological interventions), consider giving a course of CBT of 16 to 20 sessions over 6 to 9 months."There was some mention in the news coverage this morning that 'talking' therapies would be a better idea instead of SSRIs. I am always interested in this view, which is very common, because there is little evidence that talking therapies, and in particular the best studied therapy CBT, are any better than medication or any cheaper (which isn't to say we don't need more clinical psychologists).
* It is worth noting here, I think, that there is a difference between something being licensed because it is relatively safe and effective, and thus a permitted drug (which is what the FDA in the US or the MHRA in the UK decide), and something being cost-effective (which is what NICE seeks to determine for the NHS in the UK).
The NICE criteria are:
"For continuous outcomes for which an SMD [standardised mean difference] was calculated (for example, when data from different versions of a scale are combined), an effect size of ~0.5 (a ‘medium’ effect size (Cohen, 1988)) or higher was considered clinically significant. Where a WMD [weighted mean difference] was calculated, a between group difference of at least 3 points (2 points for treatment-resistant depression) was considered clinically significant for both BDI and HRSD [Hamilton Rating Scale for Depression]...Where an ES [effect size] was statistically significant, but not clinically significant and the CI [confidence interval] excluded values judged clinically important, the result was characterised as ‘unlikely to be clinically significant’ (S3). Alternatively, if the CI included clinically important values, the result was characterised as ‘insufficient to determine clinical significance’ (S6)."And NICE found that:
"There is evidence suggesting that there is a statistically significant difference favouring SSRIs over placebo on reducing depression symptoms as measured by the HRSD but the size of this difference is unlikely to be of clinical significance (N= 16; n= 2223; Random effects SMD= -0.34; 95% CI, -0.47 to -0.22).So NICE's findings were not all that dissimilar to those of this study:
In moderate depression there is evidence suggesting that there is a statistically significant difference favouring SSRIs over placebo on reducing depression symptoms as measured by the HRSD but the size of this difference is unlikely to be of clinical significance (N= 2; n= 386; SMD= -0.28; 95% CI, -0.48 to -0.08).
In severe depression there is some evidence suggesting that there is a clinically significant difference favouring SSRIs over placebo on reducing depression symptoms as measured by the HRSD (N= 4; n= 344; SMD= -0.61; 95% CI, -0.83 to -0.4).
In very severe depression there is evidence suggesting that there is a statistically significant difference favouring SSRIs over placebo on reducing depression symptoms, as measured by the HRSD, but the size of this difference is unlikely to be of clinical significance (N= 5; n= 726; SMD= -0.39; 95% CI, -0.54 to -0.24)."
"weighted mean improvement was 9.60 points on the HRSD in the drug groups and 7.80 in the placebo groups, yielding a mean drug–placebo difference of 1.80 on HRSD improvement scores...the standardized mean difference, d, mean change for drug groups was 1.24 and that for placebo 0.92, both of extremely large magnitude according to conventional standards. Thus, the difference between improvement in the drug groups and improvement in the placebo groups was 0.32, which falls below the 0.50 standardized mean difference criterion that NICE suggested."Of course the Cohen medium effect size criteria are completely arbitrary (even if not unreasonable; but see here for a discussion of whether Kirsch et al actually measured a true Cohen d effect size) and note that NICE is a lot more circumspect in dealing with statistically significant differences that they do not deem to be clinically significant than the authors of the PLoS study.
Note too that NICE found:
"There is strong evidence suggesting that there is a clinically significant difference favouring SSRIs over placebo on increasing the likelihood of patients achieving a 50% reduction in depression symptoms as measured by the HRSD (N = 1742; n = 3143; RR = 0.73; 95% CI, 0.69 to 0.78)."
And the PLoS study does not address dichotomous results such as 50% reductions on the HRSD, only average changes in the HRSD score.
Turner & Rosenthal (from the NEJM paper on selective publication in antidepressant trials) have an interesting editorial on this topic in the BMJ, where they say:
"Clinical significance is an important concept because a clinical trial can show superiority of a drug to placebo in a way that is statistically, but not clinically, significant. Tests of statistical significance give a yes or no answer (for example, P<0.05 significant, P>0.05 non-significant) that tells us whether the true effect size is zero or not, but it tells us nothing about the size of the effect.3 In contrast, effect size does, and thus allows us to look at the question of clinical significance. Values of 0.2, 0.5, and 0.8 were proposed to represent small, medium, and large effects, respectively.4
NICE chose the "medium" value of 0.5 as a cut-off below which they deem benefit of a drug not clinically significant.5 This is problematic because it transforms effect size, a continuous measure, into a yes or no measure, thereby suggesting that drug efficacy is either totally present or absent, even when comparing values as close together as 0.51 and 0.49. Kirsch and colleagues compared their effect size of 0.32 to the 0.50 cut-off and concluded that the benefits of antidepressant drugs were of no clinical significance.
But on what basis did NICE adopt the 0.5 value as a cut-off? When Cohen first proposed these landmark effect size values, he wrote, "The terms ‘small’, ‘medium’, and ‘large’ are relative . . . to each other . . . the definitions are arbitrary . . . these proposed conventions were set forth throughout with much diffidence, qualifications, and invitations not to employ them if possible." He also said, "The values chosen had no more reliable a basis than my own intuition." Thus, it seems doubtful that he would have endorsed NICE’s use of an effect size of 0.5 as a litmus test for drug efficacy."
Interestingly, even Moncrieff & Kirsch say:
"No research evidence or consensus is available about what constitutes a clinically meaningful difference in Hamilton scores, but it seems unlikely that a difference of less than 2 points could be considered meaningful. NICE required a difference of at least 3 points as the criterion for clinical importance but gave no justification for this figure."