Does hormone replacement therapy cause breast cancer? An application of causal principles to three studies. Part 4. The Million Women Study
Samuel Shapiro,1 Richard D T Farmer,2 John C Stevenson,3 Henry G Burger,4 Alfred O Mueck5
Abstract
Background Based principally on findings in three studies, the collaborative reanalysis (CR), the Women’s Health Initiative (WHI) and the Million Women Study (MWS), it is claimed that hormone replacement therapy (HRT) with estrogen plus progestogen (E+P) is now an established cause of breast cancer; the CR and MWS investigators claim that unopposed estrogen therapy (ET) also increases the risk, but to a lesser degree than does E+P. The authors have previously reviewed the findings in the CR and WHI (Parts 1–3).
Objective To evaluate the evidence […]
Methods Using generally accepted causal criteria, in this article (Part 4) the authors evaluate the findings in the MWS for E+P and for ET.
Results Despite the massive size of the MWS the findings for E+P and for ET did not adequately satisfy the criteria of time order, information bias, detection bias, confounding, statistical stability and strength of association, duration-response, internal consistency, external consistency or biological plausibility. Had detection bias resulted in the identification in women aged 50–55 years of 0.3 additional cases of breast cancer in ET users per 1000 per year, or 1.2 in E+P users, it would have nullified the apparent risks reported.
Conclusion HRT may or may not increase the risk of breast cancer, but the MWS did not establish that it does.
[…] Town, Cape Town, South Africa 2Emeritus Professor of […]
Correspondence to Professor Samuel Shapiro, […] Anzio Road, Observatory, Cape Town, South Africa; […]
Background
In Parts 1–3 of this series of articles we […]ological principles of causality1–4 to studies of the risk of breast cancer in users of […] reported from the collaborative reanalysis (CR)5 (Part 1, Reference 6), and the Women’s Health Initiative (WHI)7–18 (Parts 2 and 3, References 19 and 20). In Part 1 we concluded that the CR findings […] (E+P)] did not establish causality. In Part 2 we concluded that the WHI findings for […] contrast, in Part 3 we concluded that valid […] not increase the risk of breast cancer, and may even decrease it; the latter possibility, however, was statistically borderline. […] reported an increased risk of breast cancer in HRT users,21 and based on the com- […] E+P is an established and major cause of […] not the WHI investigators)15–18 claim that ET also increases the risk, although to a […] principles to the evidence from the MWS.21–24 In the MWS the estimated levels of risk associated […] of the impact the study had on regulatory authorities, and on the public perception of safety, it is especially important to evaluate […]
The Million Women Study21–24
[…] mammography at 3-year intervals.21 From May […] investigators sent letters and questionnaires25 […] questionnaires26 were sent 2–3 years after
Shapiro S, Farmer RDT, Stevenson JC, et al. Family Planning (2011). doi: 10.1136/jfprhc-2011-100229
[…] for breast cancer incidence and mortality in National […]
Below, except where otherwise stated, all 95% confidence intervals (CIs) around the relative risk (RR) estimates excluded 1.0, and for convenience they are […]
First report21 (2003)
Among 828 923 postmenopausal women followed for an average of 2.6 years the RRs of invasive breast cancer for current and past users of HRT were 1.66 and 1.01 (95% CI, 0.94–1.09). Among women currently using HRT at baseline the RRs for users of various types of HRT were as follows: ET, 1.30; E+P, 2.00; tibolone, 1.45; other or unknown HRT, 1.44. The difference between E+P and ET was significant (p<0.0001).
For current ET use at baseline the RRs for <5 and ≥5 years’ total duration were 1.21 and 1.34, and for E+P use, 1.70 and 2.21. For ET use the RRs for total durations of <1, 1–4, 5–9 and ≥10 years of use were 0.81 (95% CI, 0.55–1.20), 1.25, 1.32 and 1.37; for E+P use they were 1.45, 1.74, 2.17 and 2.31.
Among women who last used HRT ≤1 year previously the RR was 1.14; for exposures that ended 2–≥10 years previously the RRs approximated unity. The average time to diagnosis was 1.2 years, and within 1.7 years of diagnosis the RR of fatal breast cancer was 1.22.
The investigators estimated that the “use of HRT by UK women aged 50–64 years … resulted in an extra 20 000 incident breast cancers, combined [E+P] accounting for 15 000” of them. They also estimated that HRT would “result in five to six extra cancers per 1000 women with 5 years’ use and 15–19 … per 1000 with 10 years’ use”. They concluded that “current use of HRT is associated with an increased risk of incident and fatal breast cancer” … [which is] … “substantially greater for [E+P] combinations than for other types […]
Second report22 (2004)
Among users of HRT at baseline the RRs at 0.1 (‘screen-detected’), 0.7, 1.5, 2.5 and 3.4 years of follow-up were 1.37, 2.66, 2.16, 1.66 and 1.70, respectively. The average durations of use ranged from 6.1 to 6.9 years. The RRs were higher for E+P than for ET users, and maximal at 0.7 years (ET, 1.72; E+P, 3.31).
For women aged 50–55 years who used HRT for 5 years the estimated absolute risks attributable to ET and E+P use were 1.5 and 6.0 per 1000.
Third report23 (2006)
Among 1 031 224 postmenopausal women followed over 3.6 million woman-years (WY) for the incidence of invasive and in situ breast cancer “the mean time … from … last contact to the end of follow-up was 2.7 years [SD (standard deviation) 1.1]”. “At the time of the analysis follow-up information was available for the first two-thirds of the study population”, and there were “392 341 (38%) women for whom follow-up information [was] included in [the] analysis”.
Among current users of HRT the respective RRs of in situ and invasive breast cancer were 1.55 and 1.74. The RRs were higher for invasive mixed ductal-lobular or tubular tumours (2.13 and 2.66) than for ductal tumours (1.63); the RRs were also higher among E+P than among ET users, but for each type of cancer the RRs did not increase significantly with increasing duration of use. For ductal and lobular tumours the RRs declined with increasing body mass index (BMI) […]
The investigators concluded that “the risks of invasive lobular and tubular cancers associated with current use of both [ET and E+P] are higher than for invasive ductal cancer” and higher for E+P users than […]
Fourth report24 (2011)
Among 1 129 025 postmenopausal women followed until “the end of 2002 … two thirds of the participants had been mailed the second questionnaire and the response was 65%”. During 4.05 million WY of follow-up 15 759 invasive and in situ breast cancers […]
The RRs for current users of HRT, ET, E+P, tibolone and other and unknown HRT were 1.68, 1.38, 1.96, 1.38 and 1.55, respectively, and the estimates were statistically heterogeneous (p<0.001). In the first 2 years after HRT ceased the RR was 1.16, after which the RRs approximated unity. For durations of use of <5 and ≥5 years the respective RRs among ET users were 1.24 and 1.44; among E+P users they were […]
For both ET and E+P users the RRs were lower for breast cancers diagnosed in the first 4 months after recruitment than subsequently [ET, 1.19 and 1.50 (p<0.001); E+P, 1.41 and 2.32 (p<0.001)]. For ET users the RRs of ‘screen-detected’ and ‘non-screen-detected’ cancers were 1.16 and 1.59 (p<0.001); for E+P the corresponding estimates were 1.64 and 2.81 (p<0.001). Those comparisons “should [have included] virtually all breast cancers found at screening soon after the baseline questionnaire was completed”.
For current ET users whose use began <5 and ≥5 years after the menopause, the RRs were 1.43 and 1.05 (p<0.001); for E+P users the estimates were 2.04 and 1.53 (p<0.001). “The proportionate increase in risks of breast cancer associated with use of hormone therapy was greater among lean women than among obese women”, but within BMI strata (≥25 kg/m² and <25 kg/m²) the HRT-associated RRs remained higher for those whose use commenced <5 years after the […]
For both ET and E+P users the RRs declined with increasing tumour grade (Grades I–III): ET, 1.27, 1.16, 0.87 (p<0.001); E+P, 2.42, 1.67, 1.03 (p<0.001).
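The absolute excess risks quoted from the first and second reports can be put on a per-year footing by dividing each cumulative estimate by the years of use. The short Python sketch below does only that arithmetic. It assumes that the excess accrues evenly over the period of use, and the function name, dictionary keys and variable names are illustrative; they are not taken from the MWS reports.

```python
def per_year(excess_per_1000, years_of_use):
    """Average annual excess per 1000 women.

    Assumes the cumulative excess accrues evenly over the years of use,
    which is a simplifying assumption made for illustration only.
    """
    return excess_per_1000 / years_of_use


# Cumulative excess cancers per 1000 users, as quoted from the reports.
# Each value is (low estimate, high estimate, years of use).
reported = {
    "first report: HRT, 5 years of use": (5.0, 6.0, 5),
    "first report: HRT, 10 years of use": (15.0, 19.0, 10),
    "second report: ET, 5 years of use, age 50-55": (1.5, 1.5, 5),
    "second report: E+P, 5 years of use, age 50-55": (6.0, 6.0, 5),
}

# Convert every cumulative estimate to an average annual figure.
annual = {
    label: (per_year(low, years), per_year(high, years))
    for label, (low, high, years) in reported.items()
}

for label, (low, high) in annual.items():
    print(f"{label}: {low:.1f} to {high:.1f} per 1000 women per year")
```

On these figures the average annual excess lies between 0.3 and 1.9 per 1000 women, which is the scale on which any competing explanation, such as detection bias, would have to operate.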
For estrogen receptor (ER)-positive vs ER-negative status the RRs for ET users were 1.76 and 1.29 (p=0.005); for E+P users the estimates were 3.10 and 1.37 (p<0.001). For node-positive vs node-negative tumours among ET users the RRs were 1.19 and 1.09 (p=0.3); among E+P users they were 2.00 and 1.66 […]
The investigators concluded that “risks were substantially greater among users of [E+P] than estrogen only formulations and if hormonal therapy started at or around the time of menopause than later”.
Evaluation of the MWS
Below we evaluate whether the evidence in the MWS accorded with generally accepted principles of causality.1–4 The principles are inter-related, and when […]
Time order
If allowance is made for the time from the diagnosis of breast cancer to its recording in a registry, virtually all the cases identified at 0.1 years of follow-up (HRT: RR, 1.37)22 or at 4 months (ET: RR, 1.19; E+P: RR, 1.41),24 were already present when the women were recruited (see: Detection bias) and time order was violated. In a properly designed cohort study breast cancers already present at baseline should have been excluded.
Time order was further violated in respect of the timing and duration of HRT use. In the third report23 follow-up information on HRT use was unavailable for 62% of the women. In the fourth report,24 by December 2002 the follow-up questionnaire had been received by about 66% of the women, among whom the response rate was 65%. Hence follow-up information on HRT use [and on menopausal status (see: Detection bias) and on confounders (see: Confounding)] was missing for about 57% [(1 – 0.66 × 0.65) × 100] of the women. Following publication of the WHI findings7 there was a rapid and marked decline in the use of HRT.27 For that reason, as well as for other reasons (e.g. HRT-induced breakthrough bleeding),7 since 66% of ever-users of HRT at baseline were current users [our calculation: derived from Figure 1 (current use) and Figure 2 (past use) in Reference 21], a substantial proportion could have become past users by the end of 2002.
How unreliable were the data? Recruitment commenced in 1996 and follow-up ended in December 2002.24 For about 50% of the women the time from last contact to diagnosis was >1.2 years,21 and to the end of follow-up >2.7 years.23 For women enrolled in 1996 that interval could have been as much as 6 years. Thus it is likely that much of what was defined in the analysis as current HRT use became past use during follow-up. In addition, the duration data were incorrect (see: Duration-response), as were the data on menopausal status and confounding (see: Detection […]
Information bias
Information bias in a cohort study is unusual, but it can occur, and in the MWS it was likely. At recruitment HRT users already aware of as yet undiagnosed breast lumps, or of suspect mammographic changes identified before recruitment (see: Detection bias), could have tended to overestimate the total duration of use. Had women who already had breast cancer at baseline been excluded, that bias could largely have been […]
A defect in the study design may also have facilitated the occurrence of information bias. Ethinylestradiol (EE), listed as one of 34 memory-prompts in the questionnaire25 as an HRT preparation, is a synthetic estrogen present exclusively in oral contraceptives. Women who were aware of breast lumps at recruitment, or who had suspect mammographic changes (see: Time order and detection bias), could erroneously have identified EE as HRT. Soon after publication of the MWS report21 the authors stated in an erratum that what was meant by ‘ethinylestradiol’ was ‘estradiol’.28 Yet the error was not corrected in the second questionnaire,26 administered 2–3 years after the first questionnaire.25
Detection bias
The design of a study of the risk of breast cancer in relation to the use of HRT in which the women were recruited from a screening programme guaranteed that it would be biased. By definition, women who decided to have mammograms were alerted to the possibility of breast cancer, as has also been acknowledged in an earlier study based on mammographic screening,29 and concern that HRT may cause the disease has been widespread, and has increased over time. The MWS invitation was explicit in the first questionnaire:25 “We have a unique opportunity … to learn about the way different types of HRT … [affect] a woman’s health, particularly her breasts”. That wording ensured that HRT users already aware of breast lumps, or of suspected breast cancer, would selectively participate […]
There was quantitative evidence of detection bias. First, HRT users were selectively enrolled: 32% of the women who participated and 19% of those who did not were HRT users.30 Second, the data suggested that women already aware of breast lumps, or of suspected breast cancer, tended selectively to participate (see: Time order): whereas the incidence of breast cancer in the MWS population was 2.8 per 1000 WY,31 in the population at large it was 2.0 per 1000 WY.21 Third, the baseline RRs of 1.37 (Reference 22) or 1.41 (Reference 24) (‘screen-detected’ breast cancer) indicated that women who both used HRT and who were also aware of breast lumps, or of suspect lesions, or of suggestive precancerous changes identified in earlier mammograms, were the most likely to participate. Fourth, the average time from recruitment to breast cancer diagnosis was 1.2 years,21 and 1.7 years thereafter the RR of fatal breast cancer was
1.22. An increased risk of fatal cancer among HRT users within 2.9 (1.2+1.7) years of recruitment was not plausible (see: Biological plausibility), and it could have been due to the selective enrolment of HRT users with pre-existing suspected or diagnosed breast cancer. Fifth, the RRs declined with increasing BMI,24 a known risk factor for breast cancer in postmenopausal women (see: Biological plausibility), and the larger the breasts, the less likely was it that otherwise occult breast cancer would selectively have been detected among HRT users.
Detection bias could also have occurred during follow-up, as previously described in our critique of the CR.6 Briefly, HRT users are advised to have regular breast examinations and mammograms, and in the MWS users more frequently underwent mammography than did non-users;30 when mammograms are performed HRT use is routinely recorded, and about 30% of breast cancers actually present go undetected;32 about 5% of postmenopausal women have ‘clinically silent’ breast cancer;33 and HRT diminishes the sensitivity of mammography.32 The mammograms of HRT users could have been more intensively scrutinised than those of non-users, especially if they were radiologically dense, and otherwise occult breast cancer could selectively have been detected among the users.
For both ET users and E+P users the RRs were lower during the first 4 months of follow-up than subsequently.24 The investigators stated that “it has been suggested that part of the increased hormone therapy-associated risk … observed in this study may have resulted from the selective recruitment of hormone therapy users who already had symptoms of breast cancer. If that had happened there would have been a greater hormone therapy-associated excess of breast cancer soon after recruitment than subsequently. However, the opposite was found”. They argued that these findings “largely [reflected] the lower hormone therapy-associated risks observed for screen-detected breast cancers than for non-screen-detected breast cancers”. That claim ignored the likelihood that during follow-up HRT users could more commonly have had repeat mammograms than non-users (see: Duration-response), and because of a further defect in the study design that possibility could not be assessed: information on repeat mammograms was not solicited in the second questionnaire26 (see: Confounding).
There was further evidence to suggest detection bias. The RRs were consistently lower for ET users than for E+P users.21–24 Unopposed ET causes uterine cancer, and ET is preferentially prescribed to hysterectomised women, among whom vaginal bleeding does not occur. By contrast, E+P is preferentially prescribed to women with a uterus, among whom breakthrough bleeding is common;7 and bleeding makes it mandatory to rule out endometrial cancer. HRT users alerted to the risk of that cancer would have become worried about breast cancer as well, and have sought to rule it out. Hence, it was to be expected that detection bias would be greater for E+P users than for ET users.
The RRs for invasive lobular and tubular tumours were higher than for ductal tumours.23 Lobular and tubular tumours are more highly differentiated, smaller, and more slow-growing than ductal tumours,34 35 and the detection of lobular tumours by mammography is also more difficult.36 More intensive scrutiny of mammograms of HRT users than of non-users could have resulted in the selective detection of lobular tumours, especially in radiologically dense mammograms, that might otherwise have gone undetected.
The RR for in situ breast cancer was 1.55.23 In situ tumours are seldom clinically detectable; usually they are identified by mammography, and the investigators acknowledged that detection bias was likely (see: Detection bias). Yet in the fourth report24 in situ and invasive breast cancers were considered together. In that report the RRs were higher if HRT had commenced within 5 years of the menopause than subsequently. However, since the data for in situ breast cancer were biased, the combination of in situ and invasive breast cancer was also biased. In addition, most of the women who were premenopausal at recruitment would have reached the menopause during follow-up, among the 57% of women not followed that information was missing, and there was substantial misclassification of menopausal status, and of the time since menopause […]
The RRs declined with increasing tumour grade, and were higher for ER-positive than for ER-negative tumours, and for node-positive than for node-negative tumours.24 As shown in Table 1 (our calculations: derived from Figures 1 and 3 in Reference 24) unknown values for tumour grade, ER status and nodal status among current users of ET and E+P, and among never-users of HRT, ranged from 49.5% to 74.1%. Such high rates cast doubt on the validity of the evidence. In addition, the declining RRs with increasing tumour grade could have been biased if more common use of mammography by HRT users than by non-users resulted in the selective detection of low-grade tumours; an association with ER-positivity could have occurred if breast cancers in HRT users were more commonly tested, and if ER-positive, more commonly documented in the registries; and the higher RRs for node-positive than for node-negative tumours could readily have been due to detection bias.
How much bias would it have taken to account for the findings? In the first report21 the investigators estimated that among women aged 50–64 years the use of HRT would result in “five to six extra cancers per 1000 women with 5 years’ use and 15–19 … per 1000 with 10 years’ use”. Thus if detection bias resulted in the identification of 1–1.2 (5–6/5) otherwise occult cases each year among 1000 women exposed for 5 years, or 1.5–1.9 (15–19/10) cases each year among women exposed for 10 years, that bias would have nullified
the findings. In the second report,22 among women aged 50–55 years the respective absolute risks for ET or E+P use for 5 years were estimated to be 1.5 and 6.0 per 1000. That is, if detection bias resulted in the identification of 0.3 (1.5/5) additional cases in ET users each year, or 1.2 (6.0/5) additional cases in E+P users, that bias would have nullified the findings. Absolute risks ranging from 0.3 to 1.9 per 1000 women per year could plausibly have been due to detection bias.
Table 1 Breast cancer in the Million Women Study: percentages of unknown values for tumour grade, estrogen receptor status and nodal status among current users of estrogen therapy, current users of estrogen plus progestogen and never-users of […]
*Derived from Figure 1 in Reference 24.
†Derived from Figure 3 in Reference 24.
E+P, estrogen plus progestogen; ER, estrogen receptor; ET, estrogen therapy; HRT, hormone replacement therapy.
Confounding
Confounding was incompletely controlled. In the first,21 second22 and third23 reports the factors allowed for included age, time since menopause, parity, age at first birth, family history, BMI, region and socioeconomic status. In the fourth report24 age at menopause and alcohol consumption were also allowed for. During follow-up factors such as menopausal status, time since menopause, age at menopause and BMI changed, and for about 57–62% of the women the information was missing (see: Time order). In addition, information on the receipt of a mammogram during follow-up was not solicited in the second questionnaire26 (see: Detection bias).
Statistical stability and strength of association
In our critique6 of the CR5 we alluded to the relationship between the statistical stability and strength of any given association: if a RR is ‘large’ (say >5.0), a 95% CI that excludes 1.0 (i.e. a ‘statistically significant’ association) can be documented in a relatively small study. But if a RR is ‘small’ (say <2.0), usually it can only be documented in a massive study. The difficulty, however, is that “if a massive study is sufficiently massive, any deviation of the RR from 1.0, no matter how small, becomes ‘significant’”; but it may be impossible to discriminate among bias, confounding and causation as alternative explanations. By contrast, “in a well-conducted study, when a RR is large, it may be reasonable to judge that it might perhaps be reduced, but not be obliterated, even if it were possible to entirely eliminate all sources of bias and confounding. But if an association is small it may be impossible to judge. In the latter circumstance ‘statistical significance’ may not equate with causality: given a massive amount of data, all that may be accomplished is to rule out chance as one possible explanation, but not bias or […]
In the four reports the highest overall RR for HRT users was 1.74,23 and RRs in excess of 2.0 were identified only in subgroups. For ET users the overall RR was 1.30, and <2.00 in all subgroups. Such small RRs could have been due to bias or confounding. For E+P users the overall RR of 2.00 was again small,21 but significantly higher than the estimate of 1.30 for ET (p<0.0001). Or put another way, the RR for E+P versus ET was 1.54 (2.00/1.30). Such a small association could readily have been biased or confounded (see: Detection bias and confounding), illustrating how in a massive study, virtually any deviation of RR from 1.0, no matter how small, can yield a p value of <0.0001.
Dose/duration-response
Under a promotional hypothesis it might reasonably be expected that the use of HRT would confer a greater risk of breast cancer, the higher the dose or the longer the duration of use (see: Biological plausibility).
Dose-response
Dose-response was not analysed.
Duration-response
In the first report,21 for women who were using HRT at baseline (defined in the MWS as current users) the total duration of use of all episodes of use, current plus past, was analysed. That analysis was incorrect. Since the RR approximated unity within 2 years of stopping,24 the duration of past use was irrelevant, and only the duration of the current episode of use should have been analysed. In addition, the analysis of duration of use, as represented at baseline, misrepresented the actual duration of use, since follow-up information was missing for 62% of the women (see: Time order).
A further defect in the study design made it impossible to analyse the duration of current HRT use among women who used more than one product. In the baseline questionnaire25 five relevant questions were asked: “32. Have you ever used [HRT]?”; “35. For about how many years in total have you used HRT?”; “36. Are you now using HRT?”; “37. What is the name of the most recent HRT you have used?”; and “38. For how many years did you use the most recent type of HRT?”.
Based on questions 32 and 35, among current HRT users at baseline who used more than one product the total duration of ever-use could be analysed, but based on questions 36, 37 and 38 the duration of current HRT use could not be. To illustrate, consider a current HRT user who at baseline had used E+P for 9 years,
and following hysterectomy, ET for 1 year: the current 1-year duration of ET use would have been recorded, but the current 10-year duration of HRT use (E+P, 9 years + ET, 1 year) would not have been.
In the second report22 duration data were not given. In the third report23 “there was no significant difference in the trends in [RR] with duration of use of either type of hormone therapy [ET or E+P] for ductal, tubular or lobular cancer”. In the fourth report24 the RRs for ≥5 years of use of ET and of E+P at baseline were higher than for <5 years of use, and higher for E+P than ET users. Those differences could have been due to detection bias; trends according to durations of <1, 1–4, 5–9 and ≥10 years were not presented; ‘total duration’ of use again referred to all episodes of use, not to current use; and the duration data were again misclassified, because follow-up information on HRT use was missing for 57% of the women.
Finally, the RRs for increasing duration of follow-up were inconsistent with a duration-response effect (see: Internal consistency). Among women who used HRT, ET or E+P at baseline the RRs were highest at 0.7 years of follow-up, after which they declined.22 Yet under causal assumptions, the longer the duration of follow-up, the higher should the RRs have been. A plausible explanation of these inconsistent findings is that violation of time order and detection bias could have been greatest during the first year of follow-up.
Internal consistency
As described above, the RRs according to duration of follow-up were inconsistent (see: Duration-response).
External consistency
For ET users the MWS findings were inconsistent with those of the WHI clinical trial15 16 in which the evidence suggested that unopposed ET does not increase the risk of breast cancer. In the MWS there was quantitative evidence of bias, whereas in the WHI trial women were randomly assigned, ‘double-blind’, to ET or placebo, all participants were hysterectomised, vaginal bleeding did not occur, ‘unblinding’ was seldom necessary, the ‘unblinding’ rate was <2.0%, and there […]
For E+P users the MWS findings were inconsistent with those of the CR. In the MWS the RRs approximated unity within 2 years of stopping HRT;24 in the CR the RR only declined to unity 5 years after […]
Biological plausibility
Elsewhere we have considered relevant pathological and experimental evidence for and against the possibility that HRT may cause breast cancer.6 19 20 Briefly, the hypothesis is not that HRT causes genetic mutation (initiation), but that estrogens, and probably progestogens as well, accelerate the proliferation of otherwise slowly growing malignant cells (promotion). Possible mechanisms are the proliferative effects of estrogens on estrogen-sensitive cells,37 or the excessive metabolism of estrogens to highly active compounds38 with strong proliferative as well as possibly genotoxic effects. However, estrogens also have antiproliferative and pro-apoptotic effects,39 which could possibly reduce the risk of breast cancer. In addition, estrogens can be metabolised not only to potentially genotoxic metabolites, but also to carcino-protective metabolites, such as […]
In short, some mechanisms could possibly increase the risk of breast cancer in HRT users, and other mechanisms could decrease it. However, under a promotional hypothesis, for the most aggressively multiplying cells it is generally accepted that on average it takes at least 10 years to attain a tumour diameter of about 1 cm, which is about the smallest lesion that can be diagnosed clinically.38 In the MWS the average total duration of HRT use at baseline was 6.1–6.9 years,22 and the duration of current use would have been appreciably less. Since the RR approximated unity within 2 years of discontinuing HRT use, among current users of HRT the duration of past use cannot have had any effect. It is implausible that the current use of HRT at baseline for less than 6 years could have increased the risk of breast cancer. It is also implausible that cancer cells, once already promoted, and once already invasive, could have ‘unpromoted’ within 2 years of stopping HRT.24
Obesity is a risk factor for breast cancer in postmenopausal women, perhaps because of increased endogenous estrogen secretion,37 and the RRs declined with increasing BMI.24 Under a causal hypothesis, however, although obesity itself increases the risk of breast cancer, the RRs among HRT users should have been higher than among non-users within strata of BMI, and the decline in the RR was explicable by the diminished sensitivity of mammographic screening with increasing […]
Conclusions
The name ‘Million Women Study’ implies an authority beyond criticism or refutation. Many commentators, and the investigators, have repeatedly stressed that it was the largest study of HRT and breast cancer ever conducted. Yet the validity of any study is dependent on the quality of its design, execution, analysis and interpretation. Size alone does not guarantee that the findings are reliable. The MWS was an observational study, and it had the attendant problems and uncertainties intrinsic to such studies. If the evidence was unreliable, the only effect of its massive size would have been to confer spurious statistical authority to […]
Here we conclude that the evidence in the MWS was indeed unreliable. There were defects in the study design, and the findings did not adequately satisfy the principles of causation. In terms of time order, information bias, detection bias, confounding, statistical
stability and strength of association, dose/duration-response, internal consistency, external consistency and biological plausibility the study was defective. HRT may or may not increase the risk of breast cancer, but the MWS did not establish that it does.
Acknowledgement The authors thank Helen Seaman […]
Competing interests Samuel Shapiro, John Stevenson, Henry Burger, and Alfred Mueck presently consult, and in the past have consulted, with manufacturers of products discussed in this article. Richard Farmer has consulted with manufacturers in the past.
Provenance and peer review Not commissioned; […]
References
1 Hill AB. The environment and disease: association or causation? Proc R Soc Med 1965;58:295–300.
2 Susser M. Causal Thinking in the Health Sciences. New York, […]
3 US Department of Health, Education and Welfare: Public Health Service. Smoking and Health: Report of the Advisory Committee to the Surgeon General of the Public Health Service (Public Health Service Publication No. 11030). Washington, DC: US […]
4 Susser M. What is a cause and how do we know one? A grammar for pragmatic epidemiology. Am J Epidemiol 1991;133:635–648.
5 Collaborative Group on Hormonal Factors in Breast Cancer. Breast cancer and hormone replacement therapy: collaborative reanalysis of data from 51 epidemiological studies of 52 705 women with breast cancer and 108 411 women without breast cancer. Lancet 1997;350:1047–1059.
6 Shapiro S, Farmer RD, Seaman H, et al. Does hormone replacement therapy cause breast cancer? An application of […]
[…]
14 Prentice RL, Manson JE, Langer RD, et al. Benefits and risks of postmenopausal hormone therapy when it is initiated soon after menopause. Am J Epidemiol 2009;170:12–23.
15 The Women’s Health Initiative Steering Committee. Effect of conjugated equine estrogen in postmenopausal women with hysterectomy: the Women’s Health Initiative randomized controlled trial. JAMA 2004;291:1701–1712.
16 Stefanick ML, Anderson GL, Margolis KL, et al. Effects of conjugated equine estrogens on breast cancer and mammography screening in postmenopausal women with hysterectomy. JAMA 2006;295:1647–1657.
17 LaCroix AZ, Chlebowski RT, Manson JE, et al. Health outcomes after stopping conjugated equine estrogens among postmenopausal women with prior hysterectomy: a randomized controlled trial. JAMA 2011;305:1305–1314.
18 Prentice RL, Chlebowski RT, Stefanick ML, et al. Conjugated equine estrogens and breast cancer risk in the Women’s Health Initiative clinical trial and observational study. Am J Epidemiol 2008;167:1407–1415.
19 Shapiro S, Farmer RDT, Mueck AO, et al. Does hormone replacement therapy cause breast cancer? An application of causal principles to three studies. Part 2. The Women’s Health Initiative: estrogen plus progestogen. J Fam Plann Reprod Health Care 2011;37:165–172.
20 Shapiro S, Farmer RDT, Seaman H, et al. Does hormone replacement therapy cause breast cancer? An application of causal principles to three studies. Part 3. The Women’s Health Initiative: unopposed estrogen. J Fam Plann Reprod Health Care 2011;37:225–230.
21 Million Women Study Collaborators. Breast cancer and hormone replacement therapy in the Million Women Study. Lancet 2003;362:419–427.
22 Beral V, Banks E, Reeves G, et al. The effect of hormone therapy on breast and other cancers. In: Critchley H, Gebbie A, Beral V (eds). Menopause and Hormone Replacement. London: […]
23 Reeves GK, Beral V, Green J, et al. Hormonal therapy for
causal principles to three studies: Part 1. The Collaborative
menopause and breast-cancer risk by histological type: a cohort
Reanalysis. J Fam Plann Reprod Health Care 2011;37:103–109.
study and meta-analysis. Lancet Oncol 2006;7:910–918.
7 Writing Group for the Women’s Health Initiative
24 BeralV, Reeves G, Bull D, et al. Breast cancer risk in relation to Investigators. Risks and benefits of estrogen plus progestin
the interval between menopause and starting hormone therapy.
in healthy postmenopausal women. Principal results from the
J Natl Cancer Inst 2011;103:1–10.
Women’s Health Initiative randomized controlled trial. JAMA
25 The Million Women Study: A National Survey of Women
2002;288:321–333.
Invited for Breast Screening. First questionnaire. http://
8 ChlebowskiRT, Hendrix SL, Langer RD, et al. Influence of
www.millionwomenstudy.org/files/mws-web1.pdf [accessed
estrogen plus progestin on breast cancer and mammography in
healthy postmenopausal women: the Women’s Health Initiative
26 The Million Women Study: Confidential National Study
Randomized Trial. JAMA 2003;289:3243–3253.
of Women’s Health. Second questionnaire. http://www.
9 AndersonGL, Chlebowski RT, Rossouw JE, et al. Prior
millionwomenstudy.org/files/mws-web2.pdf [accessed
hormone therapy and breast cancer risk in the Women’s Health
Initiative randomized trial of estrogen plus progestin. Maturitas
27. GompelA, Plu-Bureau G. Is the decrease in breast cancer
2006;55:103–115.
incidence related to a decrease in postmenopausal hormone
10 HeissG, Wallace R, Anderson GL, et al. Health risks and
therapy? Ann N Y Acad Sci 2010;1205:268–276.
benefits 3 years after stopping randomized treatment with
28 Million Women Study Collaborators. Errata. Lancet
estrogen and progestin. JAMA 2008;299:1036–1045.
2003;362:1160.
11 ChlebowskiRT, Anderson GL, Gass M, et al. Estrogen
29 MorrisonAS, Brisson J, Khalid N. Breast cancer incidence and
plus progestin and breast cancer incidence and mortality in
mortality in the breast cancer detection demonstration project.
postmenopausal women. JAMA 2010;304:1684–1692. J Natl Cancer Inst 1988;80:1540–1547.
12 PrenticeRL, Chlebowski RT, Stefanick ML, et al. Estrogen plus
30 Banks E, Beral V, Cameron R, et al. Comparison of various
progestin therapy and breast cancer in recently postmenopausal
characteristics of women who do and do not attend for breast
women. Am J Epidemiol 2008;167:1207–1216.
cancer screening. Breast Cancer Res 2002;4:R1.
13 ChlebowskiRT, Kuller LH, Prentice RL, et al. Breast cancer
31 ShapiroS. The Million Women Study: potential biases do
after use of estrogen plus progestin in postmenopausal women.
not allow uncritical acceptance of the data. ClimactericN Engl J Med 2009;360:573–587.
2004;7:3–7.
Shapiro S, Farmer RDT, Stevenson JC, et al.Family Planning (2011). doi: 1136/jfprhc-2011-100229
32 Greendale GA, Reboussin BA, Sie A, et al. Effects of estrogen and estrogen-progestin on mammographic parenchymal density. Postmenopausal Estrogen/Progestin Interventions (PEPI) Investigators. Ann Intern Med 1999;130:262–269.
33 Welch HG, Black WC. Using autopsy series to estimate the disease “reservoir” for ductal carcinoma in situ of the breast: how much more breast cancer can we find? Ann Intern Med 1997;127:1023–1028.
34 Li CI, Moe RE, Daling JR. Risk of mortality by histologic type of breast cancer among women aged 50 to 79 years. Arch Intern Med 2003;163:2149–2153.
35 Porter PL, El-Bastawissi AY, Mandelson MT, et al. Breast tumor characteristics as predictors of mammographic detection: comparison of interval- and screen-detected cancers. J Natl Cancer Inst 1999;91:2020–2028.
36 Krecke KN, Gisvold JJ. Invasive lobular carcinoma of the breast: mammographic findings and extent of disease at diagnosis in 184 patients. AJR Am J Roentgenol 1993;161:957–960.
37 International Agency for Research on Cancer (IARC). IARC Monographs on the Evaluation of Carcinogenic Risks to Humans (Volume 72): Hormonal Contraception and Postmenopausal Hormone Therapy. Lyon, France: IARC,
38 Seeger H, Wallwiener D, Kraemer E, et al. Comparison of possible carcinogenic estradiol metabolites: effects on proliferation, apoptosis and metastasis of human breast cancer cells. Maturitas 2006;54:72–77.
39 Dietel M, Lewis MA, Shapiro S. Hormone replacement therapy: pathobiological aspects of hormone-sensitive cancers in women relevant to epidemiological studies on HRT: a mini-review. Hum Reprod 2005;20:2052–2060.
40 Seeger MH. 2-Methoxyestradiol – biology and mechanism of action. Steroids 2010;75:625–631.