Methodology
Cluster detection methods applied to the Upper Cape Cod cancer data
Al Ozonoff1, Thomas Webster2, Veronica Vieira2, Janice Weinberg1, David Ozonoff2 and Ann Aschengrau3
1 Department of Biostatistics, Boston University School of Public Health, 715 Albany Street, Boston, MA 02118, USA
2 Department of Environmental Health, Boston University School of Public Health, 715 Albany Street, Boston, MA 02118, USA
3 Department of Epidemiology, Boston University School of Public Health, 715 Albany Street, Boston, MA 02118, USA
Environmental Health: A Global Access Science Source 2005, 4:19
doi:10.1186/1476-069X-4-19
Published: 15 September 2005
Abstract
Background
A variety of statistical methods have been suggested to assess the degree and/or the location of spatial clustering of disease cases. However, there is relatively little in the literature devoted to comparison and critique of different methods. Most of the available comparative studies rely on simulated data rather than real data sets.
Methods
We have chosen three methods currently used for examining spatial disease patterns: the M-statistic of Bonetti and Pagano; the Generalized Additive Model (GAM) method as applied by Webster; and Kulldorff's spatial scan statistic. We apply these statistics to analyze breast cancer data from the Upper Cape Cancer Incidence Study using three different latency assumptions.
Results
The three different latency assumptions produced three different spatial patterns of cases and controls. For 20-year latency, all three methods generally concur. However, under the 15-year latency and no-latency assumptions, the methods produce different results when testing for global clustering.
Conclusion
The comparative analysis of real data sets by different statistical methods provides insight into directions for further research. We suggest a research program in which the examination of real data sets guides focused investigation of relevant features using simulated data, with the aim of understanding how to interpret statistical methods applied to epidemiological data with a spatial component.