System To Generate Semisynthetic Data Sets of Outbreak Clusters for Evaluation of Outbreak-Detection Performance

Christopher A. Cassa,1,2 K. Olson,1,3 K. Mandl1,3
Corresponding author: Christopher Cassa, Massachusetts Institute of Technology, 77 Massachusetts Ave., Rm. E25-519, Cambridge, MA 02139. Telephone: 617-355-2930; Fax: 617-730-0921; E-mail: cassa@mit.edu.

Abstract

Introduction: The outbreak-detection performance of a syndromic surveillance system can be measured in terms of its ability to detect signal (disease outbreak) against background noise (normal variation of baseline disease within a region). However, because a limited number of persons have been infected with agents of biologic terrorism, real outbreak data for these agents are virtually nonexistent. Therefore, simulation is necessary. One approach to evaluation is to present detection algorithms with semisynthetic data sets, which contain simulated signal superimposed on real background noise.

Objectives: The Children's Hospital Informatics Program (CHIP) Cluster Generator automates the creation of spatio-temporal patient cluster data to help evaluate epidemic-detection software. The spatio-temporal data can then be used to analyze the sensitivity and specificity of spatial or temporal detection algorithms.

Methods: A software tool (available at http://www.chip.org/biosurv/resources.htm) was created to generate artificial outbreaks of spatially clustered cases and inject them into background noise. Each cluster is defined by a controlled feature set. Parameters (e.g., outbreak magnitude, duration, temporal progression, and location) can be varied by the user.

Results: The open-source program accepts a valid set of patient test cluster parameters and creates geospatial patient test data for a single cluster or a series of clusters. The tool automates the creation of valid patient data sets for rigorous testing of outbreak-detection algorithms. The tool outputs either a single patient cluster or a series of patient clusters as files containing patient longitude and latitude coordinates. When used with geographic information system software, these clusters can be displayed on a map (Figure). In testing, all generated clusters were properly created within the parameters set at program execution. The cluster generator is in use for rigorous testing of outbreak-detection algorithms.

Conclusions: Automated generation of semisynthetic data sets facilitates evaluation of public health surveillance systems for early detection of outbreaks.

Figure
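The parameterized cluster generation and injection described under Methods can be illustrated with a minimal sketch. This is not the CHIP Cluster Generator's code: the names ClusterParams, generate_cluster, and inject are hypothetical, the two temporal progressions ("flat" and "linear") are assumed examples, and the conversion from kilometers to degrees uses a rough approximation of 111.32 km per degree of latitude.

```python
import math
import random
from dataclasses import dataclass

KM_PER_DEG_LAT = 111.32  # rough approximation, adequate for small clusters


@dataclass
class ClusterParams:
    """Hypothetical parameter set; field names are illustrative only."""
    center_lat: float
    center_lon: float
    radius_km: float        # spatial extent of the cluster
    magnitude: int          # total number of simulated cases
    duration_days: int      # length of the simulated outbreak
    progression: str = "linear"  # assumed options: "flat" or "linear"


def daily_weights(duration_days, progression):
    """Relative share of cases falling on each outbreak day."""
    if progression == "flat":
        return [1.0] * duration_days
    if progression == "linear":
        return [float(day + 1) for day in range(duration_days)]
    raise ValueError(f"unknown progression: {progression}")


def generate_cluster(params, seed=None):
    """Return a list of (day_offset, latitude, longitude) simulated cases."""
    rng = random.Random(seed)
    days = range(params.duration_days)
    weights = daily_weights(params.duration_days, params.progression)
    cos_lat = math.cos(math.radians(params.center_lat))
    cases = []
    for _ in range(params.magnitude):
        day = rng.choices(days, weights=weights)[0]
        # sqrt of a uniform draw spreads points evenly over the disc area
        r = params.radius_km * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        lat = params.center_lat + (r * math.cos(theta)) / KM_PER_DEG_LAT
        lon = params.center_lon + (r * math.sin(theta)) / (KM_PER_DEG_LAT * cos_lat)
        cases.append((day, lat, lon))
    return cases


def inject(background_counts, cases, start_day):
    """Superimpose simulated cases on a copy of real daily background counts."""
    combined = list(background_counts)
    for day, _lat, _lon in cases:
        index = start_day + day
        if 0 <= index < len(combined):
            combined[index] += 1
    return combined


if __name__ == "__main__":
    params = ClusterParams(42.36, -71.09, radius_km=2.0, magnitude=20,
                           duration_days=5)
    cluster = generate_cluster(params, seed=1)
    background = [12, 9, 11, 10, 13, 8, 12, 11, 10, 9]  # made-up counts
    print(inject(background, cluster, start_day=3))
```

A detection algorithm could then be run on the combined series, and its alarms compared against the known injection days to estimate sensitivity and specificity.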
Page converted: 9/14/2004
This page last reviewed 9/14/2004