Volume 2: No. 3, July 2005
ORIGINAL RESEARCH
Figure 2. Main sections of COMPURSE program for selecting parameters and for computing relative standard errors (RSEs), standard errors (SEs), and confidence intervals (CIs) for annual totals and average annual totals of multiple-year summaries.
The second part of the COMPURSE program is a SAS program (version 6.12 or later) that searches for the appropriate parameter from the transformed tables and calculates the corresponding point estimates, SEs, and CIs by year, type of statistic, hospital and demographic category, and characteristics or groups (Figure 2). Depending on year, the program distinguishes the kind of RSE parameters (percentages or function coefficients) and the characteristics with specific parameters from those without specific parameters (e.g., “ALLOTHER” before 1988 or “TOTAL” during or after 1988).
COMPURSE merges user-specified data with the corresponding RSE parameter tables to look up specific values. If the survey year, type of statistic, category, and characteristic (group) within a category in the user-specified data match those in the RSE parameter tables, the program selects the corresponding pair of parameters. For years before 1988, COMPURSE linearly interpolates between the RSE percentage values for the listed estimates immediately above and below the user-specified weighted estimate (ESTINUM) (5). For 1988 and later, COMPURSE calculates the SEs and CIs directly from the function coefficients selected during the table lookup for the user-specified weighted estimate (ESTINUM) (5).
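The lookup step can be illustrated with a minimal sketch. It is written in Python rather than SAS, purely for illustration, and is not the COMPURSE code. The function names, the dictionary layout, the "65+" and "15-44" labels, and the coefficient values are all hypothetical. The fallback to “ALLOTHER” or “TOTAL” is one reading of how characteristics without specific parameters are handled.

```python
from typing import Dict, Tuple

# Key: (survey year, type of statistic, category, characteristic).
Key = Tuple[int, str, str, str]


def parameter_kind(year: int) -> str:
    """RSE percentages before 1988, function coefficients from 1988 on."""
    return "percentages" if year < 1988 else "coefficients"


def fallback_characteristic(year: int) -> str:
    """Catch-all row named in the text for each period (assumed usage)."""
    return "ALLOTHER" if year < 1988 else "TOTAL"


def find_parameters(table: Dict[Key, tuple], year: int, outcome: str,
                    category: str, characteristic: str) -> Tuple[Key, tuple]:
    """Return the matching key and its parameters, using the catch-all
    row when the characteristic has no parameters of its own."""
    exact = (year, outcome, category, characteristic)
    if exact in table:
        return exact, table[exact]
    generic = (year, outcome, category, fallback_characteristic(year))
    if generic in table:
        return generic, table[generic]
    raise KeyError(f"no RSE parameters for {exact}")


# Made-up coefficient values for illustration only; not NCHS figures.
demo_table = {
    (1995, "FDX", "AGE", "65+"): (0.002, 300.0),
    (1995, "FDX", "AGE", "TOTAL"): (0.003, 350.0),
}
print(find_parameters(demo_table, 1995, "FDX", "AGE", "65+"))
print(find_parameters(demo_table, 1995, "FDX", "AGE", "15-44"))
```

Keying the table on all four matching fields mirrors the merge condition described above: a parameter pair is selected only when year, type of statistic, category, and characteristic all agree.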
For annual totals with specified characteristics, COMPURSE can output the number, rate, and percentage of hospital discharges with their corresponding SEs and CIs (Appendix A). COMPURSE also provides another option to compute average annual totals for multiple years and their SEs and CIs (based on the third set of transposed parameter tables for years before 1988 or the function coefficients for 1988 and thereafter). The methods for computing these latter multiple-year averages are described in the NCHS documentation for the NHDS 1979–2000 data (5).
Figure 3. Main components of the user interface program for COMPURSE.
The third part of the program, the user interface, allows the user to define the time period for multiple-year summaries, to supply the normal deviate corresponding to the significance level for the CIs, to choose units for expressing the rate and the number of hospital discharges, and to provide other parts of the program with the location of files and the type of data input and output (Figure 3; Appendix B).
We tested the COMPURSE program with three data sets extracted from the NHDS, one from a publication (6) and the other two from projects on which the first author is working. The COMPURSE program was designed to perform a statistical analysis once for each disease or disease group. Analyzing multiple diseases in one program run requires the addition of SAS macro statements to the interface program. With the NHDS data cited (6), we tested the first 25 of 50 diseases in the year 2000. The estimates and their SEs computed with the COMPURSE program overall and by four age groups are compatible with those published by Hall and Owings (6) using SUDAAN software (Table 3). The COMPURSE results for annual and multiple-year summaries from 1988 to 2000 for arthritis and multiple-year summaries from 1979 to 2000 for epilepsy or seizure disorders are also compatible with manual computations (data not shown).
When reporting results from this program, the user should consider NCHS guidelines for reporting NHDS estimates. Because of the complex sample design of the NHDS, the NCHS recommends the following: 1) if an estimate is based on 29 or fewer unweighted sampled discharges, the value of the estimate should not be reported; 2) if this number is from 30 through 59, the value of the estimate may be reported but should not be considered reliable; 3) if this number is 60 or more, and if the RSE is less than 30%, the value of the estimate is reliable and may be reported; and 4) if the RSE of any estimate exceeds 30%, no matter what the number in the unweighted sample is, this estimate is unreliable and should not be reported. The NCHS further indicates that the user of the data should decide whether or not to report an estimate. However, if the user chooses to report an unreliable estimate, the user must inform the consumer (for example, a reader or a policy maker) that the estimate is unreliable (5).
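The four reporting rules above can be written as a small decision function. This is a sketch in Python, not part of COMPURSE. The function name and the returned labels are hypothetical. The guidelines as stated do not cover an RSE of exactly 30% with 60 or more sampled discharges, so the sketch flags that case rather than guessing.

```python
def nhds_reporting_status(unweighted_n: int, rse_percent: float) -> str:
    """Classify an estimate using the NCHS reporting rules quoted in the text."""
    if unweighted_n <= 29:
        # Rule 1: too few sampled discharges to report.
        return "do not report"
    if rse_percent > 30:
        # Rule 4: RSE above 30%, whatever the unweighted count.
        return "unreliable; should not be reported"
    if unweighted_n <= 59:
        # Rule 2: 30 through 59 sampled discharges.
        return "may be reported; not reliable"
    if rse_percent < 30:
        # Rule 3: 60 or more sampled discharges and RSE below 30%.
        return "reliable"
    # RSE exactly 30% with 60 or more discharges is not covered by the rules.
    return "not specified by the guidelines"


for n, rse in [(25, 10.0), (45, 12.0), (120, 8.5), (120, 42.0)]:
    print(n, rse, nhds_reporting_status(n, rse))
```

The count check comes first because rule 1 withholds the estimate regardless of its RSE. Rule 4 is checked next because it applies to every remaining count.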
If the overall number of hospital discharges for a disease of interest is small, the RSE may be relatively large. To reduce such large RSEs, the data analyst can aggregate multiple years of data to increase the number in the unweighted sample. However, such aggregation may defeat the purpose of the analysis (e.g., looking for time trends).
Finally, computations of RSEs, SEs, and CIs cannot be applied to subgroups that combine different demographic groups (e.g., white males, black females). Computations can only be applied to single-category groups such as only whites or only males (MF Owings, NCHS, written communication, May 2003).
COMPURSE was programmed based on the National Hospital Discharge Survey 1979–2000 Multi-Year Public-Use Data File Documentation (5). However, it can also be used for data after 2000 as long as the RSE parameter table for these years is transposed and added to the new RSE parameter tables. Because there are no error curves for NHDS data from 1965 to 1978, the COMPURSE program is unusable for data in these years. The 1979–2000 transposed parameter tables, the COMPURSE program, and the data interface program described in this article are available from the first author, who will update and transpose parameter tables issued by NCHS for years after 2000. The program will be updated and modified to account for any discrepancies found in the future. Users who identify problems with the program or incorrect results should contact the first author.
Many thanks go to Dr David Thurman and Dr Charles Helmick from the Centers for Disease Control and Prevention (CDC) for providing information and support; to Maria F. Owings from the NCHS for her review, discussion, and suggestions on the paper draft; and to Fredrick L. Hull, the CDC editor who improved the readability of this paper.
Corresponding Author: Yao-Hua Luo, PhD, Division of Adult and Community Health, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, 4770 Buford Hwy, Mail Stop K-51, Atlanta, GA 30341. Telephone: 770-488-5136. E-mail: ycl3@cdc.gov.
Author Affiliations: Matthew Zack, MD, Division of Adult and Community Health, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, Ga.
Table 2. Examples of Relative Standard Error (RSE) Parameter Tables Transformed by COMPURSE Program Using Data From National Hospital Discharge Survey, 1979–2000
Table 3. Calculations Using COMPURSE Compared With Calculations Using SUDAAN
Appendices
Appendix A. Calculating Annual Totals for a User-specified Weighted Estimate
The procedure and equations to calculate the RSE, SE, and CI of the annual total for a weighted estimate (ESTINUM) specified by the user are as follows:
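For years before 1988, one reading of the symbol definitions in this appendix is the following sketch. It is a hedged reconstruction rather than the authors' published equations. It assumes that the listed RSE values a and b are percentages and that the interpolation is carried out on the SEs of the two bracketing listed estimates.

```latex
\begin{aligned}
SE_{a1} &= \frac{a}{100}\,A_1, \qquad SE_{a2} = \frac{b}{100}\,A_2,\\
P_s &= \frac{\mathrm{ESTINUM} - A_1}{A_2 - A_1},\\
SE &= SE_{a1} + P_s\,\bigl(SE_{a2} - SE_{a1}\bigr),\\
RSE &= \frac{SE}{\mathrm{ESTINUM}} \times 100\%,\\
CI &= \mathrm{ESTINUM} \pm t \times SE
\end{aligned}
```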
where A1 and A2 are the listed estimates in the RSE table immediately below and above the weighted estimate (ESTINUM); a and b are the RSE values corresponding to these two listed estimates; Ps is the ratio of the difference between the weighted estimate and the lower listed estimate to the difference between the higher and lower listed estimates; SE_a1 and SE_a2 are the SEs for the lower and higher listed estimates, respectively; SE is the standard error of the user-specified weighted estimate; RSE is the relative standard error; CI is the confidence interval; and t-value is the t value at the given significance level.
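For 1988 and later, the RSE is obtained from a function of two coefficients rather than from tabulated percentages. The following sketch assumes the functional form commonly given in the NHDS documentation (5) for aggregate estimates, with the RSE expressed as a proportion rather than a percentage. It should be checked against that documentation before use.

```latex
\begin{aligned}
RSE &= \sqrt{a + \frac{b}{\mathrm{ESTINUM}}},\\
SE &= RSE \times \mathrm{ESTINUM},\\
CI &= \mathrm{ESTINUM} \pm t \times SE
\end{aligned}
```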
where a and b are coefficients listed in the RSE parameter tables, and the other components are the same as those in the previous equation.
Appendix B. Program Instruction
For annual totals with specified characteristics, COMPURSE can output the number, rate, and percentage of hospital discharges with their corresponding SEs and CIs. COMPURSE also provides another option to compute average annual totals for multiple years and their SEs and CIs (based on the third set of transposed parameter tables for years before 1988 or the function coefficients for 1988 and thereafter). The methods for computing these latter multiple-year averages are described in the NCHS documentation for the NHDS 1979–2000 data (5).
The COMPURSE package includes three parts: 1) three sets of transposed parameter tables and the example reference table of variables (Table 2) for data preparation; 2) the COMPURSE program for parameter retrieval and confidence interval calculation (Figure 2); and 3) the interface program (Figure 3), which the user modifies to provide information to run the other two parts of the package.
First, the user should copy all three parts of the COMPURSE package to the user’s computer. Specifically, the user should save, without changes, the transformed parameter tables and the COMPURSE program in a directory on the computer hard drive. The user should save a copy of the interface program under a new name for the current analysis and leave the original to be copied for the next analysis.
Second, the user should change this new copy of the interface program to define the computer directory path where the RSE parameter tables and the COMPURSE program for CI computations are located. In this new copy, the user should also change the statements in brackets (optionally typing in appropriate words) and the sentences ending in ellipses. For example, the user may need to set up yearly groups in the VALUE statement of PROC FORMAT, to adapt statements for the new copy of the interface program, or to combine multiple values of the characteristics variable into a new value (cf. Figure 3). The remaining steps are as follows.
Third, the user must specify which of three time periods should be analyzed: 1979–1987, 1988–2000, or 1979–2000. For multiple-year summaries, the time period specified should be the same for the hospitals sampled and for the data selected. The COMPURSE program will compute SEs and CIs for average annual totals for such summaries even if the time period spans the 1987–1988 transition.
Fourth, the user can specify a confidence level for the CI different from the default (95%) by typing an ampersand (&) and either of two other options, t90 (90% level) or t99 (99% level), after the macro variable, &tt. The user can also specify different options for saving the output results of hospital discharges and for changing the magnitude of rates by selecting denominators for the rate (DNR&[any of the listed names of the numbers]), for the number (DNO&[any of the listed names of the numbers]), or for both.
Fifth, the user should input the extracted data, with the following restriction: the COMPURSE program processes the relevant estimates, their SEs, and their CIs for only one disease or disease group at a time; however, the user may write a macro in the user interface program to compute SEs and CIs for more than one disease or disease group at a time. The user’s extracted data should include both diseases and years of interest from the NHDS CD-ROM (1979–2000) and the annual weighted sample estimates of hospital discharges by specified characteristics in separate external files accessible by SAS. These data may be input through SAS and should include two more variables. The first identifies the type of statistic of interest: first-listed diagnosis, all-listed diagnoses, procedure, or days of care. The second defines the category for the characteristics. The RSE parameter table (Table 2) will be helpful for selecting these types of statistic and characteristics.
For example, if the user were interested in the first-listed diagnosis of hospitalizations among those aged >15 years, the value of the first variable, the type of statistic (OUTCOME), should be “FDX” (first-listed diagnosis), and the value of the second variable, the category variable (CATE), should be “AGE.”
Although the user can name variables in the input file in any desired way, the program uses a standard set of variable names. The user should type the data variable names on the right side of the assignment statements for the type of statistic and their characteristics (the words in the brackets of the interface program; cf. Figure 3). The program uses the standard variable names on the left side of these assignment statements. For example, if the variable name representing the survey year in the user’s data is “YR,” the user should fill in the bracket on the right side of the assignment statement with YR. If the variable name representing the weighted estimate of hospitalizations in the user’s data is “weitnum,” the user should fill in the bracket on the right side of the assignment statement with WEITNUM. If the user does not want to compute a rate, the user can fill in the bracket with a period (SAS’s missing value indicator).
The unweighted estimate (unweighted number of discharges) is named CASENUM in the program. The user should calculate this number and include it in the input file. The CIs are computed only if the unweighted estimate is >30. If some characteristics have too few hospital discharges to be observed individually, they can be summarized as a new value under the variable name CHARACTE (characteristics). The new character value should not have the same name as any of the other values in the assignment statements mentioned previously. Finally, the user should remove all the brackets from the modified copy of the user interface program before running the program.
After specifying the NCHS data set of interest and any of the previous options, the user can submit the copy of the user interface program through SAS. This interface program in turn calls the COMPURSE program through the %INCLUDE &COMPURSE statement to calculate the estimates and their RSEs, SEs, and CIs.
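The effect of the confidence-level option and of the CASENUM restriction described in this appendix can be sketched as follows. The sketch is in Python for illustration only and is not the SAS interface program. The function names, the user variable name NCASES, and the key "t95" for the default level are hypothetical. The deviates are the usual two-sided standard normal values.

```python
from typing import Dict, Optional, Tuple

# Two-sided standard normal deviates. The keys t90 and t99 mirror the option
# names in the text; "t95" as the label for the default is an assumption.
NORMAL_DEVIATE = {"t90": 1.645, "t95": 1.960, "t99": 2.576}

# Hypothetical mapping from a user's variable names to the standard names.
USER_TO_STANDARD = {"WEITNUM": "ESTINUM", "NCASES": "CASENUM"}


def standardize(record: Dict[str, float]) -> Dict[str, float]:
    """Rename user variables to the program's standard variable names."""
    return {USER_TO_STANDARD.get(name, name): value
            for name, value in record.items()}


def confidence_interval(estinum: float, se: float, casenum: int,
                        level: str = "t95") -> Optional[Tuple[float, float]]:
    """Return (lower, upper), or None when CASENUM is 30 or fewer."""
    if casenum <= 30:
        return None
    z = NORMAL_DEVIATE[level]
    return (estinum - z * se, estinum + z * se)


row = standardize({"WEITNUM": 1000.0, "NCASES": 80})
# The SE of 100.0 is an illustrative value, not an NHDS result.
print(confidence_interval(row["ESTINUM"], 100.0, row["CASENUM"]))
print(confidence_interval(row["ESTINUM"], 100.0, row["CASENUM"], "t99"))
```

Returning `None` for small samples keeps the "no CI computed" case explicit, so a caller cannot mistake it for a valid interval.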
The opinions expressed by authors contributing to this journal do not necessarily reflect the opinions of the U.S. Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors’ affiliated institutions. Use of trade names is for identification only and does not imply endorsement by any of the groups named above.
This page last reviewed March 30, 2012