Data validation process
To have their data included in CDC's published ART success rates, clinics must submit their data to the National ART Surveillance System (NASS) by the deadline announced by CDC each year. Clinic medical directors must verify by signature that the generated clinic tables are accurate. The data are then reviewed, and clinics are contacted if corrections are necessary. After the data have been verified, a quality control process called validation begins.
In recent years, about 35 reporting clinics (7% to 10% of all reporting clinics) have been selected for validation each year using a stratified random sampling approach. With this method, clinics are stratified by total annual cycle count, and larger clinics have a greater chance of selection. In most cases, clinics that were selected for validation during the past 3 years are excluded from selection. However, some clinics may be designated for return validation based on previous validation results.
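The selection step described above can be sketched in a few lines of Python. This is only an illustration, not CDC's procedure: the size cut points, the sampling fractions, the `Clinic` fields, and the choice to include return-validation clinics outright are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clinic:
    clinic_id: str
    annual_cycles: int                         # total ART cycles reported for the year
    last_validated_year: Optional[int] = None  # None if never validated


# Hypothetical size cut points and sampling fractions (not CDC's values).
SIZE_CUTS = (100, 500)
SAMPLING_FRACTION = {"small": 0.05, "medium": 0.10, "large": 0.20}


def stratum_of(clinic):
    """Assign a clinic to a size stratum by total annual cycle count."""
    if clinic.annual_cycles < SIZE_CUTS[0]:
        return "small"
    if clinic.annual_cycles < SIZE_CUTS[1]:
        return "medium"
    return "large"


def select_for_validation(clinics, year, return_ids=frozenset(), lookback=3, seed=None):
    """Draw a stratified random sample of clinics for validation."""
    rng = random.Random(seed)
    # Assumption: clinics designated for return validation are included outright.
    selected = [c for c in clinics if c.clinic_id in return_ids]

    # Group the remaining clinics by stratum, skipping any clinic that was
    # validated within the lookback window.
    strata = {}
    for c in clinics:
        if c.clinic_id in return_ids:
            continue
        recently_validated = (
            c.last_validated_year is not None
            and year - c.last_validated_year <= lookback
        )
        if not recently_validated:
            strata.setdefault(stratum_of(c), []).append(c)

    # Sample a larger fraction from the strata that hold larger clinics.
    for name, members in strata.items():
        k = min(len(members), round(SAMPLING_FRACTION[name] * len(members)))
        selected.extend(rng.sample(members, k))
    return selected
```

Using a higher sampling fraction in the larger-size strata is one simple way to give larger clinics a greater chance of selection. The source does not state how the strata or selection probabilities are actually defined.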
Validation usually occurs from May to June. During the annual validation process, the validation team meets with selected clinics and reviews patient medical record data for a sample of the clinic's ART cycles. The information collected is then compared with the data submitted to NASS.
For each clinic, the fully validated sample includes up to 40 cycles resulting in pregnancy, up to 20 cycles not resulting in pregnancy, and up to 10 cycles using donor eggs or embryos. In addition, up to 10 fertility preservation banking cycles are selected from each clinic for partial validation. For each patient included in the validation sample, the total number of cycles reported is compared with the total number of cycles in the medical record. If unreported ART cycles are identified in selected medical records, up to 10 of these cycles are also selected for partial validation. An equal number of cycles are selected from all validated clinics, regardless of their size. This approach is needed to obtain sufficient sample sizes from clinics to calculate clinic-specific and overall discrepancy rates.
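The per-clinic sample limits above can be expressed as a small sketch. The category labels and the `draw_clinic_sample` function are hypothetical, and the sketch assumes each cycle belongs to exactly one category. The source does not say how cycles that fit more than one category are assigned.

```python
import random

# Per-clinic caps stated in the text; the category labels are hypothetical.
FULL_VALIDATION_CAPS = {"pregnancy": 40, "no_pregnancy": 20, "donor": 10}
PARTIAL_VALIDATION_CAPS = {"fertility_preservation_banking": 10}


def draw_clinic_sample(cycles_by_category, seed=None):
    """Randomly pick cycles for one clinic, up to each category's cap.

    cycles_by_category maps a category label to a list of cycle IDs.
    Categories with fewer cycles than the cap contribute all their cycles.
    """
    rng = random.Random(seed)
    sample = {}
    for caps in (FULL_VALIDATION_CAPS, PARTIAL_VALIDATION_CAPS):
        for category, cap in caps.items():
            pool = cycles_by_category.get(category, [])
            sample[category] = rng.sample(pool, min(cap, len(pool)))
    return sample
```

Under these caps, a clinic contributes at most 70 cycles for full validation and 10 banking cycles for partial validation, not counting any unreported cycles found in the medical records.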
The data validation process does not include any assessment of clinical practice or overall recordkeeping. Validation primarily helps ensure that clinics submit accurate data. It also serves to identify any systematic problems that could cause data collection to be inconsistent or incomplete.
Discrepancy rates for selected data fields
For validation of 2022 reporting year data, 35 of the 457 reporting clinics (about 8%) were randomly selected after taking into consideration the number of ART cycles performed at each clinic, some cycle and clinic characteristics, and whether the clinic had been selected before. To calculate discrepancy rates, 2,073 ART cycles from these 35 clinics were randomly selected for full validation, and 301 fertility preservation banking cycles were selected for partial validation. Discrepancy rates for the validated data fields are presented in the table below.
Data Field Name | Discrepancy Rateᵃ (95% Confidence Interval) |
Patient date of birth | 0.6% (0.1, 2.1) |
Cycle intention | 0.4% (0.1, 1.4) |
Cycle start date | 0.3% (0.0, 1.4) |
Date of egg retrieval | 0.1% (0.0, 0.4) |
Number of embryos transferred | 0.1% (0.0, 0.3) |
Outcome of ART treatment (pregnant or not pregnant) | 0.1% (0.0, 0.8) |
Pregnancy outcome (such as miscarriage, live birth, or stillbirth) | 0.2% (0.0, 0.7) |
Date of pregnancy outcome | 0.4% (0.2, 1.0) |
Number of infants born | 0.0% (0.0, 0.2) |
Cycle count | 0.2% (0.1, 0.7) |
Patient diagnosis—reason for ART | |
Tubal factor | 0.2% (0.1, 0.7) |
Ovulatory dysfunctionᵇ | 2.1% (0.7, 5.9) |
Diminished ovarian reserveᶜ | 1.3% (0.6, 2.7) |
Endometriosis | 0.5% (0.2, 1.2) |
Uterine factor | 0.4% (0.1, 1.4) |
Male factor | 0.5% (0.2, 1.1) |
Other factorsᵈ | 2.0% (0.9, 4.1) |
Unknown factorsᵉ | 1.3% (0.5, 3.3) |
ᵃ Discrepancy rates estimate the proportion of all ART cycles in which the reported value for a particular data field differs from the value recorded in the medical record. Discrepancy rate calculations weight the data from validated cycles to reflect the overall number of cycles performed at each clinic. Thus, findings from larger clinics were weighted more heavily than those from smaller clinics.
ᵇ The data field “Ovulatory Dysfunction” was overreported. For 60% of discrepancies, the diagnosis was reported by the clinic but was not found in the medical records.
ᶜ The data field “Diminished Ovarian Reserve” was underreported. For 84% of discrepancies, the diagnosis was found in the medical records but was not reported by the clinic.
ᵈ The data field “Other Factors” was underreported. For 61% of discrepancies, the diagnosis was found in the medical records but was not reported by the clinic.
ᵉ The data field “Unknown Factors” was underreported. For 74% of discrepancies, the diagnosis was found in the medical records but was not reported by the clinic.
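The weighting described in footnote a can be illustrated with a short sketch. It assumes that each validated cycle is weighted by the number of cycles it represents at its clinic (total cycles divided by validated cycles). The source does not give the exact weighting scheme or the confidence interval method, so the function and its inputs are hypothetical, and confidence intervals are not computed here.

```python
def weighted_discrepancy_rate(clinics):
    """Estimate a weighted discrepancy rate for one data field.

    clinics: iterable of (total_cycles, validated_cycles, discrepancies)
    tuples, one per validated clinic.
    """
    weighted_discrepancies = 0.0
    weighted_total = 0.0
    for total_cycles, validated, discrepancies in clinics:
        if validated == 0:
            continue  # no validated cycles at this clinic, nothing to weight
        # Assumed weight: cycles at the clinic represented by each validated cycle.
        weight = total_cycles / validated
        weighted_discrepancies += weight * discrepancies
        weighted_total += weight * validated
    if weighted_total == 0:
        raise ValueError("no validated cycles to compute a rate from")
    return weighted_discrepancies / weighted_total


# Made-up numbers: a large clinic with few discrepancies and a small clinic
# with many. The large clinic's findings dominate the weighted estimate.
example = [(2000, 70, 1), (100, 50, 5)]
rate = weighted_discrepancy_rate(example)
```

With these made-up inputs, the unweighted rate would be 6 discrepancies in 120 validated cycles (5%), while the weighted estimate is lower because the larger clinic, which had fewer discrepancies, counts for more.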
Please contact us with any questions or suggestions at ARTinfo@cdc.gov.