Produce Summary Statistics
Purpose
This module provides instruction on how to produce a formatted table showing the frequency distribution of race/ethnicity and gender by linkage eligibility and match status of NHANES participants.
Task: Produce Table of Race/Ethnicity and Gender by Linkage Eligibility and Match Status of NHANES Participants
To better understand the characteristics of NHANES participants who are linked to CMS data, you can produce descriptive statistics from the merged feasibility and NHANES files. Using example SAS code provided, this task will show you how to use the merged SAS dataset to assess different demographic characteristics (e.g., race/ethnicity and gender) and linkage eligibility and match status.
Once the feasibility files have been merged to the NHANES Demographic File, you can use the merged files to assess sample sizes by different demographic characteristics (e.g., race/ethnicity and gender) and Medicare match status. These results can be used to determine the maximum available sample size so that the feasibility of conducting analyses utilizing the linked CMS Medicare data can be assessed. This should be especially useful when you initiate an RDC proposal to analyze the restricted use NHANES-CMS linked Medicare files.
Example: Produce a frequency distribution of race/ethnicity and gender from the NHANES Demographic file data.
Information: Because this table is being used to evaluate the adequacy of the sample, sample weights were not used.
Download, open and run the SAS Medicare program. Note that this is the same program that was downloaded in Module 6. If you have already downloaded the program and altered the LIBNAME statements, you do not need to download it again.
Statements | Explanation |
---|---|
Note: These commands can be found at the top of the program under the heading “Course 2, Module 6, Task 1.” | |
options ls=120 ps=42 missing=’ ‘ nocenter validvarname=upcase compress=binary; | Log/List options. COMPRESS option reduces storage requirements for output datasets. |
libname nhanes ‘C:\NHANES’; | Provides location for reading and saving SAS datasets. |
Note: These commands can be found towards the end of the program under the heading “Course 2, Module 8, Task 1.” | |
proc format; value CMSMATCH 1 = ‘Linked’ 2 = ‘Not Linked’ 3 = ‘Linked-Child’ 9 = ‘Ineligible’; value GENDER 1 = ‘Male’ 2 = ‘Female’; value RACE_ONE 1 = ‘non-Hispanic white’ 2 = ‘non-Hispanic black’ 3 = ‘Mexican American/Other Hispanic’ 4 = ‘Other’; run; |
Creates formats, CMS_MEDICARE_MATCH, GENDER, and RACE_ONE, based on the values of the variables CMS_ MEDICARE_MATCH, RIAGENDR and RACETH, respectively. Race/ethnicity was recoded in Course 2, Module 7. |
title ‘Merged NHANES 2011-2012 Feasibility/Demographic Data’; | This TITLE statement specifies a title for the output. |
proc tabulate data= nhanes.merg1112_Medicare | The TABULATE procedure is used to produce a formatted table displaying counts. |
format=BEST10.0; | The default format for all cells is 10 spaces wide. |
class RACETH / order=unformatted missing; class RIAGENDR / order=unformatted missing; class CMS_MEDICARE_MATCH / order=formatted missing; |
The CLASS statement identifies RACETH, RIAGENDR and CMS_ MEDICARE_MATCH as class variables, having a small number of discrete and unique values. The MISSING option alerts SAS to include observations with missing values in the table. The ORDER=UNFORMATTED option specifies that the class values are ordered by their unformatted values, which yields the same order as PROC SORT. The ORDER=FORMATTED option specifies that the class values are ordered by the formatted representation of the value. |
table N=”Total” (RIAGENDR RACETH)*N=’ ‘ , /* Row Dimensions */ CMS_MEDICARE_MATCH=’ ‘ /* Column Dimension */ /ROW=FLOAT; |
The TABLE statement describes a two-dimensional table to be printed. The row dimension is composed of the total count, number of males and females, and number of race/ethnicity groups. The column dimension is composed of CMS match status types. |
format CMS_MEDICARE_MATCH cmsmatch. RIAGENDR gender. RACETH race_one.; | This FORMAT statement uses the user-written formats that have been previously defined in PROC FORMAT. |
title2 ‘Frequency Distribution of Race/ethnicity and Gender by Medicare Match Status’; | This TITLE statement specifies a title for the output. |
Check the results.
To check the results of your program, review the SAS log and output (.lst) reports.
Review the program output to examine contingency table cells.
Your output should resemble the table below. Note that the total count of NHANES-CMS Medicare eligible and linked participants is 1,183. This number is confirmed in the match rate table on the NCHS website, referenced in Module 9, that shows that the number of 2011-2012 NHANES participants linked to at least one year of the Medicare Denominator File is also 1,183.
Ineligible | Linked | Not Linked | |
Total |
2810 |
1183 |
5763 |
Gender | |||
Male |
1340 |
615 |
2901 |
Female |
1470 |
568 |
2862 |
Race/ethnicity (recode) | |||
non-Hispanic white |
677 |
558 |
1738 |
non-Hispanic black |
783 |
329 |
1571 |
Mexican American/Other Hispanic |
786 |
178 |
1467 |
Other |
564 |
118 |
987 |
Once the feasibility files have been merged to the NHANES Demographic File you can use the merged files to assess sample sizes by different demographic characteristics (e.g., race/ethnicity and gender) and Medicaid match status. These results can be used to determine the maximum available sample size so that the feasibility of conducting analyses utilizing the NHANES-CMS linked Medicaid data can be assessed. This should be especially useful when you initiate an RDC proposal to analyze the restricted use linked CMS Medicaid files.
Example: Produce a frequency distribution of race/ethnicity and gender from the NHANES Demographic File data.
Information: Because this table is being used to evaluate the adequacy of the sample, sample weights were not used.
Download, open and run the SAS Medicaid program. Note that this is the same program that was downloaded in Module 6. If you have already downloaded the program and altered the LIBNAME statements, you do not need to download it again.
Statements | Explanation |
---|---|
Note: These commands can be found at the top of the program under the heading “Course 2, Module 6, Task 1.” | |
options ls=120 ps=42 missing=’ ‘ nocenter validvarname=upcase compress=binary; | Log/List options. COMPRESS option reduces storage requirements for output datasets. |
libname nhanes ‘C:\NHANES’; | Provides location for reading and saving SAS datasets. |
Note: These commands can be found towards the end of the program under the heading “Course 2, Module 8, Task 1.” | |
proc format; value CMSMATCH 1 = ‘Linked’ 2 = ‘Not Linked’ 3 = ‘Linked-Child’ 9 = ‘Ineligible’; value GENDER 1 = ‘Male’ 2 = ‘Female’; value RACE_ONE 1 = ‘non-Hispanic white’ 2 = ‘non-Hispanic black’ 3 = ‘Mexican American/Other Hispanic’ 4 = ‘Other’; run; |
Creates formats, CMS_MEDICAID_MATCH, GENDER, and RACE_ONE, based on the values of the variables CMS_ MEDICAID_MATCH, RIAGENDR and RACETH (re-coded race/ethnicity variable), respectively. Race/ethnicity was recoded in Course 2, Module 7. |
title ‘Merged NHANES Feasibility/Demographic Data’; | This TITLE statement specifies the title of the output. |
proc tabulate data= nhanes.merg1112_Medicaid | The TABULATE procedure is used to produce a formatted table displaying total counts. |
format=BEST10.0; | The default format for all cells is 10 spaces wide. |
class RACETH / order=unformatted missing; class RIAGENDR / order=unformatted missing; class CMS_MEDICAID_MATCH / order=formatted missing; |
The CLASS statement identifies RACETH, RIAGENDR and CMS_ MEDICAID_MATCH as class variables, having a small number of discrete and unique values. The MISSING option alerts SAS to include observations with missing values in the table. The ORDER=UNFORMATTED option specifies that the class values are ordered by their unformatted values, which yields the same order as PROC SORT. The ORDER=FORMATTED option specifies that the class values are ordered by the formatted representation of the value. |
table N=”Total” (RIAGENDR RACETH)*N=’ ‘ , /* Row Dimensions */ CMS_MEDICAID_MATCH=’ ‘ /* Column Dimension */ /ROW=FLOAT; |
The TABLE statement describes a two-dimensional table to be printed. The row dimension is composed of the total count, number of males and females, and number of race/ethnicity groups. The column dimension is composed of Medicaid match status. |
format CMS_ MEDICAID_MATCH cmsmatch. RIAGENDR gender. RACETH race_one.; | This FORMAT statement uses the user-written formats that have been previously defined in PROC FORMAT. |
title2 ”Frequency Distribution of Race/ethnicity and Gender by Medicaid Match Status”; | This TITLE statement specifies the title of the output. |
Check the results.
To check the results of your program, review the SAS log and output (.lst) reports.
Review the program output to examine contingency table cells
Your output should resemble the table below. Note that the total count of NHANES-CMS Medicaid eligible and linked participants is 2,789, which is the sum of “linked” and “linked child” (1,480 + 1,309). This number is confirmed in the match rate table on the NCHS website, referenced in Module 9, that shows that the number of 2011-2012 NHANES participants linked to at least one year of the Medicaid Person Summary File is also 2,789.
Ineligible | Linked | Linked-Child | Not Linked | |
Total |
3938 |
1480 |
1309 |
3029 |
Gender | ||||
Male |
1884 |
630 |
671 |
1671 |
Female |
2054 |
850 |
638 |
1358 |
Race/ethnicity (recode) | ||||
non-Hispanic white |
865 |
468 |
283 |
1357 |
non-Hispanic black |
1007 |
540 |
455 |
681 |
Mexican American/Other Hispanic |
1284 |
279 |
402 |
466 |
Other |
782 |
193 |
169 |
525 |