Diversifying the Data Science Workforce: The DataMosaic Program

At a glance

DataMosaic aims to build a diverse public health workforce that applies diversity, equity, inclusion, and accessibility (DEIA) and health equity to analytics, data, and innovation.

Infographic of the DataMosaic fellows.

Description

The Division of Global Migration and Quarantine (DGMQ) Office of Innovation, Development, Evaluation, and Analytics (IDEA) intentionally established a diverse team of fellows to develop DataMosaic, a CARES Act-funded proof-of-concept program that seeks to connect and harness population movement, epidemiological, and genomic data to increase public health response efficiency. In addition to IDEA's core analytics team, DataMosaic includes 11 fellows from different programs, including the Oak Ridge Institute for Science and Education (ORISE), the Public Health Analytics and Modeling (PHAM) Program, and the Public Health Informatics Fellowship Program (PHIFP), as well as health communications fellows. The fellows receive intermediate and advanced data science upskilling to help fill critical needs within the agency. IDEA aims to refine health equity science by applying the lived experiences of this diverse team of fellows to data science approaches and methodologies. The DataMosaic team created an analytics platform that combines data from disparate sources to enhance analyses.

In the spring of 2021, IDEA launched the DataMosaic Program with three goals in mind:

  1. Establishing a diverse team of fellows who represent the communities CDC serves.
  2. Increasing access to cutting-edge data science and analysis skills.
  3. Leveraging the lived experiences of a diverse team to create real-world solutions to real-world problems.

Through the DataMosaic Program, members of the public health workforce who have historically been underrepresented in the data science field gained access to cutting-edge data science skills. Fellows were presented with several professional development modules on how to create a CV, build a federal resume, develop a BLUF ("Bottom Line Up Front") statement, select a peer-reviewed journal, and code their own professional website. IDEA also organized a data science career panel of CDC employees, fellows from other programs, contractors, consultants, and industry workers so that DataMosaic fellows could learn about different ways to apply their skills after their fellowship.

All fellows received routine one-on-one mentoring as well as project mentoring. Fellows were able to propose their own projects, recruit collaborators within and beyond the IDEA team, assist other groups within and beyond DGMQ, participate in emergency response deployments, and join ongoing program evaluation projects. The fellows also created and led their own study group, surveying topics including machine learning, infectious disease modeling, and health equity. First-year fellows held a showcase for DGMQ in June 2022 to present their work from the past year.

Early Impact

DataMosaic fellows collaborated within and beyond DGMQ to provide innovative and advanced analytic solutions to public health problems. Currently, fellows are involved in more than 20 active projects. Several manuscripts are in development for publication, including one for a project that uses supervised machine learning to identify areas with low COVID-19 vaccine uptake.

DataMosaic is helping to build a robust and diverse public health workforce. Fellows have taken the skills they learned through the DataMosaic Program and gone on to excel in public and private sector roles as data scientists, analysts, and engineers, including contracting positions (both at CDC and beyond), industry positions, and full-time positions at CDC. Additionally, some fellows have transitioned into graduate school or other CDC fellowships, continuing to advance their professional skills and training.