2021 Project: Rice University

Harvest Variants: enhancing tools for integrated, collaborative variant tracking of SARS-CoV-2

What to know

Rice University enhanced current genomic sequencing tools with SARS-CoV-2 tracking and analysis software. Awarded in 2021, this project designed and developed genomic sequencing software to integrate SARS-CoV-2 genomic data sets. The software integration enabled users to download real-time virus data and variant analyses. Users also had access to support and feedback tools.

New and improved sequencing software tools

This project:

  • Extended the Harvest variant software suite to include SARS-CoV-2-specific tools.
  • Released NCBI SRA Run Suggester, a Java tool for suggesting which SARS-CoV-2 NCBI SRA runs should be processed next based on temporal and geographic coverage.
  • Created a Java library, NCBI SRA Full Xml Tools Java, for converting the NCBI SRA data into the "Full XML" datafile format.
  • Created Harvest Variants, an end-to-end pipeline for downloading data, identifying virus sequence differences in a single host, and mapping, aligning, and labeling sequence data from NCBI.
  • Created a website to host source files for HarvestVariants.info.
  • Developed Variabel, an integrated framework that could accurately identify low-frequency intra-host genomic variants using rapid and low-cost Oxford Nanopore Technologies sequencing.1
  • Developed a web interface for browsing biocurations and annotation of mutations within a host organism.
  • Developed the software tool QualD for earlier and more sensitive detection of new SARS-CoV-2 variants of concern in wastewater samples.2
  • Applied the computational methods developed for detecting viral variants to determine that influenza virus diversity does not differ between samples obtained from the nose and mouth and samples obtained from aerosols.3