Research Highlights

New guidelines for identifying novel viruses

Published online 20 December 2018

An international consortium has issued guidelines and best practices for defining virus data quality.

Islam Elkholi

Scientists from more than fifty institutions, including Cairo University in Egypt, assembled consensus guidelines for reporting and analysing the genome sequences of previously unidentified viruses.

Viruses are the most abundant biological entities on the planet. They not only cause a wide range of human diseases but also help maintain ecosystems.

Although thousands of viruses have had their genomes identified and have been grown in laboratories, many more have not. In fact, the genomes of more than 750,000 uncultivated viruses have been identified in the past two years, thanks to advances in genome sequencing technologies.

“Because we are seeing an incredibly large number of virus genomes in sequencing data from all types of samples, we believe the community needed a set of standards to analyse these data,” says the study’s lead author, Simon Roux from the US Department of Energy Joint Genome Institute. 

Cultured viruses, which have been grown in the lab, already have their own data quality standards, but these cannot be directly applied to uncultured viruses, whose sequences are often incomplete and for which some properties can only be predicted indirectly using computational approaches.

In their guidelines, the team outlined the minimum information needed to report an uncultivated virus genome, including the source of the virus, the methods used to identify its genome, and the quality of the data.

They propose three categories of genome quality: a genome fragment, which is less than 90% complete or has no estimated genome size, and is minimally annotated; a high-quality draft genome, which represents 90% or more of the complete expected genome sequence; and a finished genome.
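
As a rough illustration of how the three categories might be applied in practice, the following minimal Python sketch sorts a genome by its estimated completeness. The function name, its parameters and the `is_finished` flag are hypothetical and do not come from the guidelines. The article gives no criteria for a finished genome, so the sketch assumes that the caller has already made that judgement.

```python
from typing import Optional


def classify_genome_quality(completeness: Optional[float],
                            is_finished: bool = False) -> str:
    """Assign one of the three quality categories described in the article.

    completeness: estimated percentage (0-100) of the expected genome that
        the sequence covers, or None when no genome size can be estimated.
    is_finished: hypothetical flag. The criteria for a finished genome are
        not spelled out in the article, so the caller decides this.
    """
    if is_finished:
        return "finished genome"
    if completeness is None:
        # No estimated genome size: treated as a genome fragment.
        return "genome fragment"
    if not 0 <= completeness <= 100:
        raise ValueError("completeness must be a percentage between 0 and 100")
    if completeness >= 90:
        # 90% or more of the expected genome sequence.
        return "high-quality draft genome"
    return "genome fragment"
```

Under these assumptions, a sequence estimated to be 95% complete would be labelled a high-quality draft genome, while one with no size estimate would be labelled a genome fragment.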

This categorization can guide multiple downstream applications, such as drawing evolutionary trees of novel viruses and predicting their possible host interactions.


Roux, S. et al. Minimum information about an uncultivated virus genome (MIUViG). Nat. Biotechnol. (2018).