CheckM Analysis

NCBI uses CheckM to assess the completeness and contamination of prokaryotic genome assemblies that are annotated by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) and selected for RefSeq. CheckM is run on the proteins produced by PGAP, using the set of CheckM markers for the species assigned to the genome. If no marker set is available for the species, the closest available set in the lineage is used (e.g., the set for the family).
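The marker-set fallback and the two quality measures can be illustrated with a minimal sketch. It is not the NCBI or CheckM implementation: the marker identifiers, the `MARKER_SETS` table, and both function names are hypothetical, and the scoring is a simplified per-marker count, whereas CheckM itself groups markers into collocated sets before scoring.

```python
from collections import Counter

# Hypothetical marker sets keyed by taxon name (illustration only;
# real CheckM marker sets contain many more markers).
MARKER_SETS = {
    "Escherichia coli": {"M1", "M2", "M3", "M4"},
    "Enterobacteriaceae": {"M1", "M2", "M3"},
}


def select_marker_set(lineage, marker_sets):
    """Return (taxon, markers) for the most specific taxon that has a set.

    `lineage` is ordered from most specific (species) to most general.
    """
    for taxon in lineage:
        if taxon in marker_sets:
            return taxon, marker_sets[taxon]
    raise LookupError("no marker set available for any taxon in the lineage")


def estimate_quality(marker_hits, markers):
    """Simplified completeness and contamination, both in percent.

    Assumption: completeness is the share of markers found at least once,
    and contamination is the number of extra copies relative to the
    number of markers.
    """
    if not markers:
        raise ValueError("marker set is empty")
    counts = Counter(hit for hit in marker_hits if hit in markers)
    present = sum(1 for m in markers if counts[m] >= 1)
    extra = sum(counts[m] - 1 for m in markers if counts[m] > 1)
    total = len(markers)
    return 100.0 * present / total, 100.0 * extra / total


# No species-level set exists here, so the family-level set is used.
lineage = ["Salmonella enterica", "Salmonella", "Enterobacteriaceae"]
taxon, markers = select_marker_set(lineage, MARKER_SETS)
completeness, contamination = estimate_quality(["M1", "M1", "M2"], markers)
```

In this toy case two of the three family-level markers are found and one marker appears twice, so both the missing marker and the duplicate copy affect the scores.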

Each individual genome web page reports the position of the genome in the completeness distribution of all genomes for the same species (see the percentile value). In addition, if 15 or more genomes are available for the species, this position is shown graphically.
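As a rough illustration of the percentile value, the sketch below ranks one genome against the completeness values of its species. The exact percentile definition NCBI uses is not stated here, so the "at or below" convention is an assumption, and the function names are hypothetical. Only the threshold of 15 genomes comes from the text above.

```python
MIN_GENOMES_FOR_GRAPH = 15  # threshold stated in the text


def completeness_percentile(value, species_values):
    """Percent of genomes in the species with completeness <= value.

    Assumption: an "at or below" percentile; other conventions exist.
    """
    if not species_values:
        raise ValueError("no genomes available for the species")
    at_or_below = sum(1 for v in species_values if v <= value)
    return 100.0 * at_or_below / len(species_values)


def has_graphical_display(species_values):
    """True when enough genomes exist to show the position graphically."""
    return len(species_values) >= MIN_GENOMES_FOR_GRAPH


# Hypothetical completeness values (percent) for four genomes of one species.
species = [90.0, 95.0, 98.0, 99.5]
percentile = completeness_percentile(98.0, species)
graph = has_graphical_display(species)
```

With only four genomes in this example, a percentile can still be computed, but the species falls below the threshold for the graphical display.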

For more information on CheckM, see Parks et al. (2015).

Generated November 25, 2024