In 2001, the first clone-library-based environmental surveys of eukaryotes were published, using the 18S rRNA gene as a barcode. Since then, many studies have used this approach to describe protistan communities in a wide range of environments. The emergence of high-throughput sequencing (HTS) techniques has made the approach even easier to apply. Consequently, the amount of data retrieved has increased dramatically, and our knowledge of protistan diversity continues to grow. HTS approaches have pitfalls, however: they require using, and trusting, reference databases to annotate the data. These databases sometimes contain curation errors and other mistakes that can distort our overall view of protistan diversity within and across ecosystems.
The 18S rRNA Collaborative Annotation Initiative is a community-wide effort that addresses these challenges by bringing together people with expertise in diverse eukaryotic lineages to curate 18S rDNA data using phylogenetic methods. Our goal is to assemble a curated reference database spanning the eukaryotic tree of life. It will be a community resource consisting of curated sequences, a flexible taxonomy, phylogenetic trees, and their underlying sequence alignments. The database will increase the power of HTS-based studies to uncover fundamental patterns in microbial ecology and diversity. Along the way, individual curators are likely to identify novel eukaryotic clades and gain new insight into the environmental distribution of eukaryotic microbes.
- Pipeline description
- Guidelines for taxonomy annotations
- Guidelines for constructing the final database
- Annotation of short reads