The PR2 database was initiated in 2010 within the framework of the BioMarks project, building on work carried out over the previous ten years in the Plankton Group of the Station Biologique of Roscoff. Its aim is to provide a reference database of carefully annotated 18S rRNA sequences using eight unique taxonomic fields (from kingdom to species). At present it contains about 184,000 sequences. A number of metadata fields are available for many sequences, including geolocation, whether the sequence originates from a culture or a natural sample, host type, etc. The annotation of PR2 is performed by experts for each taxonomic group. One very important project in this respect is EukRef, which has recently decided to merge its efforts with PR2. EukRef has built bioinformatics pipelines that have been used during three workshops dedicated to specific taxonomic groups. As an example, part of the ciliate annotation originates from the first EukRef workshop.
Guillou, L., Bachar, D., Audic, S., Bass, D., Berney, C., Bittner, L., Boutte, C. et al. 2013. The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy. Nucleic Acids Res. 41:D597–604.
del Campo J., Kolisko M., Boscaro V., Santoferrara LF., Nenarokov S., Massana R., Guillou L., Simpson A., Berney C., de Vargas C., Brown MW., Keeling PJ., Wegener Parfrey L. 2018. EukRef: Phylogenetic curation of ribosomal RNA to enhance understanding of eukaryotic diversity and distribution. PLOS Biology 16:e2005849. DOI: 10.1371/journal.pbio.2005849.
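To make the eight-field taxonomy described above more concrete, here is a minimal Python sketch that splits a delimited lineage string into one value per rank. It is an illustration only, not the PR2 tooling. The intermediate rank names, the semicolon separator, the function name `parse_taxonomy` and the example lineage are all assumptions made for this sketch. The text above only states that there are eight fields running from kingdom to species.

```python
from typing import Dict

# Assumed rank names, for illustration only: the description above states
# that there are eight fields from kingdom to species, nothing more.
RANKS = (
    "kingdom", "supergroup", "division", "class",
    "order", "family", "genus", "species",
)


def parse_taxonomy(lineage: str, sep: str = ";") -> Dict[str, str]:
    """Split a delimited lineage string into a rank -> name mapping.

    Leading and trailing separators are ignored. A ValueError is raised
    when the number of fields differs from the number of ranks.
    """
    names = [name.strip() for name in lineage.strip().strip(sep).split(sep)]
    if len(names) != len(RANKS):
        raise ValueError(f"expected {len(RANKS)} fields, got {len(names)}")
    return dict(zip(RANKS, names))


# Made-up placeholder lineage, not an actual database entry.
example = "Eukaryota;Supergroup_X;Division_X;Class_X;Order_X;Family_X;Genus_X;Genus_X_sp."
record = parse_taxonomy(example)
print(record["kingdom"], record["species"])
```

Requiring exactly eight fields means an incomplete lineage raises an error instead of being silently shifted into the wrong ranks.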
Report on GitHub
The PR2 database is provided in 3 different formats:
Flat files for use with mothur, QIIME2, dada2, BLAST, etc.
R package pr2database
Use the following code under R
We are also developing a PR2 primer database focusing on the rRNA operon.
You can contribute to the PR2 database in different ways.