How to contribute annotations to the PR2 reference database
- Contact one member of the Core Team
- Explain which group you want to annotate. It can be a genus, a class or any other taxonomic level.
- We will send you an Excel file and a FASTA file containing all existing PR2 sequences for your group of expertise.
- Alternatively, you can download the data from the web interface.
- Follow the instructions below to update or add data.
- Send back the updated Excel file.
- Your contribution will be added to the next release of PR2 (we publish 2 to 3 releases per year).
- You will be acknowledged as a contributor on the PR2 website.
Two files will be provided to you
- An Excel file with 2 sheets (taxonomy, sequences)
- A FASTA file with the current taxonomy
Please highlight all your changes in the Excel file in yellow.
Excel - Taxonomy - do not edit
- This sheet provides a summary of the current taxonomy of the group with the number of sequences for each species (n).
- Please do not edit this sheet directly; it is only for your information.
Excel - Sequences - edit only this sheet
- Each sequence has a unique identifier (pr2_accession) which is based on the GenBank accession (genbank_accession).
- For each sequence, the full taxonomic path is provided along with metadata (see here for a full description of the fields).
Modifying or adding entries
Only change entries in the Sequences sheet.
- You can
- modify the taxonomy of a given entry
- add new metadata. If your metadata do not fit the existing columns, just add more columns and we will see how to incorporate them.
- You can change the ranks (supergroup to genus) if necessary but you must make sure that:
- you follow exactly the PR2 conventions which are detailed here (see the second paragraph). In particular, any taxonomic name can only appear in a single column (taxonomic level). Use the _X convention to distinguish different levels with the same name.
- you are consistent for all sequences belonging to the same taxon.
- You can also add new species as needed.
- Please see the figure above for some examples of changes
- 1 - These entries are unchanged
- 2 - These entries have been reassigned to a new species
- 3 - These are new entries. Provide the following information
- genbank_accession. We will download the sequence and all GenBank metadata, so you do not need to do it.
- taxonomic assignment. If the species is already present in the database, you can just provide the species name.
- If the sequence is not limited to the 18S but also contains the ITS, please provide the start and end coordinates of the 18S rRNA gene on the sequence.
- 4 - You can also indicate whether the new sequence is a reference sequence. Reference sequences are of high quality, preferably full length, and representative of a given taxon.
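Before sending the file back, it can help to check the two taxonomy rules above (a name appears in only one rank column, and all sequences of the same taxon share the same path). The following is a minimal sketch, not an official PR2 tool. It assumes the Sequences sheet has been exported to a list of dictionaries, one per row; the rank list in `RANKS`, the function names, and the example taxon names are all hypothetical and should be adapted to the columns in your file.

```python
from collections import defaultdict

# Assumed rank columns, for illustration only; adjust to the columns in your sheet.
RANKS = ["supergroup", "division", "class", "order", "family", "genus", "species"]


def names_in_multiple_ranks(rows, ranks=RANKS):
    """Return {name: sorted list of ranks} for names used in more than one rank column."""
    seen = defaultdict(set)
    for row in rows:
        for rank in ranks:
            name = row.get(rank)
            if name:
                seen[name].add(rank)
    return {name: sorted(used) for name, used in seen.items() if len(used) > 1}


def inconsistent_paths(rows, ranks=RANKS):
    """Return {species: set of higher-rank paths} for species with more than one path."""
    paths = defaultdict(set)
    for row in rows:
        species = row.get("species")
        if species:
            higher = tuple(row.get(r) for r in ranks if r != "species")
            paths[species].add(higher)
    return {sp: found for sp, found in paths.items() if len(found) > 1}


# Hypothetical rows: the first reuses "Examplea" at two ranks,
# the second follows the _X convention for the lower rank.
rows = [
    {"pr2_accession": "seq1", "class": "Examplea", "order": "Examplea",
     "genus": "Genusa", "species": "Genusa_sp1"},
    {"pr2_accession": "seq2", "class": "Examplea", "order": "Examplea_X",
     "genus": "Genusa", "species": "Genusa_sp1"},
]

print(names_in_multiple_ranks(rows))
print(inconsistent_paths(rows))
```

Any name reported by `names_in_multiple_ranks` needs the _X convention at one of its ranks, and any species reported by `inconsistent_paths` has rows whose higher ranks disagree and should be harmonized.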