We have continued our effort to standardize the processing of metagenomics data and have extended the quality control procedures for raw Illumina sequences to cover the steps involved in metagenome assembly and gene prediction. Furthermore, we have agreed on methods for generating sets of non-redundant genes, an issue of increasing importance given the amount of data produced by large-scale projects, and have published a tool that implements these methods. We have investigated strategies for the functional assessment of metagenomes and outline why orthology analysis is required to make such assessment efficient. Consequently, we have developed a benchmark for optimizing and assuring the quality of orthologous group construction, and have used it to evaluate different resources as well as to create a new and improved version of the eggNOG orthologous group resource, which is particularly well suited to metagenome analysis because of its comprehensive prokaryotic coverage. We have also investigated and outlined a strategy for using orthologous group assignment to transfer functional annotations to novel bacterial genomes or to metagenome reference gene catalogs, and thereby to produce functional potential profiles for metagenomic samples. These approaches are implemented in the upcoming version 4.1 of eggNOG and can therefore be used directly by the community to conduct the analyses outlined here. This SOP aims to recommend a methodology for function assignment based on orthologous groups and to establish guiding principles for assessing the quality of those groups.
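
To make the idea of a functional potential profile concrete, the following minimal Python sketch aggregates per-gene abundances in one sample into per-orthologous-group abundances. It is an illustration only, not the eggNOG implementation: the function name `functional_profile`, the identifiers `OG_1` and `OG_2`, the gene names, the abundance values, and the `unassigned` bucket are all hypothetical, and the sketch assumes that each gene in the reference catalog maps to at most one orthologous group.

```python
from collections import defaultdict


def functional_profile(gene_abundance, gene_to_og):
    """Sum per-gene abundances into per-orthologous-group abundances.

    gene_abundance: dict mapping gene id -> abundance in one sample
    gene_to_og:     dict mapping gene id -> orthologous group id
    Genes without an assignment are collected under "unassigned"
    (an assumption made for this sketch).
    """
    profile = defaultdict(float)
    for gene, abundance in gene_abundance.items():
        og = gene_to_og.get(gene, "unassigned")
        profile[og] += abundance
    return dict(profile)


# Hypothetical toy data: three catalog genes assigned to two groups,
# plus one gene (gene_d) that has no orthologous group assignment.
gene_to_og = {"gene_a": "OG_1", "gene_b": "OG_1", "gene_c": "OG_2"}
sample = {"gene_a": 2.0, "gene_b": 1.0, "gene_c": 4.0, "gene_d": 0.5}

profile = functional_profile(sample, gene_to_og)
print(profile)
```

In this toy case the two genes assigned to `OG_1` are pooled, so the profile describes the sample by function rather than by individual gene. Keeping unassigned genes in a separate bucket makes it visible how much of a sample the orthologous groups fail to cover.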

