MODULAR TAXONOMIC AND FUNCTIONAL ANNOTATION TRANSFER PROTOCOL FOR METAGENOMES
Protocol, functional analysis, taxonomic analysis, metagenomics, annotation transfer, read classification.
Metagenomics is the study of the microbial community found in a sample extracted from a given environment. This environment may be the wall of a cave, a volume of ocean water, the human gut, or any other source containing microorganisms of interest. The analysis of a microbial community reveals details about the taxonomic composition of the sample and the functions performed by the community. A comprehensive metagenomic analysis requires several steps, such as data preprocessing, alignment against a reference database, taxonomic analysis, and functional analysis. Although several tools are available for each of these steps, selecting an efficient set of tools to perform the analysis remains challenging. This study proposes a modular protocol for the taxonomic and functional analysis of metagenomes. A Python script was implemented to perform the functional analysis, and the third-party tools best suited to each of the other steps were selected. Two datasets were simulated to assess the taxonomic and functional results separately, and both analyses were also applied to a real metagenome. The same datasets were analyzed using the MEGAN 6 software, and the results obtained by the two methodologies were compared. Metrics computed from the results indicate that the proposed protocol is better suited than MEGAN 6 to functional analysis using Gene Ontology identifiers and that it correctly identified more species and genera.