2A AND 2A-LIKE SEQUENCES IN VIRUSES WITH RNA GENOME: OCCURRENCES AND EVOLUTIONARY IMPLICATIONS
2A/2A-like sequences, Pseudo 2A-like, 2A-like ranking, Totiviridae family, Expanded Giardiavirus, IMNV-like, GLV-like.
2A/2A-like sequences are oligopeptides of approximately 18-22 amino acids that can mediate a co-translational "cleavage" of polyproteins in eukaryotic cells. These peptides are characterized by a C-terminal motif of nine amino acids, (G/H)D(V/I)EXNPGP, where X is any amino acid. They are found in many viruses with positive-sense single-stranded RNA (pssRNA) and double-stranded RNA (dsRNA) genomes that infect a variety of hosts, ranging from protozoan to vertebrate species. Because of their cleavage capacity, 2A/2A-like sequences have been used in many heterologous co-expression systems. In view of their importance, the present work aimed to investigate and record the presence of these sequences in different virus species, and to assess their evolutionary importance in the Totiviridae family. The first chapter of this work reviews the occurrence of 2A/2A-like sequences in different viral genomes by aligning these sequences against the NCBI database with the BLASTp tool. This approach also enabled the identification of 69 new reports of viral sequences that contain 2A-likes, of which 62 are from pssRNA viruses, 6 from dsRNA viruses and one from a virus with a negative-sense single-stranded RNA (nssRNA) genome. This is the first report of this type of sequence in an nssRNA virus. The second chapter presents the phylogenetic relationships among the Totiviridae family viruses, obtained by Bayesian inference. On this basis, the creation of the expanded Giardiavirus group is suggested. This group comprises the viruses previously grouped in Giardiavirus and Artivirus, together with eight new viruses; its members were divided into IMNV-like and GLV-like groups and had their ORF1 characterized using bioinformatics tools. During this characterization, sequences similar to 2A-likes were found in some viruses of the GLV-like group, and these were called pseudo 2A-likes.
To assess whether these sequences could become functional 2A-likes, the 2A-like ranking software was developed in this study. The analyses carried out with this program showed that the pseudo 2A-likes of arthropod-infecting viruses from the GLV-like group have an approximately 50 to 76% chance of becoming functional 2A-likes. The presence of pseudo 2A-likes in other dsRNA viruses was also verified using the NCBI database, in which 96 sequences containing pseudo 2A-likes were found. The results presented in this study reinforce the diversity of hosts and genome structures of the Totiviridae family viruses, which can be explained by the emergence of functional 2A-like sequences from ancestral sequences described here as pseudo 2A-like. Thus, pseudo 2A-likes can be a key point in viral evolution, acting as a way to increase or maintain the complexity of the genome.
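As an illustration of the C-terminal motif (G/H)D(V/I)EXNPGP described above, the following minimal Python sketch scans a protein sequence for exact matches to that pattern. It is not the 2A-like ranking software developed in this study and does not estimate any probability of function; the function name `find_2a_motifs` and the example peptide are hypothetical and chosen only for illustration.

```python
import re

# The 20 standard amino acids, used for the variable position X.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Nine-residue motif from the abstract: (G/H) D (V/I) E X N P G P.
MOTIF_2A = re.compile(rf"[GH]D[VI]E[{AMINO_ACIDS}]NPGP")


def find_2a_motifs(sequence):
    """Return (0-based start, matched residues) for each exact motif match."""
    sequence = sequence.upper()
    return [(m.start(), m.group()) for m in MOTIF_2A.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical toy peptide, not taken from any real viral genome.
    toy_peptide = "MAAAGDVESNPGPKK"
    for start, motif in find_2a_motifs(toy_peptide):
        print(f"motif {motif} at position {start}")
```

An exact pattern match of this kind only flags candidate 2A/2A-like sequences. Sequences that deviate from the motif, such as the pseudo 2A-likes described here, would not be reported and require a separate similarity-based approach.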