QUALIFYING EXAM committee: KARLA CRISTINA TABOSA MACHADO

A MASTER'S QUALIFYING EXAM committee has been registered by the program.
STUDENT: KARLA CRISTINA TABOSA MACHADO
DATE: 11/06/2018
TIME: 09:00
LOCATION: BioME classroom
TITLE:

Development of computational approaches for prokaryote proteogenomics


KEYWORDS:

Proteomics, proteogenomics, mass spectrometry, prokaryotes, databases


PAGES: 60
MAJOR FIELD: Exact and Earth Sciences
FIELD: Computer Science
ABSTRACT:

The development of next-generation sequencers caused a revolution in genomic research, and the complete genomic information of thousands of bacterial strains is now available. Over the last decade, similar technological breakthroughs in sensitivity and throughput have occurred in protein analysis by mass spectrometry (MS). Proteomics has yet to reach the same level of throughput as genomics, but for samples from simple organisms such as yeasts or bacteria it can detect and quantify the proteome close to completeness. Challenges remain in the characterization of coding regions in a genome, as well as in the validation of genomic models. Scientific reports show that genomic annotation performed on the same genomic data using independent approaches yields divergent results regarding the number of predicted ORFs and their length (i.e. different choices for transcription/translation initiation). Peptide sequences characterized in proteomics samples can be used to validate genomic regions as coding, a research field known as proteogenomics. This requires the design of customized sequence databases that allow the identification of new genomic regions previously predicted to be non-coding and therefore absent from routinely employed databases. In this work, customized databases were created by processing and analyzing the complete genomic information from several strains of the same bacterial species, aiming to characterize proteomic datasets from samples of the same species but of unknown genomic background. The approach compares homologous and uniquely annotated proteins across all strains and reports those sequences in a non-redundant manner. Databases for 10 strains containing at least 65 complete genomes were created, and original MS data collected from clinical strains of Mycobacterium tuberculosis were used as a performance control of the approach.
The results show that the database created by this approach identified 98.7% of the expected peptides, i.e. those identified using a merged database of all strains, in only a fraction of the computational time.
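
The idea of reporting sequences from several strains in a non-redundant manner can be illustrated with a minimal sketch. It is not the method described in the abstract: it only collapses exactly identical protein sequences, whereas the work compares homologous proteins, and the function names, the `strain|header` labelling, and the toy sequences are all assumptions made for illustration.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_nonredundant(strain_fastas):
    """Collapse identical protein sequences across strains.

    strain_fastas maps a (hypothetical) strain name to its FASTA text.
    Returns a dict mapping each distinct sequence to the list of
    "strain|header" labels in which it was annotated.
    """
    merged = {}
    for strain, text in strain_fastas.items():
        for header, seq in parse_fasta(text):
            merged.setdefault(seq, []).append(f"{strain}|{header}")
    return merged


def to_fasta(merged):
    """Write one FASTA entry per distinct sequence, listing all sources."""
    lines = []
    for seq, sources in merged.items():
        lines.append(">" + ";".join(sources))
        lines.append(seq)
    return "\n".join(lines)


# Toy input: two invented strains sharing one identical protein.
strains = {
    "strainA": ">p1\nMKT\n>p2\nMAA\n",
    "strainB": ">q1\nMKT\n>q2\nMLL\n",
}
merged = build_nonredundant(strains)
print(to_fasta(merged))
```

In this toy input the four annotated proteins reduce to three database entries, because the sequence shared by both strains is stored once with both sources in its header. A real pipeline would additionally need a homology step to handle near-identical proteins and differing start-site choices.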


COMMITTEE MEMBERS:
Chair - 2267860 - GUSTAVO ANTONIO DE SOUZA
Internal member - 1513597 - JOAO PAULO MATOS SANTOS LIMA
Notice registered on: 28/05/2018 14:46