QUALIFYING EXAM Committee: LUCAS MARQUES DA CUNHA

A DOCTORAL QUALIFYING EXAM committee has been registered by the program.
STUDENT: LUCAS MARQUES DA CUNHA
DATE: 23/07/2018
TIME: 09:00
LOCATION: BioME
TITLE:

DEVELOPMENT OF COMPUTING APPROACHES FOR ANALYSIS AND IDENTIFICATION OF POLYMORPHIC PEPTIDE


KEYWORDS:

Polymorphism. Variant peptides. Custom database. Proteomics.


PAGES: 57
MAJOR AREA: Biological Sciences
AREA: General Biology
ABSTRACT:

The proteomic approach allows large-scale studies of protein expression in different tissues and body fluids, aiming to identify and quantify the total protein content. Despite major advances in the area, protein identification remains a limiting step in proteomic analysis. Typically, a mass spectrometer is used to generate mass-to-charge (m/z) values for the samples, and a reference protein database (e.g., UniProt) is then searched to identify the proteins. However, a reference database limits protein identification because it does not contain the DNA variations that can alter the amino acid sequence, which leads to incorrect identifications or makes identification impossible. In this context, several custom databases incorporate such genetic variations. Although they give good results, they are limited by a considerable increase in the search space, which becomes another obstacle in the identification process. Thus, this research proposes the implementation of a database of polymorphic peptides that combines information contained in dbSNP and NCBI. A hypothetical sequence is then generated containing the mutated peptides in the protein, taking their allelic frequency into account. This process is complemented by an analysis of the peptides identified after the samples are submitted to the identification software. In parallel, a search is performed on the reference database and on the mutated-peptide database, which reduces the search space and generates two outputs. The uniqueness of the database peptides is then checked and, if there is redundancy, the one with the best score is selected. The peptides identified using the mutated database are also classified according to mutation type, allelic frequency and pathogenicity.
For the classification of the peptides, a machine learning approach was also developed, distinguishing them according to the non-mutated, SNP, INDEL and nonsense classes. For the tests, three datasets were used as input: HapMap and samples of ovarian and colon cancer. As a result, 3,013 new peptides were identified using the polymorphic database, of which 82% were SAPs, 13% were INDELs, 5% were frameshifts and less than 1% corresponded to lost stop and UTR variation. Among the mutations, some were related to nonsyndromic deafness, hypomyelination with brainstem and spinal cord involvement and leg spasticity, Gaucher's disease and breast cancer. For the ovarian cancer samples, 7,514 new peptides were identified, of which 72.9% were SAPs, 21.8% frameshifts, 2.6% INDELs and less than 1% lost stop and UTR variation. These mutations are also related to inflammatory bowel disease and focal segmental glomerulosclerosis. For the colon samples, 3,965 new peptides were identified, of which 75.4% were SAPs, 20.4% frameshifts, 3.3% INDELs and less than 1% lost stop and UTR variation. These mutations are also associated with amyotrophic lateral sclerosis and acute fatty liver of pregnancy. Using the random forest algorithm for classification, we obtained an accuracy above 89.7%. Therefore, our approach appears very promising with respect to the established objective and applicable to analyses with new samples.
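
The variant-peptide and redundancy steps described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), the minimum peptide length and the (peptide, source, score) hit format are all assumptions made for illustration, and allelic frequency is not modelled.

```python
import re


def tryptic_digest(protein, min_len=6):
    """Split a protein after K or R, except before P (simplified trypsin rule).

    Assumption: no missed cleavages; peptides shorter than min_len are dropped.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if len(p) >= min_len]


def apply_sap(protein, position, ref, alt):
    """Apply a single amino acid polymorphism at a 1-based position."""
    if protein[position - 1] != ref:
        raise ValueError(f"expected {ref} at position {position}")
    return protein[: position - 1] + alt + protein[position:]


def variant_peptides(protein, saps, min_len=6):
    """Return peptides that appear only after applying each SAP on its own."""
    reference = set(tryptic_digest(protein, min_len))
    novel = set()
    for position, ref, alt in saps:
        mutated = apply_sap(protein, position, ref, alt)
        novel.update(
            p for p in tryptic_digest(mutated, min_len) if p not in reference
        )
    return novel


def best_hit_per_peptide(hits):
    """Resolve redundancy: keep the highest-scoring hit for each peptide.

    hits is an iterable of (peptide, source, score) tuples (hypothetical format).
    """
    best = {}
    for peptide, source, score in hits:
        if peptide not in best or score > best[peptide][1]:
            best[peptide] = (source, score)
    return best
```

Calling `variant_peptides` with a protein sequence and a list of `(position, ref, alt)` tuples yields only the peptides absent from the reference digest, which is what keeps a variant database small compared with storing whole mutated proteins. `best_hit_per_peptide` can then merge the outputs of the reference and variant searches.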


COMMITTEE MEMBERS:
Chair - 2267860 - GUSTAVO ANTONIO DE SOUZA
Internal member - 012.117.554-52 - THAIS GAUDENCIO DO REGO - UFPE
Notice registered on: 03/07/2018 11:18