QUALIFICATION Examination Committee: DENIS JOSÉ SOUSA DE ALBUQUERQUE

A MASTER'S QUALIFICATION examination committee has been registered by the program.
STUDENT: DENIS JOSÉ SOUSA DE ALBUQUERQUE
DATE: 28/03/2019
TIME: 14:00
LOCATION: DIMAp, Auditório I
TITLE:

Identification of problems and hot topics for developers of Big Data applications on the Apache Spark framework


KEYWORDS:

Big Data, Apache Spark, Probabilistic Topic Models, Latent Dirichlet Allocation (LDA), Stack Overflow, Taxonomy.


PAGES: 83
MAJOR AREA: Exact and Earth Sciences
AREA: Computer Science
SUBAREA: Computing Methodology and Techniques
SPECIALTY: Software Engineering
ABSTRACT:

This research aims to identify and classify the main difficulties and topics of interest of developers who process Big Data with the Apache Spark framework. To this end, we use the Latent Dirichlet Allocation (LDA) algorithm to build a probabilistic topic model over information extracted from Stack Overflow, since manually inspecting the entire data set is not feasible. Starting from a comprehensive study of related work, we established and applied a methodology and built a Spark application to execute the tasks, using the Spark SQL and MLlib (machine learning) libraries. The results were analyzed by a group of five researchers: two professors holding doctorates, one doctoral student, and two master's students. From a semantic analysis of the labels assigned to each identified topic, a taxonomy of interests and difficulties was constructed.


COMMITTEE MEMBERS:
Chair - 1495704 - UMBERTO SOUZA DA COSTA
Internal - 1221251 - MARTIN ALEJANDRO MUSICANTE
External to the Program - 1066308 - MARCUS ALEXANDRE NUNES
Notice registered on: 19/03/2019 08:51