Graduate Programs Portal (UFRN)

SIGAA - Integrated System for Academic Activities Management

PPgSC/UFRN - GRADUATE PROGRAM IN SYSTEMS AND COMPUTING
CCET ADMINISTRATION
Phone/Extension: (84)3342-2225/115
E-mail: ppgsc@ppgsc.ufrn.br
https://posgraduacao.ufrn.br/ppgsc

DEFENSE committee: DENIS JOSÉ SOUSA DE ALBUQUERQUE

A MASTER'S DEFENSE committee has been registered by the program.
STUDENT: DENIS JOSÉ SOUSA DE ALBUQUERQUE
DATE: 27/09/2019
TIME: 08:00
LOCATION: DIMAp, Auditório I
TITLE:

Identification of problems and hot topics for developers of Big Data applications on the Apache Spark framework

KEYWORDS:

Big Data, Apache Spark, Probabilistic Topic Models, Latent Dirichlet Allocation (LDA), Stack Overflow, Taxonomy.

PAGES: 115
MAJOR AREA: Exact and Earth Sciences
AREA: Computer Science
SUBAREA: Computing Methodology and Techniques
SPECIALTY: Software Engineering
ABSTRACT:

This research aims to identify and classify the main difficulties and topics of interest of Apache Spark application developers regarding the use of the framework. For this purpose, we use the Latent Dirichlet Allocation algorithm to perform probabilistic topic modeling on information extracted from Stack Overflow, since manual inspection of the entire dataset is not feasible. Drawing on a comprehensive study of related work, we established and applied a methodology based on the practices usually employed. We developed Spark applications for the automated execution of tasks such as data selection and preparation, topic discovery (applying the probabilistic modeling algorithm with various configurations), and metrics computation. The results were analyzed by a group of five researchers: two professors holding doctorates, one doctoral student, and two master's students. Based on the semantic analysis of the labels assigned to each of the identified topics, a taxonomy of interests and difficulties was constructed. Finally, we ranked the most important themes according to the various calculated metrics and compared the methods and results of our study with those presented in another work.

COMMITTEE MEMBERS:
Chair - 1495704 - UMBERTO SOUZA DA COSTA
Internal - 1221251 - MARTIN ALEJANDRO MUSICANTE
External to the Program - 1066308 - MARCUS ALEXANDRE NUNES
External to the Institution - PLACIDO ANTONIO DE SOUZA NETO - IFRN

Notice registered on: 12/09/2019 17:27