Analysis of a token density metric for concern detection in Matlab sources using UbiSOM

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Matrix and data manipulation programming languages are an essential tool for data analysts. However, these languages are often unstructured and lack modularity mechanisms. This article presents a knowledge discovery approach for studying how the lack of modularity support manifests itself in such languages. The study focuses on Matlab, a well-established representative of these languages. We present a technique for automatically detecting and quantifying concerns in Matlab code and for exploring them across a code base. The Ubiquitous Self Organizing Map (UbiSOM) is used to perform exploratory data analysis over the concerns detected in a possibly changing repository of Matlab files. The UbiSOM is quite effective in detecting patterns of co-occurrence of multiple concerns. To illustrate the technique, a repository comprising over 35,000 Matlab files is analysed. The results show that Token Density metrics, used in conjunction with the UbiSOM, enable the detection of patterns of co-occurrence of multiple concerns in m-files.
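As a rough illustration of what a token density metric might look like, the Python sketch below counts, for one m-file, the share of identifier tokens that belong to each concern. The abstract does not give the metric's definition, so the formula used here (concern tokens divided by all identifier tokens) is an assumption. The concern names, the token lists in `CONCERN_TOKENS`, and the helper names are likewise hypothetical and are not taken from the article.

```python
import re
from collections import Counter

# Hypothetical mapping from concern to Matlab function names (illustrative only).
CONCERN_TOKENS = {
    "verification": {"nargin", "error", "ischar", "isnumeric"},
    "visualization": {"plot", "figure", "axis", "xlabel", "ylabel"},
    "io": {"fopen", "fclose", "load", "save"},
}

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def strip_comment(line):
    # Naive assumption: everything after the first '%' is a comment.
    # This mishandles '%' inside string literals such as format strings.
    return line.split("%", 1)[0]


def token_density(source):
    """Return, per concern, the fraction of identifier tokens tied to it."""
    tokens = []
    for line in source.splitlines():
        # Words inside string literals are also counted; a real lexer would skip them.
        tokens.extend(IDENTIFIER.findall(strip_comment(line)))
    total = len(tokens)
    counts = Counter(tokens)
    return {
        concern: (sum(counts[t] for t in names) / total if total else 0.0)
        for concern, names in CONCERN_TOKENS.items()
    }


sample = """function r = f(x)
if nargin < 1   % check arguments
    error('missing x');
end
plot(x); xlabel('t');
r = x;
"""

densities = token_density(sample)
print(densities)
```

Under these assumptions, each file yields one vector of per-concern densities. A collection of such vectors is the kind of input a self-organizing map could cluster to reveal which concerns tend to occur together.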

Original language: English
Article number: e12306
Journal: Expert Systems
Volume: 35
Issue number: 4
Publication status: Published - 1 Aug 2018

Keywords

  • business intelligence
  • concern metrics
  • concern mining
  • Matlab
  • modularity
  • self-organising maps
  • token-based technique
