Analysis of a token density metric for concern detection in Matlab sources using UbiSOM

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)


Matrix and data manipulation programming languages are essential tools for data analysts. However, these languages are often unstructured and lack modularity mechanisms. This article presents a knowledge discovery approach for studying manifestations of the lack of modularity support in such languages. The study focuses on Matlab, a well-established representative of those languages. We present a technique for the automatic detection and quantification of concerns in Matlab and their exploration in a code base. The Ubiquitous Self Organizing Map (UbiSOM) is used to perform exploratory data analysis over concerns detected in a possibly changing repository of Matlab files. The UbiSOM proves effective at detecting patterns of co-occurrence of multiple concerns. To illustrate the technique, a repository comprising over 35,000 Matlab files is analysed. The results show that the use of Token Density metrics in conjunction with UbiSOM enables the detection of patterns of co-occurrence of multiple concerns in m-files.

Original language: English
Article number: e12306
Journal: Expert Systems
Issue number: 4
Publication status: Published - 1 Aug 2018


  • business intelligence
  • concern metrics
  • concern mining
  • Matlab
  • modularity
  • self-organising maps
  • token-based technique

