Globally Optimal Parsimoniously Lifting a Fuzzy Query Set Over a Taxonomy Tree

Dmitry Frolov, Boris Mirkin, Susana Nascimento, Trevor Fenner

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a relatively rare case of an optimization problem in data analysis that admits a globally optimal solution by a recursive algorithm. We are concerned with finding a most specific generalization of a fuzzy set of topics assigned to the leaves of a domain taxonomy represented by a rooted tree. The idea is to “lift” the set to its “head subject” in the higher ranks of the taxonomy tree. The head subject is supposed to “tightly” cover the query set, possibly introducing some errors: “gaps”, “offshoots”, or both. Our method globally minimizes a penalty function that combines the numbers of head subjects, gaps, and offshoots, each weighted differently. We apply this method to a collection of 17645 research papers on Data Science published in 17 Springer journals over the past 20 years. We extract a taxonomy of Data Science (TDS) from the international Association for Computing Machinery Computing Classification System 2012. We find fuzzy clusters of leaf topics over the text collection, optimally lift them to head subjects in the TDS, and comment on the tendencies of current research that follow from the lifting results.
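The recursion described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a count-based penalty (one weight per head subject, gap, and offshoot, as the abstract words it), treats any leaf with positive membership as part of the query set, and takes a gap to be a maximal subtree under a head subject that contains no query leaf. The function name `lift`, the weight parameters, and the toy taxonomy are all hypothetical.

```python
from typing import Dict, List, Tuple

Tree = Dict[str, List[str]]  # node -> list of children; leaves have no entry


def lift(tree: Tree, root: str, membership: Dict[str, float],
         head_w: float, gap_w: float, offshoot_w: float
         ) -> Tuple[float, List[str], List[str], List[str]]:
    """Return (penalty, heads, gaps, offshoots) for a minimum-penalty lifting.

    Assumed penalty (illustrative only):
    head_w * #heads + gap_w * #gaps + offshoot_w * #offshoots.
    """
    def visit(node: str):
        # Returns (penalty, heads, gaps, offshoots, maximal_gaps, touched).
        # maximal_gaps: gaps that would be incurred if `node` became a head.
        # touched: True if the subtree contains at least one query leaf.
        children = tree.get(node, [])
        if not children:
            if membership.get(node, 0.0) > 0:
                # A query leaf left on its own counts as an offshoot.
                return offshoot_w, [], [], [node], [], True
            return 0.0, [], [], [], [node], False

        results = [visit(child) for child in children]
        if not any(r[5] for r in results):
            # No query leaf below: the whole subtree is one potential gap.
            return 0.0, [], [], [], [node], False

        maximal_gaps = [g for r in results for g in r[4]]
        cost_as_head = head_w + gap_w * len(maximal_gaps)  # lift to this node
        cost_pass = sum(r[0] for r in results)             # keep children's solutions
        if cost_as_head < cost_pass:
            return cost_as_head, [node], maximal_gaps, [], maximal_gaps, True
        return (cost_pass,
                [h for r in results for h in r[1]],
                [g for r in results for g in r[2]],
                [o for r in results for o in r[3]],
                maximal_gaps, True)

    penalty, heads, gaps, offshoots, _, _ = visit(root)
    return penalty, heads, gaps, offshoots


# Hypothetical toy taxonomy and fuzzy query set.
toy_tree = {"root": ["A", "B", "C"],
            "A": ["a1", "a2", "a3"],
            "B": ["b1", "b2", "b3"],
            "C": ["c1", "c2"]}
toy_query = {"a1": 0.8, "a2": 0.5, "b1": 0.3}

penalty, heads, gaps, offshoots = lift(toy_tree, "root", toy_query,
                                       head_w=1.0, gap_w=0.4, offshoot_w=0.9)
print(heads, gaps, offshoots)  # ['A'] ['a3'] ['b1']
```

In this toy run, lifting `a1` and `a2` to `A` is cheaper than keeping two offshoots even though it introduces the gap `a3`, while `b1` stays an offshoot because lifting it to `B` would cost a head subject plus two gaps. Because each node's decision depends only on the results for its children, a single bottom-up pass is enough to reach the minimum of this simplified penalty.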

Original language: English
Title of host publication: Optimization of Complex Systems: Theory, Models, Algorithms and Applications, 2019
Editors: Hoai An Le Thi, Hoai Minh Le, Tao Pham Dinh
Place of Publication: Cham
Publisher: Springer
Pages: 779-789
Number of pages: 11
ISBN (Electronic): 978-3-030-21803-4
ISBN (Print): 978-3-030-21802-7
DOIs: https://doi.org/10.1007/978-3-030-21803-4_78
Publication status: Published - 1 Jan 2020
Event: 6th World Congress on Global Optimization, WCGO 2019 - Metz, France
Duration: 8 Jul 2019 to 10 Jul 2019

Publication series

Name: Advances in Intelligent Systems and Computing
Publisher: Springer
Volume: 991
ISSN (Print): 2194-5357
ISSN (Electronic): 2194-5365

Conference

Conference: 6th World Congress on Global Optimization, WCGO 2019
Country: France
City: Metz
Period: 8/07/19 to 10/07/19

Keywords

  • Additive fuzzy cluster
  • Annotated suffix tree
  • Generalization
  • Hierarchical taxonomy
  • Parsimony
  • Spectral clustering

Cite this

Frolov, D., Mirkin, B., Nascimento, S., & Fenner, T. (2020). Globally Optimal Parsimoniously Lifting a Fuzzy Query Set Over a Taxonomy Tree. In H. A. Le Thi, H. M. Le, & T. Pham Dinh (Eds.), Optimization of Complex Systems: Theory, Models, Algorithms and Applications, 2019 (pp. 779-789). (Advances in Intelligent Systems and Computing; Vol. 991). Cham: Springer. https://doi.org/10.1007/978-3-030-21803-4_78