Using Taxonomy Tree to Generalize a Fuzzy Thematic Cluster

Dmitry Frolov, Susana Nascimento, Trevor Fenner, Boris Mirkin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents an algorithm, ParGenFS, for generalizing, or 'lifting', a fuzzy set of topics to higher ranks of a hierarchical taxonomy of a research domain. ParGenFS finds a globally optimal generalization of the topic set by minimizing a penalty function that balances the number of introduced 'head subjects' against the related errors, 'gaps' and 'offshoots', each weighted differently. The usefulness of the method is illustrated on a set of 17,685 abstracts of research papers on Data Science published in Springer journals over the past 20 years. We extract a taxonomy of Data Science from the international Association for Computing Machinery Computing Classification System 2012 (ACM-CCS), find fuzzy clusters of leaf topics over the text collection, lift them in the taxonomy, and interpret the resulting head subjects to comment on the tendencies of current research.
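
The lifting idea in the abstract can be illustrated with a small recursive sketch. This is not the authors' ParGenFS implementation. It is a simplified illustration under stated assumptions: the `Node` class, the function names, the weights `lam` and `gamma`, the rule that an internal node's membership is the Euclidean norm of its children's memberships, and the rule that a gap is weighted by its parent's membership are all choices made here for the example. At each internal node the sketch compares two options, keeping the children's decisions or making the node a head subject, and takes the cheaper one.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)
    u: float = 0.0  # fuzzy membership; assigned by hand on leaves only


def fill_membership(node):
    """Assumption: an internal node's membership is the Euclidean norm
    of its children's memberships."""
    if node.children:
        node.u = sqrt(sum(fill_membership(c) ** 2 for c in node.children))
    return node.u


def gaps_below(node, parent_u):
    """Collect maximal zero-membership nodes at or under `node`.
    Assumption: each gap is weighted by its parent's membership."""
    if node.u == 0:
        return parent_u, [node.name]
    cost, names = 0.0, []
    for child in node.children:
        c, n = gaps_below(child, node.u)
        cost += c
        names += n
    return cost, names


def lift(node, lam=0.2, gamma=0.9):
    """Return (penalty, heads, gaps, offshoots) for the subtree at `node`."""
    if node.u == 0:          # no topic of the fuzzy set lies in this subtree
        return 0.0, [], [], []
    if not node.children:    # a leaf in the topic set stays an offshoot
        return gamma * node.u, [], [], [node.name]

    # Option 1: do not generalize here; inherit the children's decisions.
    penalty, heads, gaps, offshoots = 0.0, [], [], []
    for child in node.children:
        p, h, g, o = lift(child, lam, gamma)
        penalty += p
        heads += h
        gaps += g
        offshoots += o

    # Option 2: make this node a head subject and pay for the gaps under it.
    gap_cost, gap_names = gaps_below(node, 0.0)  # parent_u unused: node.u > 0
    head_penalty = node.u + lam * gap_cost
    if head_penalty < penalty:
        return head_penalty, [node.name], gap_names, []
    return penalty, heads, gaps, offshoots


def leaf(name, u=0.0):
    return Node(name, u=u)


# Toy taxonomy with made-up memberships (not data from the paper).
root = Node("root", [
    Node("A", [leaf("a1", 0.6), leaf("a2", 0.6), leaf("a3")]),
    Node("B", [leaf("b1", 0.5), leaf("b2"), leaf("b3")]),
    Node("C", [leaf("c1"), leaf("c2")]),
])
fill_membership(root)
penalty, heads, gaps, offshoots = lift(root)
print(heads, gaps, offshoots, round(penalty, 3))
```

With these toy values the sketch lifts `a1` and `a2` to the head subject `A` at the cost of one gap, `a3`, and leaves `b1` as an offshoot, because generalizing to `B` or to the root would introduce more gap penalty than it saves.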

Original language: English
Title of host publication: 2019 IEEE International Conference on Fuzzy Systems, FUZZ 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538617281
Publication status: Published - Jun 2019
Event: 2019 IEEE International Conference on Fuzzy Systems, FUZZ 2019 - New Orleans, United States
Duration: 23 Jun 2019 → 26 Jun 2019

Publication series

Name: IEEE International Conference on Fuzzy Systems
Publisher: Institute of Electrical and Electronics Engineers Inc.
Volume: 2019-June
ISSN (Print): 1098-7584

Conference

Conference: 2019 IEEE International Conference on Fuzzy Systems, FUZZ 2019
Country: United States
City: New Orleans
Period: 23/06/19 → 26/06/19

Keywords

  • annotated suffix tree
  • fuzzy cluster
  • gap-offshoot penalty
  • generalization
