Computational Generalization in Taxonomies Applied to: (1) Analyze Tendencies of Research and (2) Extend User Audiences

Dmitry Frolov, Susana Nascimento, Trevor Fenner, Zina Taran, Boris Mirkin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


Abstract

We define a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a domain taxonomy. This generalization lifts the set to its “head subject” node in the higher ranks of the taxonomy tree. The head subject is supposed to cover the query set “tightly”, possibly at the cost of some errors, referred to as “gaps” and “offshoots”. Our method, ParGenFS, globally minimizes a penalty function that combines the differently weighted numbers of head subjects, gaps and offshoots. Two applications are considered: (1) analysis of research tendencies in Data Science; (2) audience extension for programmatic targeted online advertising. The former involves a taxonomy of Data Science derived from the celebrated ACM Computing Classification System 2012. Based on a collection of research papers published by Springer in 1998–2017, and applying in-house methods for text analysis and fuzzy clustering, we derive fuzzy clusters of leaf topics in learning, retrieval and clustering. The head subjects of these clusters inform us of some general tendencies of the research. The latter involves the publicly available IAB Tech Lab Content Taxonomy. Each of about 25 million users is assigned a fuzzy profile within this taxonomy, which is generalized offline using ParGenFS. Our experiments show that these head subjects effectively extend the size of targeted audiences at least twofold without losing quality.
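The lifting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' ParGenFS implementation: the function name `lift`, the `Lift` record, the weight values, and the simplification of treating any leaf with positive membership as a crisp member of the query set are all assumptions made for illustration. The sketch walks the tree bottom-up and, at each interior node, compares the penalty of declaring that node a head subject (one head plus the gaps it would bring in) with the penalty of keeping the best solutions of its children.

```python
from typing import Dict, List, NamedTuple


class Lift(NamedTuple):
    """Hypothetical result record for the sketch below."""
    penalty: float
    heads: List[str]      # interior nodes chosen as head subjects
    gaps: List[str]       # maximal non-query subtrees under chosen heads
    offshoots: List[str]  # query leaves left uncovered by any head


def lift(tree: Dict[str, List[str]], root: str, membership: Dict[str, float],
         head_w: float = 1.0, gap_w: float = 0.5, offshoot_w: float = 0.9) -> Lift:
    """Bottom-up penalty minimisation over a rooted taxonomy tree.

    Assumption: penalty = head_w * #heads + gap_w * #gaps + offshoot_w * #offshoots,
    with illustrative weights; fuzzy membership is reduced to 'positive or not'.
    """
    query = {leaf for leaf, u in membership.items() if u > 0}

    def visit(node: str):
        # Returns (best Lift for this subtree,
        #          gaps incurred if this node or an ancestor becomes a head,
        #          whether the subtree contains any query leaf).
        children = tree.get(node, [])
        if not children:  # leaf
            if node in query:
                return Lift(offshoot_w, [], [], [node]), [], True
            return Lift(0.0, [], [], []), [node], False

        results = [visit(child) for child in children]
        if not any(touched for _, _, touched in results):
            # The whole subtree is outside the query: one maximal gap.
            return Lift(0.0, [], [], []), [node], False

        gaps_if_head = [g for _, gaps, _ in results for g in gaps]
        gain = head_w + gap_w * len(gaps_if_head)          # make node a head
        keep = sum(best.penalty for best, _, _ in results)  # keep children's choices
        if gain < keep:
            best = Lift(gain, [node], gaps_if_head, [])
        else:
            best = Lift(
                keep,
                [h for b, _, _ in results for h in b.heads],
                [g for b, _, _ in results for g in b.gaps],
                [o for b, _, _ in results for o in b.offshoots],
            )
        return best, gaps_if_head, True

    return visit(root)[0]


# Toy taxonomy and fuzzy query set (made-up data).
tree = {"root": ["A", "B"], "A": ["a1", "a2", "a3", "a4"], "B": ["b1", "b2", "b3"]}
membership = {"a1": 0.6, "a2": 0.5, "a3": 0.4, "b1": 0.3}
result = lift(tree, "root", membership)
print(result.heads, result.gaps, result.offshoots)
```

With these illustrative weights, the three query leaves under `A` are lifted to `A` at the cost of one gap (`a4`), while the single query leaf `b1` stays an offshoot because lifting it to `B` or to the root would bring in too many gaps.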

Original language: English
Title of host publication: Intelligent Data Engineering and Automated Learning – IDEAL 2019 - 20th International Conference, Proceedings
Editors: Hujun Yin, Richard Allmendinger, David Camacho, Peter Tino, Antonio J. Tallón-Ballesteros, Ronaldo Menezes
Publisher: Springer
Pages: 3-11
Number of pages: 9
ISBN (Print): 9783030336165
DOIs: https://doi.org/10.1007/978-3-030-33617-2_1
Publication status: Published - 2019
Event: 20th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2019 - Manchester, United Kingdom
Duration: 14 Nov 2019 – 16 Nov 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer
Volume: 11872 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2019
Country: United Kingdom
City: Manchester
Period: 14/11/19 – 16/11/19

Keywords

  • Annotated suffix tree
  • Fuzzy thematic cluster
  • Generalization
  • Research tendencies
  • Targeted advertising


Cite this

Frolov, D., Nascimento, S., Fenner, T., Taran, Z., & Mirkin, B. (2019). Computational Generalization in Taxonomies Applied to: (1) Analyze Tendencies of Research and (2) Extend User Audiences. In H. Yin, R. Allmendinger, D. Camacho, P. Tino, A. J. Tallón-Ballesteros, & R. Menezes (Eds.), Intelligent Data Engineering and Automated Learning – IDEAL 2019 - 20th International Conference, Proceedings (pp. 3-11). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11872 LNCS). Springer. https://doi.org/10.1007/978-3-030-33617-2_1