Data analysis with intersection graphs

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a new framework for multivariate data analysis, based on graph theory, using intersection graphs [1]. We have named this approach DAIG (Data Analysis with Intersection Graphs). The framework represents data vectors as paths on a graph, which has a number of advantages over the classical table representation of data. Each node represents an atom of information, i.e. a pair of a variable and a value, and is associated with the set of observations in which that pair occurs. An edge exists between two nodes whenever the intersection of their respective observation sets is not empty. We show that this representation of data as an intersection graph allows an easy and intuitive geometric interpretation of data observations, groups of observations, and results of multivariate data analysis techniques such as biplots, principal components, cluster analysis, or multidimensional scaling. These appear as paths on the graph, relating variables, values and observations. The approach allows a compact and memory-efficient representation of data that contains many missing values or multi-valued attributes. The basic principles and advantages of the approach are presented with an example of its application to a simple toy problem. The main features of the methodology are illustrated with the aid of software specifically developed for this purpose.

© 2013 The Authors. Published by Elsevier B.V. Selection and peer review under responsibility of the organizers of the 2013 International Conference on Computational Science.
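
The graph construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' software: the function names `build_intersection_graph` and `observation_path`, the toy observations, and the choice to encode a missing value as `None` and a multi-valued attribute as a set are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_intersection_graph(observations):
    """Build nodes and edges of an intersection graph (illustrative sketch).

    observations maps an observation id to a dict of variable -> value.
    Assumed conventions (not from the paper): None marks a missing value,
    and a set, list or tuple marks a multi-valued attribute.
    """
    # Each node is a (variable, value) pair holding the set of
    # observations in which that pair occurs.
    nodes = defaultdict(set)
    for obs_id, record in observations.items():
        for variable, value in record.items():
            if value is None:
                continue  # missing values create no node
            values = value if isinstance(value, (set, list, tuple)) else [value]
            for v in values:
                nodes[(variable, v)].add(obs_id)

    # An edge joins two nodes whenever their observation sets intersect.
    edges = {}
    for a, b in combinations(nodes, 2):
        shared = nodes[a] & nodes[b]
        if shared:
            edges[frozenset({a, b})] = shared
    return dict(nodes), edges


def observation_path(nodes, obs_id):
    """Return the nodes visited by one observation, in insertion order."""
    return [node for node, members in nodes.items() if obs_id in members]


# Hypothetical toy data: three observations, one with a missing value.
toy = {
    "o1": {"colour": "red", "size": "small"},
    "o2": {"colour": "red", "size": "large"},
    "o3": {"colour": "blue", "size": None},
}
nodes, edges = build_intersection_graph(toy)
```

In this sketch, `observation_path(nodes, "o1")` lists the variable-value pairs that observation `o1` passes through, which corresponds to reading an observation as a path on the graph. Storing only the pairs that actually occur is what keeps the structure compact when many values are missing.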
Original language: Unknown
Title of host publication: Procedia Computer Science
Editors: V Alexandrov, M Lees, V Krzhizhanovskaya, J Dongarra, PMA Sloot
Publisher: ELSEVIER SCIENCE BV
Pages: 60-69
Volume: 18
ISBN (Print): 1877-0509
Publication status: Published - 1 Jan 2013
Event: International Conference on Computational Science -
Duration: 1 Jan 2013 → …

Conference

Conference: International Conference on Computational Science
Period: 1/01/13 → …
