TY - JOUR
T1 - Exploring Exploratory Data Analysis
T2 - An Empirical Test of Run Chart Utility
AU - Barsalou, Matthew
AU - Saraiva, Pedro Manuel
AU - Henriques, Roberto
N1 - Barsalou, M., Saraiva, P. M., & Henriques, R. (2023). Exploring Exploratory Data Analysis: An Empirical Test of Run Chart Utility. Management Systems in Production Engineering, 31(4), 442-448. https://doi.org/10.2478/mspe-2023-0050
PY - 2023/12/1
Y1 - 2023/12/1
N2 - This paper explores Exploratory Data Analysis (EDA). Graphical methods are used to gain insights in EDA, and these insights can be useful for forming tentative hypotheses when performing a root cause analysis (RCA). The topic of EDA is well addressed in the literature; however, empirical studies of the efficacy of EDA are lacking. We therefore aim to evaluate EDA by comparing one group of students identifying salient features in a table against a second group of students attempting to identify salient features in the same data presented in the form of a run chart, and then extracting relevant conclusions from such a comparison. Two groups of students were randomly selected to receive data, either in the form of a table or a run chart. They were then tasked with visually identifying any data points that stood out as interesting. The number of correctly identified values and the time to find the values were both evaluated by a two-sample t-test to determine if there was a statistically significant difference. The participants with a graph found the correct values that stood out in the data much more quickly than those who used a table. Those using the data in the form of a table took much longer and failed to identify values that stood out. However, those with a graph also had far more false positives. Much has been written on the topic of EDA in the literature; however, an empirical evaluation of this common methodology is lacking. This paper confirms with empirical evidence the effectiveness of EDA.
AB - This paper explores Exploratory Data Analysis (EDA). Graphical methods are used to gain insights in EDA, and these insights can be useful for forming tentative hypotheses when performing a root cause analysis (RCA). The topic of EDA is well addressed in the literature; however, empirical studies of the efficacy of EDA are lacking. We therefore aim to evaluate EDA by comparing one group of students identifying salient features in a table against a second group of students attempting to identify salient features in the same data presented in the form of a run chart, and then extracting relevant conclusions from such a comparison. Two groups of students were randomly selected to receive data, either in the form of a table or a run chart. They were then tasked with visually identifying any data points that stood out as interesting. The number of correctly identified values and the time to find the values were both evaluated by a two-sample t-test to determine if there was a statistically significant difference. The participants with a graph found the correct values that stood out in the data much more quickly than those who used a table. Those using the data in the form of a table took much longer and failed to identify values that stood out. However, those with a graph also had far more false positives. Much has been written on the topic of EDA in the literature; however, an empirical evaluation of this common methodology is lacking. This paper confirms with empirical evidence the effectiveness of EDA.
KW - Exploratory Data Analysis
KW - graphs
KW - root cause analysis
KW - problem solving
UR - http://www.scopus.com/inward/record.url?scp=85179798161&partnerID=8YFLogxK
UR - https://www.webofscience.com/wos/woscc/full-record/WOS:001114518200009
U2 - 10.2478/mspe-2023-0050
DO - 10.2478/mspe-2023-0050
M3 - Article
SN - 2299-0461
VL - 31
SP - 442
EP - 448
JO - Management Systems in Production Engineering
JF - Management Systems in Production Engineering
IS - 4
ER -