TY - JOUR
T1 - Comparison of Normalization Techniques on Data Sets With Outliers
AU - Vafaei, Nazanin
AU - Ribeiro, Rita A.
AU - Camarinha-Matos, Luís M.
N1 - info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UID%2FEEA%2F00066%2F2019/PT#
PY - 2022/1
Y1 - 2022/1
AB - With the fast growth of data-rich systems, dealing with complex decision problems whose input datasets are skewed and contain outliers is unavoidable. Generally, data skewness refers to a non-uniform distribution in a dataset (i.e., a dataset that contains asymmetries and/or outliers). Normalization is the first step of most multi-criteria decision-making (MCDM) problems: it produces dimensionless data from heterogeneous input datasets, which enables aggregation of criteria and thereby ranking of alternatives. Therefore, when criteria datasets contain outliers, finding a suitable normalization technique is of utmost importance. In this work, the authors compare seven normalization techniques (max, max-min, vector, sum, logarithmic, target-based, and fuzzification) on criteria datasets that contain outliers and analyse their results for MCDM problems. A numerical example illustrates the behaviour of the chosen normalization techniques, and an (ongoing) evaluation assessment framework is used to recommend the best normalization technique for this type of criteria.
KW - Data Set
KW - Decision Making
KW - Fuzzification
KW - MCDM
KW - Normalization
KW - Outliers
KW - Skewed Data
KW - Target Value
UR - http://www.scopus.com/inward/record.url?scp=85148709974&partnerID=8YFLogxK
U2 - 10.4018/IJDSST.286184
DO - 10.4018/IJDSST.286184
M3 - Article
SN - 1941-6296
VL - 14
JO - International Journal of Decision Support System Technology
JF - International Journal of Decision Support System Technology
IS - 1
M1 - 84
ER -