International Society of Science and Applied Technologies
Machine Learning for Compositional Data Analysis in Support of the Decision Making Process
Author | Thi Thuy Van Nguyen
Co-Author(s) | Cédric Heuchenne; Kim Phuc Tran
Abstract | Machine learning (ML) plays an important role in data analysis, yet research on ML for compositional data (CoDa) remains limited. In this work, we summarize the most popular ML techniques for CoDa, including principal component analysis (PCA), clustering, classification, and regression. We also introduce an efficient transformation method, based on Dirichlet density estimation, that maps CoDa to real-valued data. The proposed method not only removes the constraints on each CoDa vector (nonnegativity and constant sum) but also reduces its dimension and improves data quality. We then apply the transformed data to anomaly detection using Support Vector Data Description (SVDD), a one-class classification algorithm that detects abnormal observations by modeling the normal ones. A simulation example at the end of the work illustrates the promise of this method for building classification and anomaly detection models on CoDa.
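The constraints named in the abstract (nonnegative parts with a constant sum) can be made concrete with a minimal sketch. It does not implement the Dirichlet-density-based transformation proposed in the paper, whose details are not given here. As a stand-in it uses the standard additive log-ratio (alr) transform, which likewise maps a D-part composition to an unconstrained vector of dimension D-1. The function names, the Dirichlet parameters, and the seed are all illustrative assumptions.

```python
import math
import random


def closure(parts, total=1.0):
    """Rescale positive parts so they sum to `total` (the constant-sum constraint)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def alr(composition):
    """Additive log-ratio: map a D-part composition to an unconstrained (D-1)-vector.

    Assumption: this is a standard stand-in, not the paper's Dirichlet-based method.
    The last part is used as the reference; all parts must be strictly positive.
    """
    ref = composition[-1]
    return [math.log(p / ref) for p in composition[:-1]]


def alr_inverse(z):
    """Map an unconstrained (D-1)-vector back to a D-part composition summing to 1."""
    return closure([math.exp(v) for v in z] + [1.0])


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma(alpha_i, 1) draws."""
    return closure([rng.gammavariate(a, 1.0) for a in alpha])


rng = random.Random(0)                       # fixed seed, illustrative only
x = sample_dirichlet([2.0, 3.0, 5.0], rng)   # 3 positive parts that sum to 1
z = alr(x)                                   # 2 unconstrained real coordinates
x_back = alr_inverse(z)                      # recovers x up to rounding error
```

The real-valued vectors `z` could then be passed to a one-class model. For example, scikit-learn's `OneClassSVM` with an RBF kernel is closely related to SVDD, though whether it matches the paper's SVDD setup is an assumption.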
Keywords | Compositional Data, Machine Learning, Anomaly Detection, SVDD, Dirichlet Density
Article #: DSBFI23-19
January 8-10, 2023 - Da Nang, Vietnam