Evaluating the Effectiveness of Mapper Compared to PCA and Factor Analysis in High-Dimensional Survival Analysis
Utih Amartiwi, Yaroslav A. Kholodov

Innopolis University, Russia


Abstract

Survival analysis is a field of statistics used to analyze the time until an event occurs. In public health, it is crucial for tasks such as predicting the time until a patient recovers or dies and managing bed occupancy. Advances in biological technology have made vast amounts of data available, which raises the challenge of handling high-dimensional data. This challenge creates a need for dimensionality reduction and feature selection approaches that support better prediction and knowledge discovery. Topological data analysis (TDA) is a relatively new branch of data science that analyzes the structure of data. Mapper, a TDA approach, combines clustering and graph-based network techniques to uncover insights from data, which helps us select important features in high-dimensional survival analysis. To evaluate the effectiveness of this feature selection, we compare the concordance index (C-index) of the resulting survival model with the C-indices obtained using Principal Component Analysis (PCA) and Factor Analysis. The results show that Mapper not only improves the existing survival model but also provides valuable information about the features that affect survival time.
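
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of Harrell's C-index, the fraction of comparable patient pairs whose predicted risks are ordered consistently with their observed survival times. It is an illustration only, not the authors' implementation: the function name `concordance_index`, the toy data, and the simplification that ties in survival time are ignored are all assumptions made here.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       : observed time for each subject (event or censoring time)
    events      : 1 if the event was observed, 0 if the subject was censored
    risk_scores : model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is only comparable if the earlier time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event has the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking, so a feature selection or dimensionality reduction method that yields a higher C-index gives a survival model with better discrimination. In practice, a tested implementation from an established survival analysis library would be preferable to this quadratic-time sketch.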

Keywords: topological data analysis, Mapper, survival analysis, high-dimensional data

Topic: Probability and Stochastic Analysis

ICONMAA 2024 Conference | Conference Management System