Practical approaches to mining of clinical datasets : from frameworks to novel feature selection

Poolsawad, Nongnuch

Authors

Nongnuch Poolsawad



Abstract

Clinical data have embedded within them numerous complexities and uncertainties in the form of missing values, class imbalance and high dimensionality. The research in this thesis was motivated by these challenges: it aims to minimise these problems whilst, at the same time, maximising classification performance and selecting a significant subset of variables. This led to the proposal of a data mining framework and a feature selection method. The proposed framework, called the Handling Clinical Data Framework (HCDF), is algorithmically simple and uses a modified form of existing frameworks to address a variety of data issues. The assessment of data mining techniques reveals that missing value imputation and resampling data for class balancing can improve classification performance. Next, the proposed feature selection method is introduced; it is based on projecting onto the principal component (FS-PPC) and draws on ideas from both feature extraction and feature selection to select a significant subset of features from the data. The method selects features that have high correlation with the principal component by applying symmetrical uncertainty (SU), while irrelevant and redundant features are removed by using mutual information (MI). It provides confidence that the selected subset of features will yield realistic results with less time and effort. FS-PPC is able to retain classification performance and meaningful features while producing a subset of non-redundant features. The proposed methods have been applied to the analysis of real clinical data and their effectiveness has been assessed. The results show that the proposed methods are able to minimise the clinical data problems whilst, at the same time, maximising classification performance.
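The abstract describes FS-PPC only at a high level, so the following is a minimal illustrative sketch of the general idea and not the thesis's implementation. It assumes numeric features, uses only the first principal component (estimated by power iteration), discretises values into equal-width bins, and applies two hypothetical thresholds (`su_threshold`, `mi_threshold`); all function names, the binning scheme and the threshold values are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def symmetrical_uncertainty(x, y):
    """SU(X,Y) = 2 * I(X;Y) / (H(X) + H(Y)), a value in [0, 1]."""
    denom = entropy(x) + entropy(y)
    return 0.0 if denom == 0 else 2.0 * mutual_information(x, y) / denom


def discretize(column, bins=3):
    """Equal-width binning (an assumption; other schemes are possible)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0] * len(column)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in column]


def first_pc_scores(columns, iterations=200):
    """Project the centred data onto its first principal component."""
    n, p = len(columns[0]), len(columns)
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            for j in range(p)] for i in range(p)]
    w = [1.0] * p  # starting vector for power iteration
    for _ in range(iterations):
        w = [sum(cov[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0:  # no variance at all: nothing to project onto
            return [0.0] * n
        w = [v / norm for v in w]
    return [sum(w[j] * centered[j][r] for j in range(p)) for r in range(n)]


def select_features(columns, su_threshold=0.3, mi_threshold=0.8, bins=3):
    """Keep features with high SU to the first PC, skipping redundant ones.

    Both thresholds are hypothetical; mi_threshold is measured in bits.
    """
    pc = discretize(first_pc_scores(columns), bins)
    binned = [discretize(col, bins) for col in columns]
    su = [symmetrical_uncertainty(b, pc) for b in binned]
    ranked = sorted(range(len(columns)), key=lambda i: (-su[i], i))
    selected = []
    for i in ranked:
        if su[i] < su_threshold:
            continue  # weakly related to the principal component
        if all(mutual_information(binned[i], binned[j]) < mi_threshold
               for j in selected):
            selected.append(i)  # not redundant with features kept so far
    return selected


# Toy data: feature 1 duplicates feature 0 (rescaled); feature 2 is constant.
toy = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 12],
    [5, 5, 5, 5, 5, 5],
]
print(select_features(toy))  # -> [0]
```

In this toy run the rescaled duplicate is dropped as redundant and the constant column is dropped because it carries no information about the principal component. On real clinical data the bin count and both thresholds would need tuning, and missing values would have to be imputed first.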

Citation

Poolsawad, N. Practical approaches to mining of clinical datasets : from frameworks to novel feature selection. (Thesis). University of Hull. https://hull-repository.worktribe.com/output/4215841

Thesis Type Thesis
Deposit Date Jul 21, 2014
Publicly Available Date Feb 23, 2023
Keywords Computer science
Public URL https://hull-repository.worktribe.com/output/4215841
Additional Information Department of Computer Science, The University of Hull
Award Date May 1, 2014

Files

Thesis (6.2 Mb)
PDF

Copyright Statement
© 2014 Poolsawad, Nongnuch. All rights reserved. No part of this publication may be reproduced without the written permission of the copyright holder.



