Nongnuch Poolsawad
Practical approaches to mining of clinical datasets : from frameworks to novel feature selection
Abstract
This research investigates clinical data that carry numerous complexities and uncertainties in the form of missing values, class imbalance and high dimensionality. The work in this thesis was motivated by the aim of minimising these problems whilst, at the same time, maximising classification performance and selecting a significant subset of variables. This led to the proposal of a data mining framework and a feature selection method. The proposed framework, called the Handling Clinical Data Framework (HCDF), has a simple algorithmic structure and uses a modified form of existing frameworks to address a variety of data issues. The assessment of data mining techniques reveals that missing-value imputation and resampling for class balancing can improve classification performance. The proposed feature selection method, feature selection by projecting onto principal components (FS-PPC), draws on ideas from both feature extraction and feature selection to select a significant subset of features from the data. It selects features that have high correlation with the principal component by applying symmetrical uncertainty (SU), and it removes irrelevant and redundant features by using mutual information (MI). The method provides confidence that the selected subset of features will yield realistic results with less time and effort. FS-PPC retains classification performance and meaningful features while keeping the selected subset free of redundant features. The proposed methods have been applied to the analysis of real clinical data and their effectiveness has been assessed. The results show that the proposed methods are able to minimise the clinical data problems whilst, at the same time, maximising classification performance.
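The two information measures named in the abstract can be illustrated with a small sketch. This is not the thesis's FS-PPC implementation. It only shows the standard definitions of symmetrical uncertainty and mutual information for discrete variables, followed by a hypothetical greedy filter. The function `select_features`, the thresholds `su_min` and `mi_max`, and the toy data are all assumptions made for illustration, and the `reference` sequence stands in for a principal-component projection that is assumed to have been computed and discretised beforehand.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def symmetrical_uncertainty(xs, ys):
    """SU(X,Y) = 2 * I(X;Y) / (H(X) + H(Y)), a value in [0, 1]."""
    denom = entropy(xs) + entropy(ys)
    return 0.0 if denom == 0 else 2.0 * mutual_information(xs, ys) / denom


def select_features(features, reference, su_min=0.1, mi_max=0.5):
    """Hypothetical greedy filter (an assumption, not the thesis's algorithm).

    Rank features by SU with `reference`, drop those below `su_min`,
    and skip any feature whose MI with an already-kept feature exceeds
    `mi_max` bits (treated here as redundant).
    """
    ranked = sorted(
        features,
        key=lambda name: (-symmetrical_uncertainty(features[name], reference), name),
    )
    kept = []
    for name in ranked:
        if symmetrical_uncertainty(features[name], reference) < su_min:
            continue  # not relevant enough to the reference
        if any(mutual_information(features[name], features[k]) > mi_max for k in kept):
            continue  # redundant with a feature that is already kept
        kept.append(name)
    return kept


# Toy data: `reference` plays the role of a discretised component score.
reference = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],       # identical to the reference
    "a_copy": [1, 1, 1, 1, 0, 0, 0, 0],  # inverted copy of "a", so redundant
    "b": [0, 0, 0, 1, 1, 1, 1, 0],       # weakly related to the reference
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],   # independent of the reference
}
selected = select_features(features, reference)
print(selected)
```

With these toy values, `a_copy` is discarded as redundant with `a`, and `noise` is discarded as irrelevant. Real clinical variables are often continuous, so a discretisation step (not shown) would be needed before these discrete estimators apply.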
Citation
Poolsawad, N. (2014). Practical approaches to mining of clinical datasets : from frameworks to novel feature selection. (Thesis). University of Hull. Retrieved from https://hull-repository.worktribe.com/output/4215841
Field | Value |
---|---|
Thesis Type | Thesis |
Deposit Date | Jul 21, 2014 |
Publicly Available Date | Feb 23, 2023 |
Keywords | Computer science |
Public URL | https://hull-repository.worktribe.com/output/4215841 |
Additional Information | Department of Computer Science, The University of Hull |
Award Date | May 1, 2014 |
Files
Thesis (PDF, 6.2 MB)
Copyright Statement
© 2014 Poolsawad, Nongnuch. All rights reserved. No part of this publication may be reproduced without the written permission of the copyright holder.