Selection Method for Medical Data set

Feature selection for medical data is the process of choosing, from all the attributes recorded for each patient, the subset that is most useful for a given medical task, so that the data can be stored reliably and retrieved quickly when needed. The selected features can also be used to track the progress of a patient. Most feature selection methods rest on an underlying theory of selection, and one of the most widely used is rough set theory. Feature selection is crucial because it keeps the data needed to solve medical problems while avoiding wasted storage space and computation time, both of which would otherwise restrict how the chosen method can be applied to a medical dataset. Three types of method are used on medical data: filter methods, wrapper methods, and hybrid methods.

Feature Selection Methods on Cardiotocography dataset

Little research addresses feature selection methods on cardiotocography datasets. A cardiotocography dataset consists of measurements of fetal heart rate and uterine contraction characteristics taken from cardiotocograms, which are then classified by an expert obstetrician.

Fetal cardiotocograms are usually processed automatically, and the relevant diagnostic characteristics are then measured. This paper discusses the feature selection methods used on the cardiotocography dataset. Electronic fetal monitoring, the continuous recording of cardiotocograms, usually captures the fetal heart rate together with the tocographic signal.


Electronic fetal monitoring is one of the methods used for intrapartum assessment of fetal health. In this section, feature subsets are analyzed for classifying the outcome of a pregnancy from the CTG recorded during the last 20 minutes before delivery. The feature subsets are derived using principal component analysis (PCA), information gain, and Group of Adaptive Models Evolution (GAME), a neural-network-based feature selection algorithm. According to the research, the best subset combines non-linear and time-domain features. This combination performs consistently across the datasets, with sensitivity and specificity of about 70%, a level comparable to inter-observer variation. The following sections examine individual feature selection methods for medical datasets.
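To make the two reported measures concrete, the following is a minimal sketch of how sensitivity and specificity are computed from true and predicted outcomes. The labels are invented for illustration (1 = pathological outcome, 0 = normal outcome) and are not taken from any CTG dataset.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: share of true positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of true negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical outcomes, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A feature subset would be judged by both numbers together, since a classifier can trivially maximize one at the expense of the other.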

Filter Method for Medical Data

Medical applications are usually characterized by a large number of disease markers and a small number of records. Research has shown that complete feature ranking followed by selection can substantially reduce the dimensionality of the data, with notable improvements in the implementation and performance of classifiers used for medical diagnosis. The approach described here ranks features according to their predictive quality, using properties unique to learning algorithms based on the group method of data handling (GMDH). A network training algorithm is applied repeatedly to select the groups of features that serve as the most favorable predictors.
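The general pattern of "rank every feature, then keep the best ones" can be sketched as follows. This is not the GMDH-based ranking described above: as a stand-in score it assumes the absolute Pearson correlation between each feature and the class label, and the function names and data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_features(rows, labels):
    """Return feature indices ordered from most to least relevant."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Higher score first; ties broken by the lower feature index.
    return [j for _, j in sorted(scores, key=lambda s: (-s[0], s[1]))]


def select_top_k(rows, labels, k):
    """Keep only the k best-ranked features of every record."""
    keep = sorted(rank_features(rows, labels)[:k])
    return keep, [[row[j] for j in keep] for row in rows]


# Hypothetical records: three markers per patient, binary diagnosis.
rows = [
    [1.0, 5.0, 0.5],
    [2.0, 3.0, 0.1],
    [3.0, 6.0, 0.4],
    [4.0, 2.0, 0.3],
]
labels = [0, 0, 1, 1]
print(rank_features(rows, labels))
print(select_top_k(rows, labels, 2)[0])
```

Because the score is computed per feature without training a classifier, this is a filter in the sense used in this section; the scoring function is the part a GMDH-based method would replace.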

Several GMDH methods have been proposed that operate on the entire training dataset, which removes the need for a dedicated selection set. The adaptive learning network approach uses a predicted squared error criterion for both selection and stopping in order to avoid overfitting the model, which also solves the problem of deciding when to stop training a neural network. Applying the criterion minimizes the error expected when the network is used to predict new data.

Consensus feature ranking is a method that targets medical datasets. Several of the methods commonly used for consensus feature ranking do not take into account missing values or unbalanced class distributions. They also neglect the bias of the consensus ranking toward specific classifiers. One method used for consensus ranking is the ‘group method’, which handles feature ranking for medical data. It uses the GMDH learning algorithm to select the best predictive features automatically at different levels of user-specified model complexity, and the receiver operating characteristic (ROC) curve is used to evaluate classifier performance.
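The idea of a consensus ranking can be illustrated with a generic mean-rank aggregation, in which several individual rankings are merged by averaging the position each feature receives. This is a simple sketch under that assumption, not the GMDH-based group method described above, and the feature names are hypothetical.

```python
def consensus_ranking(rankings):
    """Merge several rankings (best feature first) by mean rank position."""
    features = rankings[0]
    totals = {f: 0 for f in features}
    for ranking in rankings:
        if set(ranking) != set(features):
            raise ValueError("every ranking must contain the same features")
        for position, f in enumerate(ranking):
            totals[f] += position
    # Lower mean position is better; ties broken alphabetically.
    return sorted(features, key=lambda f: (totals[f] / len(rankings), f))


# Three hypothetical rankers ordering the same three features.
rankings = [
    ["fhr_baseline", "accelerations", "variability"],
    ["accelerations", "fhr_baseline", "variability"],
    ["accelerations", "variability", "fhr_baseline"],
]
print(consensus_ranking(rankings))
```

Note that this sketch shares the limitations named above: it ignores missing values, class imbalance, and any bias toward a particular classifier.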

The implantation of a cardiac pacemaker is a complex procedure. Its success depends directly on the correct categorization of patients and on the choice of pacing mode. Machine-learning algorithms can be used to support this process, and feature selection is the most crucial component. Research has shown that applying feature selection methods to electrocardiological datasets reduces the initial set of features by about 60%. Because the search space shrinks, the number of generated decision rules decreases by a factor of 6 to 10. As a result, the rules can be justified cardiologically more quickly and easily, the broader rules adapt better to recognizing new cases, and the computational effort is reduced. These results have been confirmed in clinical practice (Echauz & Vachtsevanos, 1994).

For Parkinson’s disease, an evaluation of the available medical data can highlight symptoms that serve as a complementary tool in the early stages of diagnosis. Several studies have evaluated how filter feature selection algorithms, and combinations of algorithms, help determine which features are relevant to this problem. These studies used patient datasets to identify sets of premorbid behavioral traits that can be useful at the earliest stages of diagnosing Parkinsonism.

Data mining is the process of automatically extracting valid, previously unknown, and actionable knowledge or patterns from large databases to support important decisions. Classification analysis, one of the data mining approaches, is adopted in the filter method. It supports medical decisions relating to diagnosis and helps improve the quality of care given to patients. If the training dataset contains irrelevant attributes, however, classification analysis becomes less accurate and may produce results that are hard to understand. Automatic feature selection is one of the commonly used techniques for addressing this (Kondo, Pandya, & Zurada, 1999).

Feature subset selection is crucial in data mining. As data dimensionality increases, training and testing general classification techniques becomes difficult. One filter-based approach classifies the Pima Indians Diabetes Database (PIDD) with a two-phase model. In the first phase, a genetic algorithm and correlation-based feature selection (CFS) are used in a cascaded fashion: the genetic algorithm performs a global search over the attributes, with fitness evaluated by CFS. The second phase is a fine-tuned classification performed by an artificial neural network, which takes the feature subset produced in phase one as its input (Abdel-Aal RE, Mangoud & Abductive, 1996).
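The fitness that CFS assigns to a candidate subset is commonly written as the merit k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation, and r_ff the mean feature-feature correlation. The sketch below computes that merit from correlations supplied by the caller; the numbers are invented, and the genetic search that would call this function repeatedly is left out.

```python
import math


def cfs_merit(feature_class_corrs, feature_feature_corrs):
    """CFS merit of a subset.

    feature_class_corrs: one |correlation with the class| per feature.
    feature_feature_corrs: one |correlation| per pair of features
    (empty when the subset has a single feature).
    """
    k = len(feature_class_corrs)
    r_cf = sum(feature_class_corrs) / k
    r_ff = (
        sum(feature_feature_corrs) / len(feature_feature_corrs)
        if feature_feature_corrs
        else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Hypothetical subsets with the same relevance but different redundancy.
low_redundancy = cfs_merit([0.6, 0.4], [0.2])
high_redundancy = cfs_merit([0.6, 0.4], [0.9])
print(round(low_redundancy, 3), round(high_redundancy, 3))
```

The merit rewards features that correlate with the class and penalizes features that correlate with each other, which is why the second subset scores lower.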

Studies have shown that data mining is useful in many applications, including health care systems. A medical database holds huge quantities of information about patients and their medical histories, and evaluating this information manually is practically impossible. It may nevertheless contain valuable data that could help save lives if analyzed and used properly. Data mining technology is effective in health care applications for identifying patterns and deriving useful information from medical databases, and it is used to filter the large volumes of data they hold. The main feature selection method used for data mining in this case is the combination ranking search technique.

In this case, Bayesian classification models, which are based on Bayesian networks, are used to filter the data for medical purposes. Feature subset selection is useful here because medical databases are heterogeneous and not all of the variables contribute to classification. Filter methods are used in this technique to induce the Bayesian classifiers, which distinguish between two groups of patients suffering from cirrhosis (Rosa Iñaki Inza a, Marisa, Quiroga & Pedro, 2005).

Kernel F-score feature selection (KFFS) is another feature selection method for medical data. It is used as a pre-processing step in the classification of medical datasets. KFFS consists of two stages (Kemal & Güneş, 2009). In the first stage, the input space of the medical dataset is transformed into a kernel space; in the second, F-score values are computed for the features in that space. Feature subset selection is of crucial significance in data mining: the higher the dimensionality of the data, the harder it is to train and test general classification techniques. Research has shown that subsets selected by the CFS filter improve the classification accuracy of both radial basis function networks and backpropagation neural networks, compared with feature subsets selected by the information gain filter (Han & Kamber, 2001).
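The F-score at the heart of this method measures how well a single feature separates two classes: the squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances. The sketch below computes the plain F-score on raw feature values and assumes a binary label; the kernel transformation that precedes it in KFFS is omitted, and the data are invented.

```python
from statistics import mean, variance


def f_score(values, labels, positive=1):
    """F-score of one feature for a two-class problem.

    Each class needs at least two samples, because the sample
    variance is undefined for a single value.
    """
    pos = [v for v, y in zip(values, labels) if y == positive]
    neg = [v for v, y in zip(values, labels) if y != positive]
    overall = mean(values)
    # Separation between each class mean and the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Spread of the feature inside each class.
    within = variance(pos) + variance(neg)
    return between / within if within else float("inf")


labels = [0, 0, 0, 1, 1, 1]
separating = f_score([1.0, 2.0, 3.0, 5.0, 6.0, 7.0], labels)
uninformative = f_score([1.0, 2.0, 3.0, 1.0, 2.0, 3.0], labels)
print(separating, uninformative)
```

Features are then ranked by this value and those below a chosen threshold are discarded; the threshold itself is a design choice not fixed by the formula.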


Wrapper Method for Medical Data

Parkinson’s disease is used in this research analysis. It is a severe, chronic, and progressive disorder whose symptoms worsen over time. Over a million people in the USA are currently living with Parkinson’s disease. Its causes remain unknown to medical experts and researchers. There is currently no cure, although several treatments exist, the most common being surgery and daily medication. The disease affects nerve cells in the brain and causes them to malfunction. In advanced cases, it causes the death of certain brain cells that produce dopamine, a chemical that carries the messages needed for coordination between the brain and other parts of the body. Parkinson’s disease has four main characteristics that can be identified in a patient.

These are postural instability, tremor with rigidity, bradykinesia, and brain disorders. Tremor involves continuous trembling and shaking of body parts such as the hands and legs, or even the whole body. Bradykinesia is slowness of movement, with the patient tiring quickly even over a short walk. Other, minor symptoms identified among patients include memory loss, weight loss, dizziness, and in some cases mental disturbance. Because the disease exhibits many features, feature selection is used to attach weights to them, using various methods including search algorithms. In the wrapper approach, the feature selection algorithm searches over feature subsets and scores each one with an evaluation function. The evaluation function is run on the dataset, which is split internally into training and test sets restricted to the candidate features. The feature subset with the highest evaluation value is selected and kept for further analysis of the disease by the induction algorithm.
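The wrapper procedure described above can be sketched as a greedy forward search in which each candidate subset is scored by training and testing a classifier on it. The sketch makes several assumptions that are not in the text: the induction algorithm is a simple nearest-centroid classifier, the search is forward selection, the score is accuracy on a single internal test split, and the records are invented rather than drawn from a Parkinson’s dataset.

```python
def nearest_centroid_accuracy(train, test, features):
    """Train a nearest-centroid classifier on `features`; return test accuracy."""
    centroids = {}
    for label in sorted({y for _, y in train}):
        members = [x for x, y in train if y == label]
        centroids[label] = [
            sum(x[j] for x in members) / len(members) for j in features
        ]
    correct = 0
    for x, y in test:
        point = [x[j] for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == y
    return correct / len(test)


def forward_selection(train, test, n_features):
    """Add one feature at a time while the evaluation score keeps improving."""
    selected, best_score = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [
            (nearest_centroid_accuracy(train, test, selected + [j]), j)
            for j in remaining
        ]
        # Best score wins; ties go to the lower feature index.
        score, j = max(scored, key=lambda s: (s[0], -s[1]))
        if score <= best_score:
            break
        selected.append(j)
        remaining.remove(j)
        best_score = score
    return selected, best_score


# Hypothetical records: feature 0 separates the classes, feature 1 is noise.
train = [([1.0, 5.0], 0), ([2.0, 1.0], 0), ([8.0, 4.0], 1), ([9.0, 3.0], 1)]
test = [([1.5, 3.0], 0), ([8.5, 3.1], 1), ([2.5, 2.0], 0), ([7.0, 4.0], 1)]
print(forward_selection(train, test, 2))
```

Unlike a filter, every evaluation here retrains the classifier, which is why wrapper methods tend to cost more computation as the number of features grows.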