Lung cancer has one of the highest incidence and mortality rates of any tumor worldwide. It is also the malignant tumor with the fastest-growing number of patients and poses a serious threat to human life. Improving the accuracy of lung cancer diagnosis and treatment, and thereby survival prognosis, is therefore particularly important. Machine learning is an interdisciplinary field that draws on probability theory, statistics, approximation theory and the analysis of complex algorithms. It uses the computer as a tool to simulate human learning, organizes existing content into knowledge structures to improve learning efficiency, and makes it possible to bring computer science and statistics to bear on medical problems. Machine learning algorithms take in input data, use computational analysis to predict output values within an acceptable accuracy range, identify patterns and trends in the data, and ultimately learn from previous experience; the development of this technology opens a new direction for the diagnosis and treatment of lung cancer. This article reviews the performance and application prospects of different types of machine learning algorithms in the clinical diagnosis and survival prognosis analysis of lung cancer.
The sleep electroencephalogram (EEG) is an important index for diagnosing sleep disorders and related diseases. Manual sleep staging is time-consuming and often influenced by subjective factors, while existing automatic sleep staging methods have high complexity and low accuracy. This paper proposes a sleep staging method based on support vector machines (SVM) and feature selection using a single-channel EEG signal. Thirty-eight features were extracted from the single-channel EEG signal. The F-Score feature selection method was then extended from its original definition to the multiclass case, with an added elimination factor, to find suitable features for use as SVM classifier inputs. The elimination factor was adopted to reduce the negative effect of interactions between features on the result. The F-Score with the added elimination factor was then evaluated on data from a standard open-source database, and the results were compared with those obtained with no feature selection and with standard F-Score feature selection. The results showed that the proposed method could effectively improve sleep staging accuracy and reduce computation time.
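As a rough illustration of F-Score ranking in the multiclass setting, the sketch below scores each feature by the scatter of the class means around the overall mean, divided by the sum of the within-class sample variances. This is one common multiclass generalization and is an assumption here; the abstract does not give the authors' exact formula, and the additional factor it describes for reducing feature interaction is not reproduced. The function names, stage labels and toy data are hypothetical.

```python
from statistics import mean, variance


def f_score(column, labels):
    """Multiclass F-score of one feature (assumed generalization):
    between-class scatter of class means / sum of within-class variances."""
    overall = mean(column)
    between = 0.0
    within = 0.0
    for cls in sorted(set(labels)):
        values = [x for x, y in zip(column, labels) if y == cls]
        between += (mean(values) - overall) ** 2
        within += variance(values)  # sample variance, needs >= 2 values per class
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return (feature indices from highest to lowest score, list of scores)."""
    n_features = len(rows[0])
    scores = [f_score([row[j] for row in rows], labels) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (hypothetical): feature 0 separates the three stages, feature 1 does not.
rows = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.1, 5.5], [5.0, 3.5], [5.2, 4.5]]
labels = ["wake", "wake", "nrem", "nrem", "rem", "rem"]
order, scores = rank_features(rows, labels)
print(order, scores)
```

In a pipeline of the kind the abstract describes, the top-ranked columns would then be passed to the SVM classifier; how many to keep is a tuning choice not specified here.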
This study proposes a classification method that applies infomax independent component analysis (ICA) to extract single-modality features separately from structural magnetic resonance imaging (sMRI) and positron emission tomography (PET), and then combines the two feature sets by weighted combination. The method improved the diagnostic accuracy for Alzheimer's disease (AD) and mild cognitive impairment (MCI). For AD versus healthy controls (HC), it achieved a classification accuracy of 93.75%, with a sensitivity of 100% and a specificity of 87.64%. For MCI versus HC, the classification accuracy was 89.35%, with a sensitivity of 81.85% and a specificity of 99.36%. The experimental results showed that the bi-modality method outperformed either single modality in classification accuracy.
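For readers less familiar with the three reported measures, the following minimal sketch shows how accuracy, sensitivity and specificity are conventionally computed from true and predicted labels. The function name and the toy labels are hypothetical and are not the study's data.

```python
def diagnostic_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity (true-positive rate) and specificity
    (true-negative rate) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy labels (hypothetical): every patient is detected, one control is misclassified.
y_true = ["AD"] * 4 + ["HC"] * 4
y_pred = ["AD", "AD", "AD", "AD", "HC", "HC", "HC", "AD"]
metrics = diagnostic_metrics(y_true, y_pred, positive="AD")
print(metrics)
```

A sensitivity of 100% alongside a lower specificity, as in the toy case, means that no patient was missed but some controls were labeled as patients.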
This study aims to investigate objective indicators of mental fatigue in order to improve the accuracy of mental fatigue evaluation. Mental fatigue was induced by a sustained cognitive task. Brain functional networks in two states (normal state and mental fatigue state) were constructed from electroencephalogram (EEG) data. Complex network theory was used to calculate and analyze nodal characteristic parameters (degree, betweenness centrality, clustering coefficient and average path length of each node), which served as the classification features for a support vector machine (SVM). The parameters of the SVM model were optimized by grid search with 6-fold cross-validation, and the subjects were then classified. The results show that the nodal characteristic parameters of brain functional networks can separate the normal state from the mental fatigue state and can therefore be used for the objective evaluation of mental fatigue.
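The nodal parameters named above can be sketched on a small binary network as follows. The sketch assumes the network is obtained by thresholding a symmetric connectivity matrix, which the abstract does not state; the matrix values, the threshold and all function names are hypothetical. Betweenness centrality is omitted for brevity.

```python
from collections import deque


def threshold_network(matrix, threshold):
    """Binary undirected graph: connect i and j when matrix[i][j] > threshold."""
    n = len(matrix)
    return {i: {j for j in range(n) if j != i and matrix[i][j] > threshold}
            for i in range(n)}


def degree(adj, node):
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves connected."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u in neighbours for v in neighbours if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_path_length(adj, node):
    """Mean shortest-path length from `node` to every reachable node (BFS)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others) if others else 0.0


# Hypothetical 4-electrode connectivity matrix.
matrix = [[1.0, 0.8, 0.7, 0.1],
          [0.8, 1.0, 0.6, 0.2],
          [0.7, 0.6, 1.0, 0.9],
          [0.1, 0.2, 0.9, 1.0]]
adj = threshold_network(matrix, threshold=0.5)
features = [(degree(adj, n), clustering_coefficient(adj, n), average_path_length(adj, n))
            for n in sorted(adj)]
print(features)
```

Concatenating such per-node tuples over all electrodes gives one feature vector per recording, which is the kind of input an SVM would take.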
To improve the accuracy of emotion recognition, this paper proposes an electroencephalogram (EEG) feature extraction algorithm based on wavelet packet energy entropy and an auto-regressive (AR) model. An auto-regressive process can approximate the EEG signal closely and provides rich spectral information with few parameters. The wavelet packet entropy reflects the spectral energy distribution of the signal across frequency bands. Combining the two better reflects the energy characteristics of EEG signals. Feature extraction and fusion are implemented with kernel principal component analysis. Six emotional states from DEAP, a public multimodal database for emotion analysis using physiological signals, are recognized. The results show that the recognition accuracy of the proposed algorithm exceeds 90%, with a highest recognition accuracy of 99.33%. This indicates that the algorithm extracts EEG emotion features well and is an effective emotion feature extraction algorithm that can support emotion recognition.
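The idea of a wavelet packet energy entropy can be illustrated with a minimal sketch: decompose the signal into sub-bands, compute the share of energy in each sub-band, and take the Shannon entropy of those shares. The Haar wavelet, the decomposition depth and the function names are assumptions chosen for brevity; the abstract does not specify them, and the AR-model and kernel PCA steps are not shown.

```python
import math


def haar_split(signal):
    """One Haar analysis step: (approximation, detail), each half the input length."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_packet_nodes(signal, levels):
    """Full wavelet packet tree: split every node at every level (2**levels leaves)."""
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [child for node in nodes for child in haar_split(node)]
    return nodes


def energy_entropy(signal, levels):
    """Shannon entropy (natural log) of the relative energies of the leaf nodes.
    The signal length should be divisible by 2**levels."""
    energies = [sum(c * c for c in node) for node in wavelet_packet_nodes(signal, levels)]
    total = sum(energies)
    if total == 0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energies if e > 0)


flat = [1.0] * 8             # all energy falls in a single sub-band
impulse = [1.0] + [0.0] * 7  # energy spread evenly over the four sub-bands
print(energy_entropy(flat, 2), energy_entropy(impulse, 2))
```

A low entropy indicates energy concentrated in a few sub-bands, and a high entropy indicates energy spread across many, which is why the measure summarizes the spectral energy distribution in a single number.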
The purpose of a brain-computer interface (BCI) is to build a bridge between the brain and the computer for people with disabilities, helping them to communicate with the outside world. Electroencephalography (EEG) has a low signal-to-noise ratio (SNR), and traditional EEG feature extraction methods suffer from low classification accuracy, a lack of spatial information and very large numbers of features. To solve these problems, we propose a new method that combines the time, frequency and space domains. In this study, independent component analysis (ICA) and the wavelet transform were used to extract temporal, spectral and spatial features from the original EEG signals, and the extracted features were then classified using a support vector machine (SVM) combined with a genetic algorithm (GA). The proposed method showed better classification performance, reaching a mean accuracy of 96% on the Graz datasets of the BCI Competition 2003. The classification results showed that the proposed three-domain method could effectively overcome the drawbacks of traditional methods based solely on the time-frequency domain when EEG signals are used to characterize the brain's electrical activity.
Electroencephalography (EEG) signals are strongly correlated with human emotions. The importance of nodes in the emotional brain network provides an effective means of analyzing the brain mechanisms of emotion. In this paper, a new node-importance ranking method, the weighted K-order propagation number method, was used to design and implement a classification algorithm for emotional brain networks. First, a cross-sample entropy brain network was constructed from the DEAP emotional EEG data, and the nodes in the positive and negative emotional brain networks were ranked by importance to obtain a feature matrix across multiple threshold scales. Second, feature extraction and a support vector machine (SVM) were used to classify emotion, with a classification accuracy of 83.6%. The results show that the weighted K-order propagation number method is effective for extracting the importance characteristics of brain network nodes for emotion classification, and it provides a new means of feature extraction and analysis for complex networks.
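Cross-sample entropy, the measure used above to link pairs of channels, can be sketched as follows. The code follows the commonly cited definition (the negative logarithm of the ratio between template matches of length m + 1 and of length m across two signals); whether the paper uses exactly this variant is an assumption, as are the defaults m = 2 and r = 0.2 and the expectation that both signals are already normalized. The weighted K-order propagation number ranking itself is not shown.

```python
import math


def _count_matches(u, v, m, r, n_templates):
    """Count pairs (i, j) whose length-m templates differ by at most r
    in every sample (Chebyshev distance)."""
    count = 0
    for i in range(n_templates):
        for j in range(n_templates):
            if max(abs(u[i + k] - v[j + k]) for k in range(m)) <= r:
                count += 1
    return count


def cross_sample_entropy(u, v, m=2, r=0.2):
    """Cross-sample entropy of two equally sampled, normalized signals.
    Lower values mean the two signals are more synchronous."""
    n = min(len(u), len(v))
    n_templates = n - m  # same template count for lengths m and m + 1
    b = _count_matches(u, v, m, r, n_templates)
    a = _count_matches(u, v, m + 1, r, n_templates)
    if a == 0 or b == 0:
        return float("inf")  # no matches: entropy undefined, reported as infinite
    return -math.log(a / b)


# Hypothetical signals: a perfectly periodic pair is maximally predictable.
periodic = [0.0, 1.0] * 10
value = cross_sample_entropy(periodic, periodic)
print(value)
```

Computing this value for every pair of channels yields a matrix that can be thresholded at several levels to form networks, in line with the multi-threshold feature matrix the abstract mentions.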
Autoimmune pancreatitis (AIP) is a unique subtype of chronic pancreatitis that shares many clinical presentations with pancreatic ductal adenocarcinoma (PDA). Misdiagnosis of AIP often leads to unnecessary pancreatic resection. 18F-FDG positron emission tomography/computed tomography (PET/CT) can provide comprehensive information on the morphology, density and functional metabolism of the pancreas at the same time, and it has proved to be a promising modality for noninvasive differentiation between AIP and PDA. However, clinical analysis of PET/CT image texture features is lacking, and differentiating AIP from PDA with commonly used diagnostic methods remains difficult. This paper therefore studied the differentiation of AIP and PDA based on multi-modality texture features. First, multiple feature extraction algorithms were used to extract texture features from the CT and PET images. Then, the Fisher criterion and the sequential forward floating selection (SFFS) algorithm, combined with a support vector machine (SVM), were employed to select the optimal multi-modality feature subset. Finally, the SVM classifier was used to differentiate AIP from PDA. The results show that texture analysis of lesions helps to achieve accurate differentiation of AIP and PDA.
The clinical manifestations of schizophrenia and depression are similar to some extent and also change with the patient's mood, which can lead to misdiagnosis in clinical practice. Electroencephalogram (EEG) analysis provides an important reference and an objective basis for accurately differentiating patients with schizophrenia from patients with depression. To reduce misdiagnosis between the two diseases and improve the accuracy of their classification and diagnosis, we extracted resting-state EEG features from 100 patients with depression and 100 patients with schizophrenia, including information entropy, sample entropy, approximate entropy, statistical features and the relative power spectral density (rPSD) of each EEG rhythm (δ, θ, α, β). Feature vectors were then formed to classify the two types of patients using a support vector machine (SVM) and a naive Bayes (NB) classifier. The experimental results indicate that: ① the rPSD feature vector P performs best in classification, achieving an average accuracy of 84.2% and a highest accuracy of 86.3%; ② the accuracy of SVM is clearly better than that of NB; ③ among the rPSDs of the individual rhythms, the β rhythm performs best, with a highest accuracy of 76%; ④ electrodes with large feature weights are mainly concentrated in the frontal and parietal lobes. These results indicate that the rPSD feature vector P combined with SVM can effectively distinguish depression from schizophrenia and can play an auxiliary role in the relevant clinical diagnosis.
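A minimal sketch of the relative power spectral density of the four rhythms is given below. It uses a plain discrete Fourier transform and normalizes each band's power by the total power across the four bands. The band edges are conventional values assumed for illustration, since the abstract does not list them, and the normalization choice and function names are likewise assumptions.

```python
import cmath
import math

# Band edges in Hz: conventional values, assumed for this sketch.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def power_spectrum(signal, fs):
    """Squared DFT magnitude of the mean-removed signal for 0..fs/2 (naive O(n^2) DFT)."""
    n = len(signal)
    mean_value = sum(signal) / n
    centered = [x - mean_value for x in signal]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def relative_band_power(signal, fs, bands=BANDS):
    """Power in each band divided by the summed power of all listed bands."""
    freqs, power = power_spectrum(signal, fs)
    band_power = {
        name: sum(p for f, p in zip(freqs, power) if low <= f < high)
        for name, (low, high) in bands.items()
    }
    total = sum(band_power.values())
    if total == 0:
        raise ValueError("signal has no power in the selected bands")
    return {name: value / total for name, value in band_power.items()}


# Hypothetical test signal: 2 s of a 10 Hz sine sampled at 64 Hz (alpha range).
fs = 64
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]
rpsd = relative_band_power(signal, fs)
print(rpsd)
```

For real recordings an averaged estimator such as Welch's method would normally replace the single naive DFT; the version here is kept dependency-free only to show the normalization step.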
Because P300 potentials differ between individuals, a large amount of training data must be collected to construct pattern recognition models in a P300-based brain-computer interface system, which may fatigue subjects and degrade system performance. TrAdaBoost is a method that transfers knowledge from a source domain to a target domain, improving learning in the target domain. Our research proposed a TrAdaBoost-based linear discriminant analysis and a TrAdaBoost-based support vector machine to recognize P300 potentials across multiple subjects. The method first trains two kinds of classifiers separately, using a small amount of data from the same subject and a large amount of data from different subjects, and then combines all the classifiers with different weights. Compared with traditional training methods that train directly on only a small amount of data from the same subject or on mixed data from different subjects, our algorithm improved accuracy by 19.56% and 22.25%, respectively, and improved the information transfer rate by 14.69 bits/min and 15.76 bits/min, respectively. The results indicate that the TrAdaBoost-based method has the potential to enhance the generalization ability of brain-computer interfaces across individual differences.
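Information transfer rate in bits/min is most often computed with the Wolpaw formula, which combines the number of selectable targets, the selection accuracy and the time per selection. The sketch below assumes that definition; the abstract does not state which formula was used, and the 36-target speller and 10 s selection time in the example are hypothetical values, not the study's settings.

```python
import math


def bits_per_selection(n_classes, accuracy):
    """Wolpaw bits per selection:
    log2(N) + P*log2(P) + (1 - P)*log2((1 - P)/(N - 1))."""
    if n_classes < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_classes >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_classes)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_min(n_classes, accuracy, seconds_per_selection):
    """Information transfer rate in bits/min."""
    return bits_per_selection(n_classes, accuracy) * 60.0 / seconds_per_selection


# Hypothetical speller: 36 targets, one selection every 10 s.
for acc in (0.6, 0.8, 1.0):
    print(acc, round(itr_bits_per_min(36, acc, 10.0), 2))
```

Because the rate depends on both accuracy and selection time, an accuracy gain obtained with less training data can translate directly into a higher bits/min figure.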