Cancer gene expression data are high-dimensional and have small sample sizes, so dimensionality reduction is necessary. Traditional linear dimensionality reduction approaches cannot capture the nonlinear relationships between data points and therefore give poor results. In this study, we introduce a multiple-weights locally linear embedding (LLE) algorithm with an improved distance to perform the dimensionality reduction. The algorithm uses the improved distance to find the neighbors of each data point, introduces multiple sets of linearly independent local weight vectors for each neighbor, and obtains the low-dimensional embedding of the high-dimensional data by minimizing the reconstruction error. Experimental results showed that the multiple-weights LLE algorithm with improved distance performed well in reducing the dimensionality of cancer gene expression data.
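As a rough illustration of the baseline that this method extends, the sketch below implements standard LLE only: one weight vector per point and plain Euclidean neighbors. It does not implement the multiple-weights variant or the improved distance, whose details are not given here. The function name `lle`, the parameters `n_neighbors`, `n_components` and `reg`, and the random input matrix are assumptions made for illustration.

```python
import numpy as np


def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Standard LLE sketch (single weight vector per point, Euclidean neighbors)."""
    n = X.shape[0]

    # Step 1: k nearest neighbors from pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor
    neighbors = np.argsort(D, axis=1)[:, :n_neighbors]

    # Step 2: reconstruction weights that sum to one for each point.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]  # neighbors centered on x_i
        G = Z @ Z.T                 # local Gram matrix, k x k
        # Regularize: G is singular when k exceeds the local dimensionality.
        G += (reg * np.trace(G) + 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Step 3: embedding from the bottom eigenvectors of (I - W)^T (I - W),
    # skipping the first (constant) eigenvector.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, 1:n_components + 1], W


# Hypothetical data shaped like expression data: few samples, many features.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 200))
Y, W = lle(X, n_neighbors=5, n_components=2)
```

The three steps mirror the description above: neighbor search, local weights, and minimization of the reconstruction error. A variant with a different distance would change only step 1.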
In the early classification of Alzheimer’s disease (AD), conventional linear feature extraction algorithms have difficulty extracting the most discriminative information from high-dimensional features, which hampers the classification of unlabeled samples. To reduce redundant features and improve recognition accuracy, this paper used the supervised locally linear embedding (SLLE) algorithm to transform multivariate data on regional brain volume and cortical thickness into a locally linear space with fewer dimensions. Data from 412 individuals were collected from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), including stable mild cognitive impairment (sMCI, n = 93), amnestic mild cognitive impairment (aMCI, n = 96), AD (n = 86) and cognitively normal controls (CN, n = 137). The SLLE algorithm used in this paper finds the nearest neighbors of each sample point using a distance with an added correction term, obtains the locally linear reconstruction weight matrix from those neighbors, and then computes the low-dimensional mapping of the high-dimensional data. To verify the validity of SLLE for classification, four feature extraction algorithms, namely principal component analysis (PCA), Neighborhood MinMax Projection (NMMP), locally linear embedding (LLE) and SLLE, were each combined with a support vector machine (SVM) classifier to obtain the classification accuracy for CN vs. sMCI, CN vs. aMCI, CN vs. AD, sMCI vs. aMCI, sMCI vs. AD, and aMCI vs. AD. Experimental results showed that our method improved the classification of sMCI and aMCI (accuracy/sensitivity/specificity: 65.16%/63.33%/67.62%) compared with the combination of LLE and SVM (accuracy/sensitivity/specificity: 64.08%/66.14%/62.77%) and with SVM alone (accuracy/sensitivity/specificity: 57.25%/56.28%/58.08%). 
In detail, the accuracy of the combination of SLLE and SVM is 1.08 percentage points higher than that of the combination of LLE and SVM, and 7.91 percentage points higher than that of SVM alone. Thus, the combination of SLLE and SVM is more effective in the early diagnosis of Alzheimer’s disease.
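The abstract does not give the exact form of the distance correction term. The sketch below assumes one commonly used SLLE formulation, in which the distance between samples of different classes is increased by a fraction `alpha` of the largest pairwise distance; it may differ from the correction actually used in this paper. The names `supervised_distances`, `nearest_neighbors` and `alpha`, and the toy data, are hypothetical.

```python
import numpy as np


def supervised_distances(X, y, alpha=0.5):
    """Euclidean distances plus a class-dependent correction (assumed form).

    D'[i, j] = D[i, j] + alpha * max(D) * (1 if y[i] != y[j] else 0)
    alpha = 0 gives plain (unsupervised) distances; alpha = 1 is fully supervised.
    """
    sq = np.sum(X ** 2, axis=1)
    sq_dists = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    D = np.sqrt(np.maximum(sq_dists, 0.0))  # clip tiny negative rounding errors
    different_class = (y[:, None] != y[None, :]).astype(float)
    return D + alpha * D.max() * different_class


def nearest_neighbors(D, k):
    """Indices of the k nearest neighbors of each sample, excluding itself."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)
    return np.argsort(D, axis=1)[:, :k]


# Toy two-class data standing in for volume and thickness features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
y = np.array([0] * 10 + [1] * 10)
neighbors = nearest_neighbors(supervised_distances(X, y, alpha=1.0), k=3)
```

With `alpha = 1.0`, every between-class distance is at least as large as the largest within-class distance, so neighbors are drawn from the same class whenever a class has more than `k` members. The reconstruction weights and the embedding are then computed as in standard LLE. Class labels are available only for training samples, so the correction applies only when building the training embedding.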