Sleep stage scoring is an active research topic in medicine and neuroscience. Visual inspection of sleep recordings is laborious, and the results may vary between clinicians. Automatic sleep stage classification algorithms can reduce the manual workload. However, they still have limitations when faced with complicated and variable clinical cases. The purpose of this paper is to develop an automatic sleep staging algorithm based on the characteristics of actual sleep data. In the proposed improved K-means clustering algorithm, the initial centers were selected using a density concept to avoid the randomness of the original K-means algorithm. Meanwhile, the cluster centers were updated according to the 'Three-Sigma Rule' during iteration to reduce the influence of outliers. The proposed method was tested and analyzed on overnight sleep data from healthy persons and from patients with sleep disorders after continuous positive airway pressure (CPAP) treatment. The automatic sleep stage classification results were compared with visual inspection by qualified clinicians, and the average accuracy reached 76%. Together with an analysis of the morphological diversity of sleep data, the results showed that the proposed improved K-means algorithm was feasible and valid for clinical practice.
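The two modifications to K-means described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact density measure or update formula, so the following are assumptions made for illustration only: density is the number of neighbours within a hypothetical `radius`, and the three-sigma rule is applied to the distances of cluster members from the current center. The function names are likewise hypothetical.

```python
import math
import statistics

def density_init(points, k, radius):
    """Pick up to k initial centers: densest points that lie more than radius apart (assumed rule)."""
    density = [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    centers = [points[order[0]]]
    for i in order[1:]:
        if len(centers) == k:
            break
        if all(math.dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
    return centers  # may hold fewer than k centers if radius is too large

def three_sigma_center(members, center):
    """Mean of the members whose distance to the center is within mean + 3 * std (assumed rule)."""
    d = [math.dist(p, center) for p in members]
    limit = statistics.fmean(d) + 3 * statistics.pstdev(d)
    kept = [p for p, di in zip(members, d) if di <= limit] or members
    return tuple(statistics.fmean(axis) for axis in zip(*kept))

def robust_kmeans(points, k, radius, max_iter=50):
    centers = density_init(points, k, radius)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Empty clusters keep their previous center.
        new = [three_sigma_center(m, c) if m else c for m, c in zip(clusters, centers)]
        if new == centers:
            break
        centers = new
    return centers
```

With small clusters the three-sigma limit rarely excludes anything; it only takes effect once a cluster has enough members for a single point to lie more than three standard deviations from the mean distance.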
Objective To predict the total hospitalization expenses of bronchopneumonia inpatients in a tertiary hospital of Sichuan Province through BP neural network and support vector machine models, and to analyze the influencing factors. Methods The home page information of 749 cases of bronchopneumonia discharged from a tertiary hospital of Sichuan Province in 2017 was collected and compiled. The BP neural network model and the support vector machine model were simulated with SPSS 20.0 and Clementine software, respectively, to predict the total hospitalization expenses and analyze the influencing factors. Results The accuracy rate of the BP neural network model in predicting the total hospitalization expenses was 81.2%, and the top three influencing factors and their importance values were length of hospital stay (0.477), age (0.154), and discharge department (0.083). The accuracy rate of the support vector machine model in predicting the total hospitalization expenses was 93.4%, and the top three influencing factors and their importance values were length of hospital stay (0.215), age (0.196), and marital status (0.172); however, after stratified analysis by the Mantel-Haenszel method, the correlation between marital status and total hospitalization expenses was not statistically significant (χ2=0.137, P=0.711). Conclusions The BP neural network model and the support vector machine model can be applied to predicting the total hospitalization expenses and analyzing the influencing factors of patients with bronchopneumonia. In this study, the prediction effect of the support vector machine model is better than that of the BP neural network model. Length of hospital stay is an important influencing factor of the total hospitalization expenses of bronchopneumonia patients, so shortening the length of hospital stay can significantly lighten the economic burden of these patients.
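As a worked check of the reported significance test, the P value can be recomputed from the chi-square statistic. This is a minimal sketch that assumes the Mantel-Haenszel statistic is referred to a chi-square distribution with 1 degree of freedom; the function name is hypothetical.

```python
import math

def chi2_p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # For 1 df, P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

p = chi2_p_value_1df(0.137)  # statistic reported for marital status
```

Under that assumption the result rounds to the P=0.711 given in the abstract.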
Magnetic resonance (MR) images can be used to detect lesions in the brains of patients with multiple sclerosis (MS). This paper presents an automatic method for segmentation of MS lesions using multispectral MR images. Firstly, a Pd-w image is subtracted from its corresponding T1-w image to obtain an image in which the cerebral spinal fluid (CSF) is enhanced. Secondly, based on the kernel fuzzy c-means clustering (KFCM) algorithm, the enhanced image and the corresponding T2-w image are segmented separately to extract the CSF region and the combined CSF-MS lesions region. A raw MS lesions image is obtained by subtracting the CSF region from the CSF-MS region. Thirdly, the MS lesions are detected by applying a median filter and thresholding to the raw image. The method was tested on BrainWeb images and evaluated with the Dice similarity coefficient (DSC), sensitivity (Sens), specificity (Spec) and accuracy (Acc). The testing results were satisfactory.
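The four evaluation measures named in this abstract have standard definitions in terms of true and false positives and negatives. The sketch below computes them for two binary masks given as flat sequences of 0 and 1; the function name and the flat-list representation are illustrative assumptions, and no guard is included for empty denominators.

```python
def overlap_metrics(pred, truth):
    """Return (DSC, sensitivity, specificity, accuracy) for two equal-length binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    dsc = 2 * tp / (2 * tp + fp + fn)       # Dice similarity coefficient
    sens = tp / (tp + fn)                   # fraction of lesion voxels found
    spec = tn / (tn + fp)                   # fraction of non-lesion voxels kept out
    acc = (tp + tn) / (tp + tn + fp + fn)   # overall agreement
    return dsc, sens, spec, acc
```

Because lesions occupy a small fraction of the brain volume, specificity and accuracy tend to be dominated by true negatives, which is why DSC is usually reported alongside them.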
Accurate segmentation of pulmonary nodules is an important basis for doctors to diagnose lung cancer. To address incorrect segmentation of pulmonary nodules, especially the difficulty of separating adhesive pulmonary nodules connected with the chest wall or blood vessels, this paper proposes an improved random walk method to segment difficult pulmonary nodules accurately. The innovation of this paper is to introduce geodesic distance to redefine the weights in the random walk, combining the coordinates of the nodes and seed points in the image with the spatial distance. The improved algorithm is used to achieve accurate segmentation of pulmonary nodules. The computed tomography (CT) images of 17 patients with different types of pulmonary nodules were selected for segmentation experiments. The experimental results are compared with those of the traditional random walk method and of several methods from the literature. Experiments show that the proposed method has good accuracy in the segmentation of pulmonary nodules: the accuracy can exceed 88% with a segmentation time of less than 4 seconds. The results could be used to assist doctors in the diagnosis of benign and malignant pulmonary nodules and improve clinical efficiency.
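The abstract does not state how geodesic distance enters the random walk weights, so the sketch below only illustrates one common way of computing a geodesic distance from a seed point on an image grid: a shortest-path search (Dijkstra) in which each step costs one unit of spatial distance plus a penalty for the intensity change. The function name, the 4-neighbourhood and the `gamma` parameter are assumptions for illustration, not the paper's definition.

```python
import heapq

def geodesic_distance(image, seed, gamma=1.0):
    """Shortest-path distance from seed to every pixel of a 2-D list of intensities."""
    rows, cols = len(image), len(image[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    dist[seed[0]][seed[1]] = 0.0
    heap = [(0.0, seed)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # One spatial step plus a penalty for crossing an intensity edge.
                step = 1.0 + gamma * abs(image[nr][nc] - image[r][c])
                if d + step < dist[nr][nc]:
                    dist[nr][nc] = d + step
                    heapq.heappush(heap, (d + step, (nr, nc)))
    return dist
```

A pixel across a strong intensity edge ends up far from the seed in this metric even when it is spatially adjacent, which is the property that makes geodesic distance attractive for nodules attached to the chest wall or vessels.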
The traditional method of clustering multi-parameter flow data in flow cytometry mainly relies on professional software in which gates are set manually to circle out the target cells for analysis. The analysis process is complex and requires professional expertise. To address this, the paper proposes a clustering algorithm for multi-parameter flow data based on the t-distributed stochastic neighbor embedding (t-SNE) algorithm. In this algorithm, the Euclidean distance between sample data in high-dimensional space is transformed into a conditional probability that represents similarity, and the data are reduced to a low-dimensional space. In this paper, stained human peripheral blood cells were measured by flow cytometry, and the processed data were exported as experimental sample data. The t-SNE algorithm is compared with the kernel principal component analysis (KPCA) dimensionality reduction algorithm, and the principal component data obtained by the dimensionality reduction are classified using the K-means algorithm. The results show that the t-SNE algorithm has a good clustering effect on cell populations with asymmetric and trailing distributions, and the clustering accuracy can reach 92.55%, which may be helpful for automatic analysis of multi-color multi-parameter flow data.
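The conversion of Euclidean distances into conditional probabilities mentioned above follows the standard t-SNE definition, p(j|i) = exp(-d_ij^2 / 2σ^2) normalised over all j ≠ i. The sketch below shows only this step and makes a simplifying assumption: a single fixed `sigma` is used for every point, whereas t-SNE normally tunes a separate bandwidth per point to match a chosen perplexity. The function name is hypothetical.

```python
import math

def conditional_probabilities(points, sigma=1.0):
    """Row i holds p(j|i): the similarity of point j to point i under a Gaussian kernel."""
    n = len(points)
    table = []
    for i in range(n):
        # Gaussian affinity to every other point; a point is not its own neighbour.
        affinity = [
            0.0 if j == i
            else math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(affinity)
        table.append([a / total for a in affinity])
    return table
```

Each row sums to 1, and nearby points receive most of the probability mass, so the low-dimensional embedding is driven mainly by local neighbourhood structure.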
Objective To classify the nursing needs of patients undergoing ophthalmic day surgery, to understand the characteristics and needs of different patient groups, and propose specific nursing strategies to further improve the nursing quality of the ophthalmic day wards. Methods A retrospective review was conducted on all archived electronic medical records of patients in the Ophthalmology Day Ward of Beijing Tongren Hospital affiliated to the Capital Medical University from January to September 2023. Statistical description and cluster analysis were used to analyze and cluster all data. Results A total of 52049 patients were included, with an average age of (57.11±19.61) years. The number of nursing items required was 0 for 3104 patients (5.96%), 1 for 9158 patients (17.59%), 2 for 25428 patients (48.85%), 3 for 8812 patients (16.93%), 4 for 5442 patients (10.46%), and 5-11 for 105 patients (0.20%). The number of patients’ comorbidities was 0 for 38653 patients (74.26%), 1 for 10896 patients (20.93%), 2 for 2449 patients (4.71%), and 3-11 for 51 patients (0.10%). Using the number of comorbidities, total required nursing care items, and age as clustering variables, the 52049 patients were divided into 3 groups: low nursing demand group with 11817 patients (22.70%), medium nursing demand group with 24466 patients (47.01%), and high nursing demand group with 15766 patients (30.29%). The results showed that both patient age and the number of comorbidities were closely related to the number of nursing care items needed. Conclusion Classifying and analyzing the nursing needs of patients undergoing ophthalmic day surgery can help understand the needs of different categories of patients, improve nursing strategies specifically, provide support for further improving the accuracy and quality of ophthalmic day care services, and provide reference for clinical nursing work.
The diagnosis of pancreatic cancer is very important. The main method of diagnosis is based on pathological analysis of microscopic image of Pap smear slide. The accurate segmentation and classification of images are two important phases of the analysis. In this paper, we proposed a new automatic segmentation and classification method for microscopic images of pancreas. For the segmentation phase, firstly multi-features Mean-shift clustering algorithm (MFMS) was applied to localize regions of nuclei. Then, chain splitting model (CSM) containing flexible mathematical morphology and curvature scale space corner detection method was applied to split overlapped cells for better accuracy and robustness. For classification phase, 4 shape-based features and 138 textural features based on color spaces of cell nuclei were extracted. In order to achieve optimal feature set and classify different cells, chain-like agent genetic algorithm (CAGA) combined with support vector machine (SVM) was proposed. The proposed method was tested on 15 cytology images containing 461 cell nuclei. Experimental results showed that the proposed method could automatically segment and classify different types of microscopic images of pancreatic cell and had effective segmentation and classification results. The mean accuracy of segmentation is 93.46%±7.24%. The classification performance of normal and malignant cells can achieve 96.55%±0.99% for accuracy, 96.10%±3.08% for sensitivity and 96.80%±1.48% for specificity.
Deoxyribonucleic acid (DNA) damage simulations with an atom-level geometric model use a traversal algorithm, which is time-consuming, converges slowly and requires high-performance computers. Therefore, this work presents a density-based spatial clustering of applications with noise (DBSCAN) clustering algorithm based on the spatial distributions of energy depositions and hydroxyl radicals (·OH). Combined with probability and statistics, the algorithm can quickly obtain the DNA strand break yields and help to study the variation pattern of clustered DNA damage. Firstly, we simulated the transport of protons and secondary particles through the nucleus, as well as the ionization and excitation of water molecules, using Geant4-DNA, a Monte Carlo simulation toolkit for radiobiology, and obtained the distributions of energy depositions and hydroxyl radicals. Then we used damage probability functions to get the spatial distribution dataset of DNA damage points in a simplified geometric model. The DBSCAN clustering algorithm, based on the density of damage points, was used to determine the single-strand break (SSB) yield and double-strand break (DSB) yield. Finally, we analyzed the variation trend of the DNA strand break yield with particle linear energy transfer (LET) and summarized the variation pattern of damage clusters. The simulation results show that the new algorithm is faster than the traversal algorithm and retains good precision. The results are consistent with other experiments and simulations. This work obtains more precise information on clustered DNA damage induced by proton radiation at the molecular level at high speed, and thus provides an essential and powerful research method for studying the mechanisms of radiation-induced biological damage.
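The clustering step can be sketched as a plain DBSCAN over damage-point coordinates followed by a per-cluster classification. The classification rule used here is an assumption, since the abstract does not state one: a cluster containing damage on both strands is counted as one DSB, while any other cluster and every isolated point is counted as one SSB. The function names, the strand labels and the choice of `eps` are hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster index, or -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbours(j)
            if len(reach) >= min_pts:  # core point: keep growing the cluster
                queue.extend(reach)
        cluster += 1
    return labels

def count_breaks(points, strands, eps, min_pts=2):
    """Count (SSB, DSB) under the assumed rule described in the text above."""
    labels = dbscan(points, eps, min_pts)
    ssb = sum(1 for lab in labels if lab == -1)
    dsb = 0
    for c in set(labels) - {-1}:
        hit = {strands[i] for i, lab in enumerate(labels) if lab == c}
        if len(hit) == 2:
            dsb += 1
        else:
            ssb += 1
    return ssb, dsb
```

For example, with coordinates in nanometres, `eps` could be set to roughly the length of ten base pairs so that damage points on opposite strands within that separation fall into the same cluster.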
Under the minimum free energy model, it is important to predict RNA secondary structure accurately and efficiently from the suboptimal foldings. Using clustering techniques to analyze the suboptimal structures can effectively improve the prediction accuracy. In this paper, an improved k-medoids clustering method is proposed to raise this accuracy, using the RBP score and an incremental candidate set of medoids matrix. The algorithm optimizes the initial medoids by gradually expanding the medoid candidate set. The predicted results indicated that this algorithm could obtain a higher value of CH and significantly shorten the time needed to cluster RNA folding structures.
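For orientation, the baseline that the proposed method improves on can be sketched as a basic k-medoids loop over dot-bracket structures. This sketch does not reproduce the paper's method: the plain base-pair distance stands in for the RBP score, and the first k structures serve as initial medoids in place of the incremental candidate set. All function names are hypothetical.

```python
def base_pairs(dot_bracket):
    """Set of (i, j) index pairs encoded by a dot-bracket string without pseudoknots."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs

def bp_distance(a, b):
    """Number of base pairs present in exactly one of the two structures."""
    return len(base_pairs(a) ^ base_pairs(b))

def k_medoids(items, k, dist, max_iter=100):
    """Alternate assignment and medoid update; returns (medoid indices, label per item)."""
    medoids = list(range(k))  # naive initialisation: the first k items

    def assign(meds):
        return [min(meds, key=lambda m: dist(items[i], items[m])) for i in range(len(items))]

    for _ in range(max_iter):
        labels = assign(medoids)
        new = []
        for m in medoids:
            members = [i for i, lab in enumerate(labels) if lab == m]
            # New medoid: the member with the smallest total distance to its cluster.
            best = min(members, key=lambda c: sum(dist(items[c], items[j]) for j in members)) if members else m
            new.append(best)
        if new == medoids:
            break
        medoids = new
    return medoids, assign(medoids)
```

The result of this basic loop depends on the starting medoids, which is the weakness that an optimized initialisation such as the one described above is meant to address.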
At present, the incidence of Parkinson’s disease (PD) is gradually increasing. This seriously affects the quality of life of patients, and the burden of diagnosis and treatment is increasing. However, early intervention is difficult because early monitoring methods are limited. Aiming to find an effective biomarker of PD, this work extracted the correlation between each pair of electroencephalogram (EEG) channels for each frequency band using weighted symbolic mutual information and k-means clustering. The results showed that State1 of the Beta frequency band (P = 0.034) and State5 of the Gamma frequency band (P = 0.010) could be used to differentiate healthy controls from off-medication Parkinson’s disease patients. These findings indicated that there were significant differences in the resting channel-wise correlation states between PD patients and healthy subjects. However, no significant differences were found between PD-on and PD-off patients, or between PD-on patients and healthy controls. This may provide a clinical diagnosis reference for Parkinson’s disease.
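The channel-pair measure can be illustrated with a simplified sketch. It computes plain symbolic mutual information between two signals after converting each into ordinal patterns; the weighting that distinguishes weighted symbolic mutual information is deliberately omitted, and the pattern order, the lag `tau` and the function names are assumptions rather than the settings used in the study.

```python
import math
from collections import Counter

def ordinal_symbols(signal, order=3, tau=1):
    """Map each window of `order` samples spaced `tau` apart to its rank pattern."""
    span = (order - 1) * tau
    return [
        tuple(sorted(range(order), key=lambda k: signal[i + k * tau]))
        for i in range(len(signal) - span)
    ]

def symbolic_mutual_information(x, y, order=3, tau=1):
    """Mutual information (in nats) between the ordinal symbol sequences of x and y."""
    sx, sy = ordinal_symbols(x, order, tau), ordinal_symbols(y, order, tau)
    n = min(len(sx), len(sy))
    sx, sy = sx[:n], sy[:n]
    px, py, pxy = Counter(sx), Counter(sy), Counter(zip(sx, sy))
    # sum over joint symbols of p(a, b) * log( p(a, b) / (p(a) * p(b)) )
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )
```

Computing this value for every channel pair in a frequency band gives a connectivity matrix per time window, and such matrices are the kind of input that k-means could then group into recurring states.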