• 1. School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000, P. R. China
• 2. School of Information Science and Engineering, Ningbo University, Ningbo, Zhejiang 315211, P. R. China
• 3. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, P. R. China
WEI Zhisen, Email: zs.wei@foxmail.com

Protein lysine β-hydroxybutyrylation (Kbhb) is a recently discovered post-translational modification associated with a wide range of biological processes. Identifying Kbhb sites is critical to understanding its mechanism of action. However, biochemical experimental methods for detecting Kbhb sites are costly and time-consuming. Therefore, a feature embedding learning method based on the Transformer encoder was proposed to predict Kbhb sites. In this method, each amino acid residue was mapped to a numerical vector according to its amino acid class and its position through learnable feature embedding; the Transformer encoder was then used to extract discriminative features, and a bidirectional long short-term memory network (BiLSTM) was used to capture the correlations between different features. A benchmark dataset was constructed, and a Kbhb site predictor, AutoTF-Kbhb, was implemented based on the proposed method. Experimental results showed that the proposed feature embedding learning method could extract effective features. AutoTF-Kbhb achieved an area under the curve (AUC) of 0.87 and a Matthews correlation coefficient (MCC) of 0.37 on the independent test set, significantly outperforming the other methods compared. Therefore, AutoTF-Kbhb can serve as an auxiliary tool for identifying Kbhb sites.
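The two evaluation metrics reported above can be illustrated with a minimal sketch. It assumes that AUC refers to the area under the ROC curve and that binary predictions for MCC are obtained with a 0.5 score threshold; neither detail is stated in the abstract. The function names and the toy labels and scores are hypothetical and are not taken from the AutoTF-Kbhb benchmark.

```python
import math


def matthews_corrcoef(labels, predictions):
    """MCC for binary labels and predictions given as sequences of 0/1."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 modified sites among 8 candidate lysines.
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05, 0.5]
# Assumed decision threshold of 0.5 for turning scores into predictions.
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_mcc = matthews_corrcoef(toy_labels, toy_predictions)
```

AUC is computed from the ranking of scores and needs no threshold, whereas MCC is computed from all four cells of the confusion matrix at a single threshold, so the two values are not directly comparable in magnitude.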

Copyright © the editorial department of Journal of Biomedical Engineering of West China Medical Publisher. All rights reserved