Identification of RNA-binding residues in proteins plays a significant role in several areas, such as protein function analysis, posttranscriptional regulation and drug design. In this work, a novel method, PRBR (Prediction of RNA Binding Residues), is proposed for identifying RNA-binding residues from protein sequences by combining a hybrid feature with the enriched random forest (ERF) algorithm.
The hybrid feature is composed of secondary structure information and three novel features: evolutionary information combined with conservation information of the physicochemical properties of amino acids, and information about the dependency of amino acids with regard to polarity-charge and hydrophobicity in protein sequences.
The results demonstrate that our model achieves a Matthews correlation coefficient (MCC) of 0.563 and an overall accuracy (ACC) of 88.63%, with 53.70% sensitivity (SE) and 96.97% specificity (SP).
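For readers unfamiliar with these measures, the following is a minimal sketch of how SE, SP, ACC and MCC are conventionally computed from a binary confusion matrix, where a positive is a binding residue. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from this work or its datasets.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return SE, SP, ACC and MCC from binary confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    tn/fp: non-binding residues predicted as non-binding / binding.
    Assumes at least one positive and one negative example.
    """
    se = tp / (tp + fn)                      # sensitivity (recall on binding residues)
    sp = tn / (tn + fp)                      # specificity (recall on non-binding residues)
    acc = (tp + tn) / (tp + fp + tn + fn)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SE": se, "SP": sp, "ACC": acc, "MCC": mcc}


# Hypothetical counts for illustration only (not results from the paper).
example = binary_metrics(tp=50, fp=10, tn=90, fn=50)
print(example)
```

Because binding residues are typically far fewer than non-binding ones, ACC can be dominated by SP, which is why MCC is usually reported alongside it.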
Furthermore, a performance comparison of the individual features indicates that the three novel features contribute most to the improvement in prediction. Compared with other approaches, PRBR clearly shows significantly better performance in predicting RNA-binding residues in proteins.