EFFICIENT PREDICTION OF DNA-BINDING

DNA-binding proteins are a class of proteins that have a specific or general affinity for DNA and
include three important groups: transcription factors, nucleases, and histones. DNA-binding
proteins also play important roles in many types of cellular activity. In this paper we describe
machine learning systems for the prediction of DNA-binding proteins, in which a Support Vector Machine
(SVM) and a Cascade Correlation Neural Network (CCNN) are optimized and then compared to determine
which learning algorithm achieves the best prediction performance. The information used for
classification is derived from characteristics that include overall charge, patch size and amino acid
composition. In total, 121 DNA-binding proteins and 238 non-binding proteins are used to build and
evaluate the system. For the SVM using the ANOVA kernel with jack-knife evaluation, an accuracy of
86.7% was achieved, with 91.1% sensitivity and 85.3% specificity. For the CCNN optimized over the
entire dataset with jack-knife evaluation, we report an accuracy of 75.4%, with specificity and
sensitivity of 72.3% and 82.6%, respectively.
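
As an illustration of the evaluation protocol named above, the following minimal sketch shows how jack-knife (leave-one-out) evaluation and the three reported measures can be computed. It is not the implementation used in this work. The nearest-centroid classifier is a simple stand-in for the SVM and CCNN, the function names and the toy one-dimensional data are hypothetical, and the label 1 is assumed to denote a DNA-binding protein. On the dataset described here, the loop would run 359 times (121 + 238 proteins), each time training on the other 358.

```python
def jackknife_predictions(X, y, fit, predict):
    """Leave-one-out: hold out each sample once and train on all the others."""
    predictions = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        model = fit(train_X, train_y)
        predictions.append(predict(model, X[i]))
    return predictions


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity (assumes both classes occur in y_true)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of binding proteins recovered
        "specificity": tn / (tn + fp),  # fraction of non-binding proteins rejected
    }


def fit_centroids(X, y):
    """Stand-in classifier (assumption): store the mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Assign x to the class whose centroid is closest in squared distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical toy data: one feature per sample, 1 = binding, 0 = non-binding.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = [0, 0, 0, 1, 1, 1]

preds = jackknife_predictions(X, y, fit_centroids, predict_centroid)
scores = binary_metrics(y, preds)
print(scores)
```

Passing the training and prediction routines as arguments keeps the jack-knife loop independent of the learner, so the same loop could in principle wrap an SVM or a CCNN in place of the stand-in.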
