Prediction of donor splice sites using random forest with a new sequence encoding approach

Meher, Prabina Kumar; Sahu, Tanmaya Kumar; Rao, Atmakuri Ramakrishna

doi:10.1186/s13040-016-0086-4

BioData Mining

Table 2 Comparison of the performance of RF, SVM and ANN under all encoding procedures with both balanced and imbalanced training datasets

EP	MLA	Balanced Dataset							Imbalanced Dataset
EP	MLA	TPR	TNR	F (α = 1)	F (β = 2)	G-mean	WA	MCC	TPR	TNR	F (α = 1)	F (β = 2)	G-mean	WA	MCC
P-1	RF	0.954	0.924	0.940	0.932	0.939	0.939	0.878	0.842	0.896	0.865	0.880	0.869	0.869	0.739
	RF	(0.014)	(0.014)	(0.010)	(0.012)	(0.010)	(0.010)	(0.020)	(0.064)	(0.018)	(0.032)	(0.049)	(0.030)	(0.028)	(0.043)
	SVM	0.935	0.930	0.933	0.931	0.933	0.933	0.865	0.104	0.982	0.185	0.349	0.320	0.543	0.180
	SVM	(0.015)	(0.017)	(0.015)	(0.015)	(0.016)	(0.016)	(0.031)	(0.027)	(0.018)	(0.041)	(0.031)	(0.040)	(0.013)	(0.061)
	ANN	0.892	0.896	0.894	0.895	0.894	0.894	0.787	0.032	0.988	0.061	0.136	0.178	0.510	0.068
	ANN	(0.064)	(0.080)	(0.063)	(0.062)	(0.066)	(0.065)	(0.129)	(0.026)	(0.010)	(0.046)	(0.032)	(0.065)	(0.011)	(0.055)
P-2	RF	0.937	0.901	0.920	0.911	0.919	0.919	0.838	0.883	0.894	0.888	0.891	0.888	0.889	0.777
	RF	(0.020)	(0.016)	(0.016)	(0.018)	(0.016)	(0.016)	(0.033)	(0.038)	()	(0.025)	(0.030)	(0.019)	(0.019)	(0.035)
	SVM	0.720	0.773	0.740	0.752	0.746	0.746	0.493	0.321	0.989	0.482	0.689	0.563	0.655	0.417
	SVM	(0.029)	(0.106)	(0.041)	(0.026)	(0.049)	(0.051)	(0.108)	(0.051)	(0.008)	(0.055)	(0.053)	(0.043)	(0.025)	(0.048)
	ANN	0.775	0.777	0.776	0.776	0.776	0.776	0.552	0.305	0.978	0.460	0.661	0.546	0.642	0.383
	ANN	(0.067)	(0.037)	(0.049)	(0.059)	(0.048)	(0.045)	(0.090)	(0.049)	(0.014)	(0.052)	(0.051)	(0.043)	(0.022)	(0.046)
P-3	RF	0.940	0.908	0.925	0.917	0.924	0.924	0.848	0.879	0.891	0.884	0.888	0.885	0.885	0.770
	RF	(0.017)	(0.015)	(0.012)	(0.014)	(0.012)	(0.012)	(0.246)	(0.044)	(0.022)	(0.029)	(0.034)	(0.022)	(0.022)	(0.042)
	SVM	0.789	0.807	0.796	0.800	0.798	0.798	0.595	0.249	0.988	0.395	0.609	0.496	0.619	0.352
	SVM	(0.044)	(0.068)	(0.042)	(0.042)	(0.046)	(0.045)	(0.090)	(0.052)	(0.008)	(0.062)	(0.056)	(0.049)	(0.026)	(0.055)
	ANN	0.757	0.760	0.758	0.759	0.759	0.759	0.517	0.272	0.979	0.421	0.626	0.516	0.626	0.355
	ANN	(0.118)	(0.099)	(0.067)	(0.098)	(0.057)	(0.048)	(0.086)	(0.066)	(0.009)	(0.081)	(0.072)	(0.064)	(0.034)	(0.076)

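Values in parentheses are standard errors.
EP: encoding procedure; MLA: machine learning approach; RF: random forest; SVM: support vector machine; ANN: artificial neural network; TPR: true positive rate; TNR: true negative rate; WA: weighted accuracy; MCC: Matthews correlation coefficient.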
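
As a reading aid, the G-mean and WA columns can be related to the TPR and TNR columns. The sketch below assumes the usual definition of G-mean as the geometric mean of TPR and TNR, and assumes WA is the unweighted average of TPR and TNR. Neither formula is stated on this page; both are assumptions that happen to reproduce the tabulated values to three decimals for the rows checked here. The function names are hypothetical.

```python
import math


def g_mean(tpr: float, tnr: float) -> float:
    """Geometric mean of TPR and TNR (assumed definition of the G-mean column)."""
    return math.sqrt(tpr * tnr)


def weighted_accuracy(tpr: float, tnr: float) -> float:
    """Unweighted average of TPR and TNR (assumed definition of the WA column)."""
    return (tpr + tnr) / 2


# (TPR, TNR) pairs copied from the P-1 rows of the table above.
rows = {
    ("P-1", "RF", "balanced"): (0.954, 0.924),
    ("P-1", "SVM", "imbalanced"): (0.104, 0.982),
}

for key, (tpr, tnr) in rows.items():
    # Print recomputed G-mean and WA next to the row label for comparison.
    print(key, round(g_mean(tpr, tnr), 3), round(weighted_accuracy(tpr, tnr), 3))
```

Under these assumptions, a classifier with a very low TPR, such as SVM on the imbalanced P-1 data, gets a low G-mean even though its TNR is close to 1, while WA stays near 0.5.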

ISSN: 1756-0381
