EnrichRBP.evaluateClassifiers

These two functions evaluate the quality of the sequence representations obtained in the preceding steps.

EnrichRBP.evaluateClassifiers.evaluateDLclassifers(features, labels, file_path='', shuffle=True, folds=5)

EnrichRBP integrates four classical deep learning models (CNN, RNN, MLP and ResNet). This function cross-validates each of the four models on the representation matrix and stores the final performance metrics obtained for each model in DL_evalution_metrics.csv.

Parameters:

features:numpy array, required: Sequence feature matrix used to train the four deep learning models.

labels:numpy array, required: The label of each sequence, indicating whether that sequence is a target sequence of the RBPs.

file_path:str, default='': Path for storing cross-validation result files.

shuffle:bool, default=True: Whether to shuffle the sequences before dividing them into the subsets used for cross-validation.

folds:int, default=5: Number of cross-validation folds. The training set is divided into this many subsets (5 by default); each subset serves once as the validation set, while the remaining folds - 1 subsets form the training set.
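The way shuffle and folds interact can be illustrated with a minimal sketch. It uses only the Python standard library and is not EnrichRBP's actual implementation; the helper name kfold_indices and its seed argument are hypothetical and introduced here purely for illustration.

```python
import random


def kfold_indices(n_samples, folds=5, shuffle=True, seed=None):
    """Return a list of (train_indices, val_indices) pairs, one per fold.

    Hypothetical helper for illustration; not part of EnrichRBP.
    """
    if folds < 2 or folds > n_samples:
        raise ValueError("folds must be between 2 and n_samples")

    indices = list(range(n_samples))
    if shuffle:
        # Shuffle once, before splitting, so every fold is a random subset.
        random.Random(seed).shuffle(indices)

    # Spread samples as evenly as possible: the first (n_samples % folds)
    # folds receive one extra sample.
    base, extra = divmod(n_samples, folds)
    splits, start = [], 0
    for k in range(folds):
        size = base + (1 if k < extra else 0)
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, val))
        start += size
    return splits


splits = kfold_indices(10, folds=5, shuffle=True, seed=0)
for train_idx, val_idx in splits:
    # With 10 samples and 5 folds: 8 training indices, 2 validation indices.
    print(len(train_idx), len(val_idx))
```

Every sample index appears in exactly one validation subset, so each sequence is used for validation once and for training folds - 1 times.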

EnrichRBP.evaluateClassifiers.evaluateMLclassifers(features, labels, file_path='', shuffle=True, folds=5)

EnrichRBP integrates eleven classical machine learning models (Logistic Regression, K-Nearest Neighbor, Decision Tree, GaussianNB, Bagging, Random Forest, AdaBoost, Gradient Boosting, SVM, LDA and Extra Trees). This function cross-validates each model on the representation matrix and stores the final performance metrics obtained for each model in ML_evalution_metrics.csv.

Parameters:

features:numpy array, required: Sequence feature matrix used to train the machine learning models.

labels:numpy array, required: The label of each sequence, indicating whether that sequence is a target sequence of the RBPs.

file_path:str, default='': Path for storing cross-validation result files.

shuffle:bool, default=True: Whether to shuffle the sequences before dividing them into the subsets used for cross-validation.

folds:int, default=5: Number of cross-validation folds. The training set is divided into this many subsets (5 by default); each subset serves once as the validation set, while the remaining folds - 1 subsets form the training set.
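The overall pattern described above (cross-validate several models on the same feature matrix, then store one row of metrics per model) can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: the two toy models stand in for the eleven real classifiers, the data set is invented, accuracy is the only metric computed, and the column names model and mean_accuracy are hypothetical rather than the actual layout of ML_evalution_metrics.csv.

```python
import csv
import io
import random


def majority_class(train_X, train_y, val_X):
    """Toy model: always predict the most common training label."""
    most_common = max(set(train_y), key=train_y.count)
    return [most_common] * len(val_X)


def nearest_neighbour(train_X, train_y, val_X):
    """Toy model: copy the label of the closest training sample."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [train_y[min(range(len(train_X)), key=lambda i: dist(train_X[i], x))]
            for x in val_X]


def cross_validate(models, features, labels, folds=5, shuffle=True, seed=0):
    """Return {model_name: mean validation accuracy over all folds}."""
    order = list(range(len(features)))
    if shuffle:
        random.Random(seed).shuffle(order)
    results = {}
    for name, fit_predict in models.items():
        accuracies = []
        for k in range(folds):
            val_idx = order[k::folds]  # every folds-th sample forms fold k
            val_set = set(val_idx)
            train_idx = [i for i in order if i not in val_set]
            preds = fit_predict([features[i] for i in train_idx],
                                [labels[i] for i in train_idx],
                                [features[i] for i in val_idx])
            correct = sum(p == labels[i] for p, i in zip(preds, val_idx))
            accuracies.append(correct / len(val_idx))
        results[name] = sum(accuracies) / folds
    return results


# Tiny invented data set with two well-separated classes (not real RBP features).
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4],
            [1.0, 0.9], [0.8, 1.0], [0.9, 0.7], [0.7, 0.8], [1.0, 0.6]]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

models = {"MajorityClass": majority_class, "NearestNeighbour": nearest_neighbour}
metrics = cross_validate(models, features, labels, folds=5)

# Write one row per model; an in-memory buffer stands in for the CSV file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["model", "mean_accuracy"])
for name, acc in metrics.items():
    writer.writerow([name, f"{acc:.3f}"])
print(buffer.getvalue())
```

Because every model is scored on the same folds, the resulting rows are directly comparable, which is what makes a per-model metrics table useful for judging a sequence representation.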