```python
from mltrain.supervised.RandomForest import RandomForest

# X_train, y_train, X_test, and y_test are assumed to be NumPy arrays prepared beforehand.

# Initialize the model
model = RandomForest(n_trees=100, max_depth=10, min_samples_split=2, criteria='gini')

# Train the model
model.train(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

# Calculate accuracy
accuracy = model.accuracy(y_test, predictions)

# Generate confusion matrix
conf_matrix = model.confusion_matrix(y_test, predictions)
```
The RandomForest class implements a Random Forest model for classification tasks. It builds multiple decision trees, each trained on a bootstrap sample of the training data, and aggregates their predictions to improve classification accuracy and robustness. Its hyperparameters control the behavior of the individual trees and the size of the forest.
Parameters:
n_trees (default=100): The number of decision trees in the forest.
max_depth (default=10): The maximum depth of each decision tree.
min_samples_split (default=2): The minimum number of samples required to split a node in each tree.
criteria (default='gini'): The criterion used to evaluate splits ('gini' or 'entropy').

Attributes:
trees (list): A list of trained DecisionTree instances.

__init__(self, n_trees=100, max_depth=10, min_samples_split=2, criteria='gini')
Initializes the Random Forest model with the specified hyperparameters.

Parameters:
n_trees (int): Number of trees in the forest.
max_depth (int): Maximum depth of each tree.
min_samples_split (int): Minimum number of samples required to split a node.
criteria (str): Criterion used to evaluate splits ('gini' or 'entropy').

bootstrap_sample(self, X, y)
Generates a bootstrap sample (a random sample drawn with replacement) from the dataset.

Parameters:
X (numpy.ndarray): The input features.
y (numpy.ndarray): The target labels.

Returns:
Tuple[numpy.ndarray, numpy.ndarray]: Bootstrap sample of features and target labels.

most_common_label(self, y)
Determines the most common label in the target array.

Parameters:
y (numpy.ndarray): The array of target labels.

Returns:
Any: The most common label in the target array.

train(self, X, y)
Trains the random forest by creating and training multiple decision trees.

Parameters:
X (numpy.ndarray): The training dataset.
y (numpy.ndarray): The target labels for the training dataset.

Returns:
None
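As a rough illustration of how bootstrap_sample, most_common_label, and train might fit together, the sketch below implements them with NumPy. It is not mltrain's actual code: `MajorityStub` is a hypothetical stand-in for the library's DecisionTree, the `fit` method name, the `train_forest` function, and the `seed` argument are assumptions made for this example, and the helpers are written as free functions rather than methods.

```python
from collections import Counter

import numpy as np


def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement, keeping features and labels aligned."""
    n_samples = X.shape[0]
    idx = rng.integers(0, n_samples, size=n_samples)
    return X[idx], y[idx]


def most_common_label(y):
    """Return the label that occurs most often in y."""
    return Counter(y.tolist()).most_common(1)[0][0]


class MajorityStub:
    """Hypothetical stand-in for DecisionTree: predicts the majority label it was fitted on."""

    def fit(self, X, y):
        self.label = most_common_label(y)
        return self

    def predict(self, X):
        return np.full(X.shape[0], self.label)


def train_forest(X, y, n_trees=100, seed=0):
    """Fit one learner per bootstrap sample and return the fitted learners."""
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        X_boot, y_boot = bootstrap_sample(X, y, rng)
        trees.append(MajorityStub().fit(X_boot, y_boot))
    return trees
```

Because each learner sees a different resampled view of the data, the learners can disagree with one another, which is what makes aggregating their predictions worthwhile.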
predict(self, X)
Predicts class labels for the given dataset using the trained random forest.

Parameters:
X (numpy.ndarray): The dataset for which to make predictions.

Returns:
numpy.ndarray: An array of predicted class labels.

accuracy(self, y_true, y_pred)
Calculates the accuracy of the model from the true and predicted labels.

Parameters:
y_true (numpy.ndarray): True target labels.
y_pred (numpy.ndarray): Predicted target labels.

Returns:
float: The accuracy of the predictions.

confusion_matrix(self, y_true, y_pred)
Generates a confusion matrix for evaluating the classification results.

Parameters:
y_true (numpy.ndarray): True target labels.
y_pred (numpy.ndarray): Predicted target labels.

Returns:
numpy.ndarray: A confusion matrix.
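The prediction and evaluation methods above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the library's implementation. It assumes that predict aggregates the trees' outputs by majority vote, that accuracy is the fraction of matching labels, and that the confusion matrix puts true classes in rows and predicted classes in columns, with classes in sorted order. The name `majority_vote` is hypothetical, and the functions are free functions rather than methods.

```python
from collections import Counter

import numpy as np


def majority_vote(tree_predictions):
    """Combine per-tree predictions, shape (n_trees, n_samples), into one label per sample."""
    tree_predictions = np.asarray(tree_predictions)
    # Each column holds the votes of all trees for one sample.
    # On a tie, Counter returns the label that was seen first.
    return np.array([
        Counter(votes.tolist()).most_common(1)[0][0]
        for votes in tree_predictions.T
    ])


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def confusion_matrix(y_true, y_pred):
    """Count matrix with true classes as rows and predicted classes as columns."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))
    index = {label: i for i, label in enumerate(classes.tolist())}
    matrix = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(y_true.tolist(), y_pred.tolist()):
        matrix[index[t], index[p]] += 1
    return matrix


# Three hypothetical trees voting on four samples.
votes = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [1, 1, 1, 0]]
y_pred = majority_vote(votes)
y_true = np.array([0, 1, 0, 0])
print(y_pred)
print(accuracy(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))
```

With these votes the sketch predicts the labels 0, 1, 1, 0, which match three of the four true labels, so the accuracy is 0.75. The one error appears in the confusion matrix as a single count in row 0, column 1.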