```python
from mltrain.supervised.RandomForest import RandomForest

# X_train, y_train, X_test, and y_test are assumed to be NumPy arrays prepared beforehand.

# Initialize the model
model = RandomForest(n_trees=100, max_depth=10, min_samples_split=2, criteria='gini')

# Train the model
model.train(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

# Calculate accuracy
accuracy = model.accuracy(y_test, predictions)

# Generate confusion matrix
conf_matrix = model.confusion_matrix(y_test, predictions)
```
The RandomForest class implements a Random Forest model for classification tasks. This class constructs multiple decision trees using bootstrap sampling and aggregates their predictions to improve classification accuracy and robustness. It supports various hyperparameters to control the behavior of individual trees and the overall forest.
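The two ideas in this description, bootstrap sampling and aggregation of per-tree predictions, can be sketched in a few lines of NumPy. This is a minimal illustration, not the library's code: the helper `majority_vote`, the `rng` argument, and the rule that ties go to the smallest label are all assumptions made for the example.

```python
import numpy as np


def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement, keeping features and labels aligned."""
    n_samples = X.shape[0]
    idx = rng.integers(0, n_samples, size=n_samples)
    return X[idx], y[idx]


def majority_vote(per_tree_predictions):
    """Reduce predictions of shape (n_trees, n_samples) to one label per sample."""
    per_tree_predictions = np.asarray(per_tree_predictions)
    voted = []
    for column in per_tree_predictions.T:  # one column per sample
        labels, counts = np.unique(column, return_counts=True)
        # argmax returns the first maximum, so ties go to the smallest label
        # (an assumption of this sketch).
        voted.append(labels[np.argmax(counts)])
    return np.array(voted)


rng = np.random.default_rng(0)
X = np.arange(10).reshape(5, 2)
y = np.array([0, 1, 0, 1, 1])

# Same shape as the original data, but some rows repeat and some are left out.
X_boot, y_boot = bootstrap_sample(X, y, rng)

# Three hypothetical trees voting on four samples.
votes = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [1, 1, 1, 0]]
print(majority_vote(votes))  # [0 1 1 0]
```

Because each tree sees a different resampled dataset, the trees tend to make different errors, and the vote averages those errors out.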
Attributes:
- n_trees (default=100): The number of decision trees in the forest.
- max_depth (default=10): The maximum depth of each decision tree.
- min_samples_split (default=2): The minimum number of samples required to split a node in each tree.
- criteria (default='gini'): The criterion used to evaluate splits ('gini' or 'entropy').
- trees (list): A list of trained DecisionTree instances.

__init__(self, n_trees=100, max_depth=10, min_samples_split=2, criteria='gini')
Initializes the Random Forest model with the specified hyperparameters.
Parameters:
- n_trees (int): Number of trees in the forest.
- max_depth (int): Maximum depth of each tree.
- min_samples_split (int): Minimum number of samples required to split a node.
- criteria (str): Criterion used to evaluate splits ('gini' or 'entropy').

bootstrap_sample(self, X, y)
Generates a bootstrap sample (random sample with replacement) from the dataset.
Parameters:
- X (numpy.ndarray): The input features.
- y (numpy.ndarray): The target labels.

Returns:
- Tuple[numpy.ndarray, numpy.ndarray]: Bootstrap sample of features and target labels.

most_common_label(self, y)
Determines the most common label in the target array.
Parameters:
- y (numpy.ndarray): The array of target labels.

Returns:
- Any: The most common label in the target array.

train(self, X, y)
Trains the random forest by creating and training multiple decision trees.
Parameters:
- X (numpy.ndarray): The training dataset.
- y (numpy.ndarray): The target labels for the training dataset.

Returns:
- None

predict(self, X)
Predicts class labels for the given dataset using the trained random forest.
Parameters:
- X (numpy.ndarray): The dataset for which to make predictions.

Returns:
- numpy.ndarray: An array of predicted class labels.

accuracy(self, y_true, y_pred)
Calculates the accuracy of the model based on true and predicted labels.
Parameters:
- y_true (numpy.ndarray): True target labels.
- y_pred (numpy.ndarray): Predicted target labels.

Returns:
- float: The accuracy of the predictions.

confusion_matrix(self, y_true, y_pred)
Generates a confusion matrix to evaluate the accuracy of the classification.
Parameters:
- y_true (numpy.ndarray): True target labels.
- y_pred (numpy.ndarray): Predicted target labels.

Returns:
- numpy.ndarray: A confusion matrix.
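To make the two evaluation methods concrete, here is a standalone sketch of how accuracy and a confusion matrix can be computed with NumPy. It is not taken from the library. The matrix layout (rows for true classes, columns for predicted classes, classes in sorted order) is an assumption, since the reference above does not state which orientation `confusion_matrix` uses.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def confusion_matrix(y_true, y_pred):
    """Count matrix: rows are true classes, columns are predicted classes (assumed layout)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))  # sorted class labels
    index = {label: i for i, label in enumerate(classes)}
    matrix = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[index[t], index[p]] += 1
    return matrix


y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])

print(accuracy(y_true, y_pred))          # 0.8
print(confusion_matrix(y_true, y_pred))  # [[2 0]
                                         #  [1 2]]
```

With this layout the diagonal holds the correct predictions, so accuracy equals the trace of the matrix divided by the total number of samples.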