  • PCA (Principal Component Analysis)

    Example Usage

    from mltrain.unsupervised.PCA import PCA
    import numpy as np

    # Placeholder data for illustration: 100 samples, 5 features
    X_train = np.random.rand(100, 5)

    # Initialize the model
    pca = PCA(n_components=2)

    # Fit the model, transform the data, and plot the result
    transformed_X = pca.train_transform(X_train, plot_graph=True)

    # Get the principal components
    principal_components = pca.pc

    Overview

    The PCA class implements Principal Component Analysis (PCA), a technique for dimensionality reduction. PCA transforms data into a new coordinate system where the axes (principal components) are ordered by the amount of variance they capture from the data.
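
    To make the variance ordering concrete, here is a minimal NumPy sketch of the general technique, not the mltrain implementation. It eigen-decomposes the covariance matrix of the centered data and sorts the axes by descending eigenvalue. The name principal_axes and the synthetic data are illustrative assumptions.

```python
import numpy as np

def principal_axes(X):
    """Return (variances, axes), sorted by descending variance.

    Illustrative helper, not part of mltrain.
    """
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)   # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    return eigvals[order], eigvecs[:, order]

# Synthetic data whose three features have very different spreads
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 2.0, 0.5])

variances, axes = principal_axes(X)

# Projecting onto the axes gives coordinates whose sample variances
# equal the sorted eigenvalues.
projected = (X - X.mean(axis=0)) @ axes
```

    Each column of axes is one principal direction, and the sample variance of the matching column of projected equals the corresponding entry of variances.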


    Hyperparameters

    • n_components (int, default=2): The number of principal components to retain after dimensionality reduction.

    Attributes

    • pc (numpy.ndarray): The principal components (eigenvectors) after fitting the model.
    • mean (numpy.ndarray): The mean of the features in the original data.

    Methods

    __init__(self, n_components=2)

    Initializes the PCA model with the specified number of components.

    • Args:
      • n_components (int): Number of principal components to retain.

    train(self, X)

    Fits the PCA model to the input data.

    • Args:
      • X (numpy.ndarray): The input data to perform PCA on, with shape (n_samples, n_features).
    • Returns:
      • numpy.ndarray: The principal components after fitting the model.
    • Raises:
      • ValueError: If the number of components is greater than the number of features.
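
    The fitting step and its validation could look roughly like the sketch below. It swaps in an SVD of the centered data as one common way to obtain the principal directions; whether mltrain uses SVD or an eigen-decomposition is not stated here. The function name train_sketch, the returned tuple, and the (n_components, n_features) layout of the components are assumptions.

```python
import numpy as np

def train_sketch(X, n_components=2):
    """Fit a PCA-like model; returns (components, mean).

    Hypothetical stand-in for PCA.train, not the library's code.
    """
    n_samples, n_features = X.shape
    if n_components > n_features:
        # Mirrors the documented ValueError condition
        raise ValueError(
            f"n_components={n_components} exceeds n_features={n_features}"
        )
    mean = X.mean(axis=0)
    # Rows of Vt are orthonormal directions, ordered by descending
    # singular value, i.e. by descending captured variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return Vt[:n_components], mean
```

    Keeping the mean alongside the components matters because new data has to be centered with the training mean before it is projected.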

    transform(self, X)

    Applies the dimensionality reduction to the input data.

    • Args:
      • X (numpy.ndarray): The input data to transform, with shape (n_samples, n_features).
    • Returns:
      • numpy.ndarray: The data transformed into the principal component space.
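
    In the usual formulation, the transform centers the input with the stored mean and projects it onto the stored components. The sketch below shows that arithmetic on hand-picked values. It assumes components are stored as rows, which may not match the layout of pca.pc, and transform_sketch is a hypothetical name.

```python
import numpy as np

def transform_sketch(X, components, mean):
    """Project X onto the given components after centering.

    Assumes components has shape (n_components, n_features).
    """
    return (X - mean) @ components.T

# Hand-picked values: one component along the first feature axis
mean = np.array([1.0, 2.0])
components = np.array([[1.0, 0.0]])
X_new = np.array([[2.0, 5.0],
                  [4.0, 2.0]])

Z = transform_sketch(X_new, components, mean)
# Centered rows are [1, 3] and [3, 0]; keeping the first axis gives [[1.], [3.]]
```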

    train_transform(self, X, plot_graph=False)

    Fits the PCA model and transforms the input data in one step. Optionally, plots the data in the reduced principal component space.

    • Args:
      • X (numpy.ndarray): The input data to fit and transform, with shape (n_samples, n_features).
      • plot_graph (bool, optional): Whether to plot the transformed data. Only works for 1, 2, or 3 components.
    • Returns:
      • numpy.ndarray: The data transformed into the principal component space.
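
    The restriction to 1, 2, or 3 components follows from what a scatter plot can show. Below is a rough sketch of how such a plot could be drawn with matplotlib. The helper plot_components is hypothetical, and raising ValueError for other component counts is an assumption; the source does not say what mltrain does in that case.

```python
import numpy as np

def plot_components(Z):
    """Scatter-plot data already projected onto 1, 2, or 3 components."""
    k = Z.shape[1]
    if k not in (1, 2, 3):
        # Assumption: refuse to plot; mltrain may behave differently
        raise ValueError(f"can only plot 1, 2, or 3 components, got {k}")

    # Imported lazily so matplotlib is only required when plotting
    import matplotlib.pyplot as plt

    fig = plt.figure()
    if k == 3:
        ax = fig.add_subplot(projection="3d")
        ax.scatter(Z[:, 0], Z[:, 1], Z[:, 2])
        ax.set_zlabel("PC3")
    else:
        ax = fig.add_subplot()
        # With a single component, draw the points along a horizontal line
        y = Z[:, 1] if k == 2 else np.zeros(len(Z))
        ax.scatter(Z[:, 0], y)
    ax.set_xlabel("PC1")
    if k >= 2:
        ax.set_ylabel("PC2")
    return fig
```

    The function returns the figure rather than showing it, so the caller decides whether to display it or save it to a file.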