Support Vector Machine

Feature: supervised ; Common use: classification ; Output: class label ; Key parameters: C (regularization), kernel, gamma

Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outlier detection.

The advantages of support vector machines are:

  • Effective in high dimensional spaces.

  • Still effective in cases where the number of dimensions is greater than the number of samples.

  • Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.

  • Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.

The disadvantages of support vector machines include:

  • If the number of features is much greater than the number of samples, avoiding over-fitting when choosing kernel functions and the regularization term is crucial.

  • SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation.

What does SVM do?

SVM finds the best boundary (hyperplane) that separates classes with the maximum margin.

  • For 2D data, this boundary is a line.

  • For 3D+, it's a hyperplane.

  • The support vectors are the data points closest to this boundary and most critical in defining it.
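
As a minimal sketch of this idea, a linear SVC can be fit on small 2D data and its support vectors inspected. The data values below are made up for illustration, not taken from the original text:

```python
import numpy as np
from sklearn.svm import SVC

# Two clearly separable classes in 2D (illustrative values).
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 8]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The support vectors are the training points closest to the hyperplane;
# only they determine where the boundary lies.
print(clf.support_vectors_)
print(clf.predict([[4, 4]]))
```

Note that `support_vectors_` is a subset of the training points; the rest of the data could be discarded without changing the decision boundary.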


Key Concepts

| Term | Meaning |
| --- | --- |
| Hyperplane | Decision boundary that separates the classes |
| Margin | Distance between the hyperplane and the closest support vectors |
| Support Vectors | Data points closest to the hyperplane that influence its position |
| Kernel Trick | Technique that transforms data into a higher dimension to make it separable when it is not in the original space |
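
For a fitted linear SVM, the margin width can be recovered as 2 / ||w||, where w is the learned weight vector (`coef_[0]` in scikit-learn). A small sketch with illustrative data and a large C to approximate a hard margin:

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative data: nearest opposing points are (1, 1) and (3, 3).
X = np.array([[0, 0], [1, 1], [3, 3], [4, 4]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C ~ hard margin

w = clf.coef_[0]
margin = 2 / np.linalg.norm(w)  # distance between the two margin lines
print(margin)
```

Here the nearest points of the two classes are 2√2 apart along the diagonal, so the computed margin should be close to that value.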


Types of Kernels:

Used to handle non-linear problems:

  • Linear: Default, used when data is linearly separable.

  • Polynomial: Useful for curved boundaries.

  • RBF (Radial Basis Function) / Gaussian: Most common for non-linear data.
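
The difference between these kernels shows up clearly on data that is not linearly separable. As a sketch (dataset and parameter values are illustrative assumptions), concentric circles defeat the linear kernel while the RBF kernel handles them well:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: impossible to split with a straight line.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Training accuracy per kernel (illustrative comparison only).
scores = {
    kernel: SVC(kernel=kernel, gamma="scale").fit(X, y).score(X, y)
    for kernel in ("linear", "poly", "rbf")
}
print(scores)
```

Expect the linear kernel to hover near chance on this data while RBF fits it almost perfectly, since the kernel trick implicitly lifts the rings into a space where they are separable.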


SVC (Support Vector Classification)

SVC is the classification implementation of Support Vector Machine (SVM) in Scikit-learn, the popular Python machine learning library.


Usage
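
A minimal usage sketch for `SVC`, assuming the Iris dataset and default-style parameter choices (C, kernel, gamma here are illustrative, not prescriptive):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load a small built-in dataset and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# RBF kernel with default-style settings; tune C and gamma in practice.
clf = SVC(C=1.0, kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)

acc = clf.score(X_test, y_test)
print(acc)
```

In practice C, kernel and gamma are usually chosen via cross-validation (e.g. `GridSearchCV`) rather than fixed up front.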
