In this article, we are going to talk about the support vector machine (SVM) algorithm in machine learning.
Support vector machine is one of the most common and widely used algorithms in machine learning. It is a supervised learning algorithm and a linear model used for both classification and regression problems, though you will find it mostly used for classification. The idea behind SVM is easy to grasp: SVM creates a line or a hyperplane between different classes, separating them as shown below.
The line here is the separating boundary, or classifier, because it separates the two classes. In this blog post, I am going to give you a high-level overview of this algorithm. I will discuss how SVMs work, the theory behind them, their applications, and an example implementation of SVM in Python.
How does it work?
Earlier we talked about how SVM draws a line or a hyperplane to classify. The most important question here is how we identify that line, or the right hyperplane. Don't worry, it's not as hard as you might think.
We will work through a few visual examples to see what kinds of lines or hyperplanes can arise and how we choose the right one.
Let's take a look at the examples below one by one:
Example 1: In this example, we have to identify the hyperplane which clearly separates the two classes of objects. Here, hyperplane B does the job of separating the two classes well, and therefore we choose hyperplane B.
Example 2: In this example, we have three hyperplanes (A, B, and C), and all of them separate the classes well. How do we identify which hyperplane does the best job of segregating the two classes when all three manage it?
According to the SVM algorithm, we find the points closest to the line from both the classes. These points are called support vectors. Now, we compute the distance between the line and the support vectors. This distance is called the margin. Our goal is to maximise the margin. The hyperplane for which the margin is maximum is the optimal hyperplane.
In the figure above, we can see that the margin for hyperplane C is high compared to both A and B. Hence, we choose C as the right hyperplane. Another compelling reason for selecting the hyperplane with the higher margin is robustness: if we select a hyperplane with a low margin, there is a high chance of misclassification.
Example 3: Now, this one's tricky. According to what we learned in the last example, hyperplane B seems like the optimal hyperplane, as it has the maximum margin from both classes. But here is the catch: SVM selects the hyperplane which classifies the classes accurately before maximising the margin. Here, hyperplane B clearly has a classification error, while A has classified everything correctly. Therefore, the right hyperplane is A, even though it has a lower margin than hyperplane B.
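To make the margin idea concrete, here is a minimal sketch, assuming scikit-learn and NumPy. For a linear SVM, the margin width can be recovered from the learned weight vector w as 2 / ||w||; the sample points are a small illustrative data set, not from any figure above.

```python
# A minimal sketch (assuming scikit-learn) of how the margin of a
# fitted linear SVM can be computed: for a linear kernel the margin
# width equals 2 / ||w||, where w is the learned weight vector.
import numpy as np
from sklearn.svm import SVC

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])

clf = SVC(kernel='linear')
clf.fit(X, y)

w = clf.coef_[0]                   # weight vector of the separating hyperplane
margin = 2 / np.linalg.norm(w)     # distance between the two margin boundaries
print("support vectors:\n", clf.support_vectors_)
print("margin width:", margin)
```

The points the optimiser selects as support vectors are exposed via support_vectors_, and only those points determine the final hyperplane.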
Example 4: This example contains an outlier: one of the stars lies in the territory of the other class, the circles. As I have already mentioned, the star at the other end acts as an outlier for the star class. The SVM algorithm has a feature to ignore outliers and find the hyperplane with the maximum margin. Therefore, we can say SVM classification is robust to outliers.
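This outlier tolerance comes from the soft-margin formulation. In scikit-learn, the regularisation parameter C controls how strongly margin violations are penalised; the data and the C values below are illustrative, not from the article.

```python
# A small sketch (assuming scikit-learn) of soft-margin behaviour:
# a low C tolerates the outlier and keeps a wide margin, while a very
# high C bends the boundary to classify every training point correctly.
import numpy as np
from sklearn.svm import SVC

# Two clusters plus one "star" outlier sitting near the "circles"
X = np.array([[-2, -2], [-2, -1], [-1, -2],    # class 0
              [2, 2], [2, 1], [1, 2],          # class 1
              [-1.4, -1.4]])                   # outlier labelled class 1
y = np.array([0, 0, 0, 1, 1, 1, 1])

soft = SVC(kernel='linear', C=0.1).fit(X, y)    # ignores the outlier
hard = SVC(kernel='linear', C=1000).fit(X, y)   # fits the outlier

print("low C training accuracy: ", soft.score(X, y))
print("high C training accuracy:", hard.score(X, y))
```

With a low C the classifier keeps the wide margin and misclassifies the outlier; with a very high C it approximates a hard margin and squeezes the boundary around the outlier.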
Implement SVM in Python
Code: Here, I will use a small example to demonstrate how SVMs are used for classification problems. We use scikit-learn and NumPy for this. Let's get to the code:
import numpy as np

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])
In the above lines of code, we have defined two NumPy arrays: X holds the points and y holds the classes to which these points belong. Now, let us create our SVM model using sklearn.svm. Here, I choose the linear kernel.
from sklearn.svm import SVC

clf = SVC(kernel='linear')
We now fit our classifier (clf) to the data points we defined:

clf.fit(X, y)
To predict the class of a new data point:
prediction = clf.predict([[0,6]])
This returns the prediction, i.e. the class to which the data point belongs. Voilà! It is that simple to use an SVM for basic classification problems.
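The linear kernel works here because the two classes are linearly separable. When they are not, a different kernel can be passed to SVC; as a sketch, here is the classic XOR pattern, which no straight line can separate, handled by the RBF kernel (the data and the gamma value are illustrative choices).

```python
# A sketch (assuming scikit-learn) of a non-linear SVM: XOR-like data
# cannot be separated by a straight line, but the RBF kernel can fit it.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])  # XOR corners
y = np.array([0, 0, 1, 1])                      # diagonals share a class

clf = SVC(kernel='rbf', gamma=2.0)
clf.fit(X, y)
print(clf.predict(X))   # all four training points classified correctly
```

Swapping the kernel is the only change needed; the rest of the fit/predict workflow stays exactly as in the linear example above.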
Pros and Cons of Support Vector Machines
Pros of SVMs:
- It works well when there is a clear margin of separation between classes
- It is effective in high dimensional spaces
- It is effective in cases where the number of dimensions is greater than the number of samples.
- It uses a subset of training points in the decision function (called support vectors), so it is also memory efficient
Cons of SVMs:
- It doesn’t perform well when we have a large data set because the required training time is higher
- It also doesn’t perform very well, when the data set has more noise i.e. target classes are overlapping
- SVM doesn’t directly provide probability estimates, these are calculated using an expensive five-fold cross-validation. It is included in the related SVC method of Python sci-kit-learn library
Real-world applications of Support Vector Machines:
Here are some real-world applications of support vector machines:
1. Face Detection: SVMs classify parts of an image as facial or non-facial and form a square border around the face.
2. Text and Hypertext Categorisation: SVMs allow text and hypertext categorisation for both inductive and transductive models. They use training data to classify documents into different categories, scoring each document and comparing the score against a threshold value.
3. Classification of images: We have already discussed that SVMs are widely used in image classification problems. They provide better accuracy for image classification and image search compared to formerly used query-based searching approaches.
4. Bioinformatics: SVMs are really popular in medicine and bioinformatics, where they are used for protein classification and cancer classification. They are used for classifying genes, classifying patients on the basis of their genes, and other biological problems such as detecting skin cancer.
5. Protein fold and remote homology detection: SVM algorithms are also widely used in protein remote homology detection.
6. Handwriting recognition: SVMs are used to recognise handwritten characters and work with a wide variety of languages.
7. Generalised Predictive Control (GPC): SVM-based GPC is used to control chaotic dynamical systems with useful parameters.
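As an illustration of the handwriting-recognition use case above, here is a sketch using the small 8x8 digit images bundled with scikit-learn; the gamma value is an illustrative choice for this pixel scale, not a tuned setting.

```python
# A sketch (assuming scikit-learn) of handwritten-digit recognition:
# train an SVM on the bundled 8x8 digit images and check test accuracy.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

clf = SVC(kernel='rbf', gamma=0.001)   # gamma chosen for the 0-16 pixel range
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```

Even this untuned model typically scores well above 90% on the held-out digits, which is why SVMs were long a standard baseline for character recognition.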
By Alok Singh