Decision Surface And Plotting
Introduction
We all want things to be more transparent; we want to know what is going on inside. The same goes for classification models. We usually depend solely on performance metrics to weigh a model's performance. However, visualizing the classification results has its charm and gives a clearer picture of how a model assigns classes.
A popular diagnostic for visualizing the decisions made by a classification model is the decision surface, also called the decision boundary. A decision surface is a plot that shows how a fitted machine learning algorithm divides the input feature space by class label.
A decision surface is a powerful tool for understanding how a given model arrives at its predictions and how it decides to divide the input feature space by class label.
Decision Surface
Classification in machine learning means training a model to assign class labels to the examples in our input dataset.
Each input feature defines an axis of the feature space. With two input features the feature space is a plane, and each data point is a dot whose coordinates are its two feature values. If there were three input variables, the feature space would be a three-dimensional volume.
The goal of the classification model is to separate the feature space so that we can decide the class label for points in the feature space with minimum error.
This separation is done by the decision surface or boundary, and plotting it works as a demonstrative tool for visualizing the model on a classification predictive modeling task.
The data points lying on one side of the decision surface are assigned a different class label from those lying on the other side. The boundary is created and adjusted as the model learns from the training data.
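As a minimal sketch of this idea, assume a hypothetical linear boundary defined by weights `w` and a bias `b` (illustrative values, not learned from the article's data). Each point is labelled according to the side of the line it falls on:

```python
import numpy as np

# Hypothetical boundary: x1 + x2 - 1 = 0 (illustrative values, not fitted).
w = np.array([1.0, 1.0])
b = -1.0

points = np.array([
    [0.0, 0.0],  # below the line
    [2.0, 2.0],  # above the line
    [0.2, 0.3],  # below the line
    [1.5, 0.5],  # above the line
])

# The sign of the decision function tells us the side of the boundary.
scores = points @ w + b
labels = (scores > 0).astype(int)
print(labels)  # [0 1 0 1]
```

A learned model works the same way in spirit; training only changes where the boundary sits.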
Although the word ‘surface’ suggests a 2D feature space, we can still use these methods for more than two features by creating a decision surface for each pair of input features.
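The pairwise approach can be sketched by enumerating every pair of features; each pair would then get its own fitted model and its own plot. The four column names below are assumed for illustration and are only a small subset of a real feature list:

```python
from itertools import combinations

# Assumed feature names, used here only to illustrate the pairing.
feature_names = ["radius_mean", "texture_mean", "perimeter_mean", "area_mean"]

# Every unordered pair of features defines one 2D decision surface.
pairs = list(combinations(feature_names, 2))
for first, second in pairs:
    print(f"decision surface for: {first} vs {second}")

print(len(pairs))  # 6
```

The number of pairs grows quadratically with the number of features, which is one reason the implementation below reduces the data to two components instead.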
Now, let's look at the implementation to get a clearer picture. We will be using a logistic regression classifier for our implementation.
Implementation
We will be using the Breast Cancer Wisconsin (Diagnostic) dataset for our work.
Importing all the necessary libraries
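import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.decomposition import IncrementalPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler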
Reading the dataset and displaying the first 10 rows.
df = pd.read_csv('data.csv')
df.head(10)
Performing Label Encoding on Target Feature
df['diagnosis'] = df['diagnosis'].map({'B':0, 'M':1}) 
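One thing worth checking after a dictionary-based `map` is whether any label was left unmapped, since pandas turns such values into `NaN` rather than raising an error. The sketch below uses a toy series, with a deliberately invalid label `'X'` added for illustration:

```python
import pandas as pd

# Toy stand-in for the 'diagnosis' column; 'X' is a deliberately bad label.
diagnosis = pd.Series(["B", "M", "M", "X"])
encoded = diagnosis.map({"B": 0, "M": 1})

# Labels missing from the dictionary become NaN instead of raising an error.
n_unmapped = int(encoded.isna().sum())
print(n_unmapped)  # 1
```

On the real dataset, a count of zero would confirm that every row was encoded as 0 or 1.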
Dropping the Unwanted Columns
df.drop(['id', 'Unnamed: 32'], axis = 1, inplace = True) 
Differentiating Dependent and Independent Features
x = df.iloc[:, 1:]
y = df['diagnosis']
x 
Splitting the Input Dataset
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size = 0.6, test_size = 0.4, random_state = 1234)
We split the dataset in a 60:40 ratio: 60% for training and 40% for testing.
Feature Scaling
columns = x_train.columns
scalerx = StandardScaler()
x_train_scaled = scalerx.fit_transform(x_train)
x_train_scaled = pd.DataFrame(x_train_scaled, columns = columns)
x_test_scaled = scalerx.transform(x_test)
x_test_scaled = pd.DataFrame(x_test_scaled, columns = columns)
x_train_scaled 
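The scaling step fits on the training split and reuses those statistics on the test split. A NumPy-only sketch of the same idea, on synthetic data generated here purely for illustration, looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=50.0, scale=10.0, size=(200, 3))  # synthetic data
test = rng.normal(loc=50.0, scale=10.0, size=(100, 3))   # synthetic data

# Statistics come from the training split only.
mean = train.mean(axis=0)
std = train.std(axis=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics

print(train_scaled.mean(axis=0).round(6))  # approximately zero per column
print(test_scaled.mean(axis=0).round(3))   # close to zero, but not exactly
```

Reusing the training statistics keeps information about the test set from leaking into preprocessing.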
PCA
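We reduce the features to two dimensions, as we can only plot a decision surface on a 2D grid.
pca = IncrementalPCA(n_components = 2)
x_train_pca = pca.fit_transform(x_train_scaled)
x_test_pca = pca.transform(x_test_scaled)
Incremental Principal Component Analysis projects the scaled features onto two principal components, which are new axes built as linear combinations of the original features and chosen to capture as much variance as possible. We fit PCA on the training data only and then apply the same transformation to both the training and test data.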
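To see how much variance two components actually retain, one option is to inspect the `explained_variance_ratio_` attribute after fitting. The sketch below runs on synthetic data generated for illustration, not on the article's dataset:

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
# Synthetic stand-in: 300 samples, 10 features driven by 2 latent factors.
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 10))
data = latent @ mixing + 0.1 * rng.normal(size=(300, 10))

pca = IncrementalPCA(n_components=2)
reduced = pca.fit_transform(data)

print(reduced.shape)                        # (300, 2)
print(pca.explained_variance_ratio_)        # share of variance per component
print(pca.explained_variance_ratio_.sum())  # total share kept by 2 components
```

If the total share is low on real data, the 2D plot may hide structure that the full feature set contains.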
Plotting the Scatter Plot for both the Training and Test Sets
plt.figure(figsize = (20, 6))
plt.subplot(1, 2, 1)
plt.scatter(x_train_pca[:, 0], x_train_pca[:, 1], c = y_train)
plt.xlabel('Training 1st Principal Component')
plt.ylabel('Training 2nd Principal Component')
plt.title('Training Set Scatter Plot with labels indicated by colors, (0)Violet, (1)Yellow')
plt.subplot(1, 2, 2)
plt.scatter(x_test_pca[:, 0], x_test_pca[:, 1], c = y_test)
plt.xlabel('Test 1st Principal Component')
plt.ylabel('Test 2nd Principal Component')
plt.title('Test Set Scatter Plot with labels indicated by colors, (0)Violet, (1)Yellow')
plt.show()
We can see the distinction between the two classes and can already imagine the decision surface, roughly a diagonal line between the two.
Performing Cross-Validation
params = {'C': [0.01, 0.1, 1, 10, 100]}
clf = LogisticRegression()
folds = 5
model_cv = GridSearchCV(estimator = clf, param_grid = params, scoring = 'accuracy', cv = folds, return_train_score = True, verbose = 3)
model_cv.fit(x_train_pca, y_train)
We perform a 5-fold grid-search cross-validation of the logistic regression classifier on the training set, trying five values of the regularization parameter C.
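Beyond the single best parameter, the fitted search object also exposes the mean cross-validated score for every candidate. A small sketch on a synthetic two-feature dataset (generated with `make_classification` for illustration only) shows how these could be listed:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the two PCA components and their labels.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

search = GridSearchCV(LogisticRegression(),
                      param_grid={"C": [0.01, 0.1, 1, 10, 100]},
                      scoring="accuracy", cv=5)
search.fit(X, y)

# One mean validation score per candidate value of C.
results = search.cv_results_
for c, score in zip(results["param_C"], results["mean_test_score"]):
    print(f"C={c}: mean CV accuracy={score:.3f}")

print("best:", search.best_params_, round(search.best_score_, 3))
```

Comparing the scores side by side shows whether the best value wins clearly or only by a small margin.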
Best Hyperparameter from GridSearch CV performed Above
print(model_cv.best_params_) 
Output
{'C': 10} 
Retraining the model with the best parameters
model = LogisticRegression(C = 10).fit(x_train_pca, y_train)
We retrain our model with C = 10, the value obtained from hyperparameter tuning.
Predictions
y_train_pred = model.predict(x_train_pca)
y_test_pred = model.predict(x_test_pca)
Performance Analysis of the model in terms of different performance metrics
print('Training Accuracy of the Model: ', metrics.accuracy_score(y_train, y_train_pred))
print('Test Accuracy of the Model: ', metrics.accuracy_score(y_test, y_test_pred))
print()
print('Training Precision of the Model: ', metrics.precision_score(y_train, y_train_pred))
print('Test Precision of the Model: ', metrics.precision_score(y_test, y_test_pred))
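For intuition about what these two metrics measure, they can be computed by hand on a small example. The labels and predictions below are made up for illustration:

```python
import numpy as np

# Made-up labels and predictions, for illustration only.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives

accuracy = float(np.mean(y_pred == y_true))  # share of correct predictions
precision = tp / (tp + fp)                   # share of predicted 1s that are right

print(accuracy)   # 0.75
print(precision)  # 0.75
```

Accuracy counts every correct prediction, while precision only looks at the points the model labelled as class 1.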
Visualization of Decision Surface
We can create a decision boundary by fitting the model on the training data, then using the same model to make predictions for a grid of values for the input domain.
Once we have the grid of predictions, we can plot the values and their class label.
A convenient way to visualize decision boundaries is a filled contour plot, which colors the regions between grid points according to the predicted class. We can use the contourf() function from Matplotlib to plot the decision surface.
We have to follow specific steps.
Firstly, we need to define the grid points in the whole feature space.
To do this, we find the minimum and maximum values of each feature and extend the range by one unit on each side to ensure that the whole feature space is covered.
x_min, x_max = x_train_pca[:, 0].min() - 1, x_train_pca[:, 0].max() + 1
y_min, y_max = x_train_pca[:, 1].min() - 1, x_train_pca[:, 1].max() + 1
The arange() function creates uniformly spaced samples at a chosen resolution along each dimension. We then use the meshgrid() function to build a grid from the two coordinate vectors.
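To make the grid construction concrete, here is a tiny sketch with a coarse step of 0.5 (chosen only so the arrays stay small enough to read):

```python
import numpy as np

xs = np.arange(0.0, 1.0, 0.5)  # [0.0, 0.5]
ys = np.arange(0.0, 1.5, 0.5)  # [0.0, 0.5, 1.0]

# meshgrid returns one 2D array of x-coordinates and one of y-coordinates.
xx, yy = np.meshgrid(xs, ys)
print(xx.shape)  # (3, 2): one row per y value, one column per x value

# Flatten both arrays and pair them up: one (x, y) row per grid point.
grid = np.c_[xx.ravel(), yy.ravel()]
print(grid.shape)  # (6, 2)
print(grid)
```

Each row of `grid` is a point the model can classify, and reshaping the predictions back to `xx.shape` restores the 2D layout needed for plotting.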
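# The y-axis step of 0.1 is assumed to match the x-axis step.
xx_train, yy_train = np.meshgrid(np.arange(x_min, x_max, 0.1), np.arange(y_min, y_max, 0.1))
Z_train = model.predict(np.c_[xx_train.ravel(), yy_train.ravel()])
Z_train = Z_train.reshape(xx_train.shape)

Similarly for the test dataset.
x_min, x_max = x_test_pca[:, 0].min() - 1, x_test_pca[:, 0].max() + 1
y_min, y_max = x_test_pca[:, 1].min() - 1, x_test_pca[:, 1].max() + 1
xx_test, yy_test = np.meshgrid(np.arange(x_min, x_max, 0.1), np.arange(y_min, y_max, 0.1))
Z_test = model.predict(np.c_[xx_test.ravel(), yy_test.ravel()])
Z_test = Z_test.reshape(xx_test.shape)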
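Since the same grid steps are repeated for the training and test sets, they could be wrapped in a small helper. The function name `predict_on_grid` and the stand-in `toy_predict` are hypothetical and not part of the original walkthrough; with a fitted model one would pass `model.predict` instead:

```python
import numpy as np

def predict_on_grid(predict, points, step=0.1, pad=1.0):
    """Evaluate `predict` on a regular grid covering `points` (n_samples x 2)."""
    x_min, x_max = points[:, 0].min() - pad, points[:, 0].max() + pad
    y_min, y_max = points[:, 1].min() - pad, points[:, 1].max() + pad
    xx, yy = np.meshgrid(np.arange(x_min, x_max, step),
                         np.arange(y_min, y_max, step))
    # Classify every grid point, then restore the 2D grid shape.
    zz = predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    return xx, yy, zz

def toy_predict(grid):
    # Stand-in for model.predict: class 1 where x1 + x2 > 0.
    return (grid[:, 0] + grid[:, 1] > 0).astype(int)

points = np.array([[-1.0, -1.0], [1.0, 1.0], [0.5, -0.5]])
xx, yy, zz = predict_on_grid(toy_predict, points)
print(xx.shape == yy.shape == zz.shape)  # True
```

The three returned arrays have matching shapes, which is what a contour plot expects.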

Now we have grid values across the feature space.
The contourf() function takes a separate grid for each axis plus the grid of predicted values. We then plot the decision surface with a two-color colormap.
plt.figure(figsize = (20, 6))
plt.subplot(1, 2, 1)
plt.contourf(xx_train, yy_train, Z_train)
plt.scatter(x_train_pca[:, 0], x_train_pca[:, 1], c = y_train, s = 30, edgecolor = 'k')
plt.xlabel('Training 1st Principal Component')
plt.ylabel('Training 2nd Principal Component')
plt.title('Scatter Plot with Decision Boundary for the Training Set')
plt.subplot(1, 2, 2)
plt.contourf(xx_test, yy_test, Z_test)
plt.scatter(x_test_pca[:, 0], x_test_pca[:, 1], c = y_test, s = 30, edgecolor = 'k')
plt.xlabel('Test 1st Principal Component')
plt.ylabel('Test 2nd Principal Component')
plt.title('Scatter Plot with Decision Boundary for the Test Set')
plt.show()
So we can see how the contourf() function plotted a clear decision boundary. With the help of the plot above, we can visualize how regions of the feature space are assigned their class labels.
Frequently Asked Questions

How is the optimal decision boundary determined?
A classifier is a rule that partitions the feature space and assigns every point in a partition to the same class. The ‘boundary’ of this partitioning is the decision boundary of the rule. The optimal decision boundary is the one produced by the rule with the lowest expected classification error.

How do you determine decision boundaries in logistic regression?
The logistic regression decision boundary is the set of all points x that satisfy the equation P(y = 1 | x) = P(y = 0 | x) = ½.
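For a two-feature model with a linear score w1*x1 + w2*x2 + b, that condition is equivalent to the score being zero, because the sigmoid of zero is ½. The sketch below uses hypothetical coefficients; with a fitted scikit-learn model they would come from `model.coef_[0]` and `model.intercept_[0]`:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients, chosen for illustration (not fitted values).
w1, w2, b = 2.0, -1.0, 0.5

# Solve w1*x1 + w2*x2 + b = 0 for x2 to get points on the boundary line.
x1 = np.linspace(-3.0, 3.0, 7)
x2 = -(w1 * x1 + b) / w2

# Every point on that line has a predicted probability of one half.
probs = sigmoid(w1 * x1 + w2 * x2 + b)
print(probs)
```

Plotting `x2` against `x1` would draw the boundary directly, without evaluating the model on a grid.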

How does a decision tree's decision boundary differ from that of logistic regression?
A logistic regression decision boundary divides the feature space into two regions with a single line (a hyperplane in higher dimensions), whereas a decision tree divides the space into smaller and smaller rectangular regions using axis-aligned splits.

What kind of decision boundary is built by a logistic regression classifier?
In the case of logistic regression, the decision boundary is a straight line in two dimensions, i.e., it comes up with a hyperplane, like a linear SVM, that divides the feature space into two classes.
Key Takeaways
Let us recap the article.
First, we saw what a decision boundary is and how it enhances visualization by showing how data inputs are classified. Then, we saw how to plot a decision boundary using a logistic regression classifier.
I recommend applying the same steps with another classification model to understand it better. We can then use the decision surface alongside performance metrics to evaluate a model's performance.
That is the end of the article. Stay tuned for more exciting articles.
Keep Learning Ninjas!