Support Vector Machines With Example

code walkthrough, graphs, metrics, and practice tasks

SVM Code and Description

Please type the code into your own code window instead of copying and pasting.
Typing it out helps you understand each step of the process.

Section 1: Imports

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

These imports cover the dataset loader, the train/test split helper, feature scaling, the SVM classifier, and the evaluation metrics.

Section 2: Load Data

data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

print("Shape:", X.shape)
print("Target classes:", np.unique(y))

The breast cancer dataset is a binary classification problem (569 samples, 30 numeric features; 0 = malignant, 1 = benign), which makes it suitable for SVM practice.
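To confirm the dataset's size and class balance before modelling, a quick check like the following can help. This is a minimal sketch; the names `n_samples`, `n_features`, and `class_counts` are illustrative and not part of the walkthrough.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
n_samples, n_features = data.data.shape

# Count how many samples carry each class label (0 and 1)
class_counts = {
    str(name): int(count)
    for name, count in zip(data.target_names, np.bincount(data.target))
}

print("Samples:", n_samples, "Features:", n_features)
print("Class counts:", class_counts)
```

The two classes are not equally sized, which is one reason the split in Section 3 passes stratify=y.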

Section 3: Split and Scale

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

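SVM is sensitive to feature scale, so the features are standardized. The scaler is fit on the training set only and then reused to transform the test set, so no test-set statistics leak into training.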
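To see what standardization changes, the sketch below compares the spread of the raw features with the scaled ones. It repeats the split from this section so that it runs on its own; the names `raw_std`, `train_mean`, `train_std`, and `test_mean` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Raw features live on very different scales
raw_std = X_train.std(axis=0)
print("Raw std, smallest vs largest feature:", raw_std.min(), raw_std.max())

# Fit on the training set only, then apply the same transform to the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

train_mean = X_train_scaled.mean(axis=0)
train_std = X_train_scaled.std(axis=0)
test_mean = X_test_scaled.mean(axis=0)

print("Train mean close to 0:", np.allclose(train_mean, 0.0, atol=1e-8))
print("Train std close to 1:", np.allclose(train_std, 1.0))
print("Largest |test mean|:", np.abs(test_mean).max())
```

The test-set means are usually close to zero but not exactly zero, because the test set is transformed with the training-set statistics. That is expected and is what keeps the evaluation honest.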

Section 4: Train SVM Model

svm_model = SVC(kernel="rbf", C=2.0, gamma="scale", random_state=42)
svm_model.fit(X_train_scaled, y_train)

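We start with the RBF kernel because relationships in real datasets are often non-linear. C sets how strongly misclassified training points are penalized, and gamma="scale" lets scikit-learn derive the kernel width from the data. Note that SVC only uses random_state when probability=True, so it has no effect here.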
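According to the scikit-learn documentation, gamma="scale" corresponds to 1 / (n_features * X.var()). The sketch below recomputes that value by hand and checks that a model trained with it predicts the same labels. The names `manual_gamma`, `svm_scale`, `svm_manual`, and `same_predictions` are illustrative, and the data preparation is repeated so the block runs on its own.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Documented rule for gamma="scale": 1 / (n_features * X.var())
manual_gamma = float(1.0 / (X_train_scaled.shape[1] * X_train_scaled.var()))
print("Manual gamma:", round(manual_gamma, 5))

svm_scale = SVC(kernel="rbf", C=2.0, gamma="scale").fit(X_train_scaled, y_train)
svm_manual = SVC(kernel="rbf", C=2.0, gamma=manual_gamma).fit(X_train_scaled, y_train)

# Both models should label the test set identically
same_predictions = np.array_equal(
    svm_scale.predict(X_test_scaled), svm_manual.predict(X_test_scaled)
)
print("Same predictions:", same_predictions)
```

On standardized data the overall variance is about 1, so gamma="scale" works out to roughly 1 divided by the number of features.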

Section 5: Predict and Evaluate

y_pred = svm_model.predict(X_test_scaled)

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print("Accuracy:", round(acc, 4))
print("Confusion Matrix:\n", cm)
print("Classification Report:\n", classification_report(y_test, y_pred))

Accuracy gives the overall score; the confusion matrix and classification report show how the model behaves on each class.
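To connect the confusion matrix to the numbers in the classification report, the sketch below unpacks a binary confusion matrix and recomputes precision and recall by hand. The labels are hypothetical and made up for illustration; they are not results from the model above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels for illustration only
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_hat = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 1])

# Rows are true classes, columns are predicted classes
cm_demo = confusion_matrix(y_true, y_hat)
tn, fp, fn, tp = cm_demo.ravel()

accuracy = (tp + tn) / cm_demo.sum()
precision = tp / (tp + fp)  # of the predicted 1s, how many were right
recall = tp / (tp + fn)     # of the true 1s, how many were found

print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("Accuracy:", round(accuracy, 4))
print("Precision (class 1):", round(precision, 4))
print("Recall (class 1):", round(recall, 4))

# The hand-computed values agree with scikit-learn's helpers
print(np.isclose(precision, precision_score(y_true, y_hat)))
print(np.isclose(recall, recall_score(y_true, y_hat)))
```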

Section 6: Quick Hyperparameter Comparison

for c_val in [0.5, 1.0, 2.0, 5.0]:
    model = SVC(kernel="rbf", C=c_val, gamma="scale", random_state=42)
    model.fit(X_train_scaled, y_train)
    pred = model.predict(X_test_scaled)
    print(f"C={c_val}: accuracy={accuracy_score(y_test, pred):.4f}")

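This shows how test accuracy changes with C. Treat it as a quick look only: picking C by test-set accuracy makes the final test score optimistic, so use cross-validation on the training set for real tuning.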
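One way to compare C values without touching the test set is cross-validation on the training data. The sketch below wraps the scaler and the SVM in a pipeline so the scaler is refit inside each fold. The names `cv_means` and `best_c`, and the choice of 5 folds, are assumptions for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

cv_means = {}
for c_val in [0.5, 1.0, 2.0, 5.0]:
    # The pipeline refits StandardScaler on each training fold
    pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=c_val, gamma="scale"))
    scores = cross_val_score(pipe, X_train, y_train, cv=5)
    cv_means[c_val] = scores.mean()
    print(f"C={c_val}: mean CV accuracy={scores.mean():.4f} (std={scores.std():.4f})")

best_c = max(cv_means, key=cv_means.get)
print("Best C by cross-validation:", best_c)
```

The test set is then used once, at the end, to report the score of the chosen model.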

Second Practical Example (Single Block): Digits Dataset (Multiclass SVM)

from sklearn.datasets import load_digits

# 1) Load
digits = load_digits()
Xd = pd.DataFrame(digits.data)
yd = pd.Series(digits.target, name="digit_class")

# 2) Split + Scale
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    Xd, yd, test_size=0.2, random_state=42, stratify=yd
)
scaler_d = StandardScaler()
Xd_train_scaled = scaler_d.fit_transform(Xd_train)
Xd_test_scaled = scaler_d.transform(Xd_test)

# 3) Train
svm_digits = SVC(kernel="rbf", C=5.0, gamma="scale", random_state=42)
svm_digits.fit(Xd_train_scaled, yd_train)

# 4) Evaluate
pred_d = svm_digits.predict(Xd_test_scaled)
print("Digits Accuracy:", round(accuracy_score(yd_test, pred_d), 4))
print("Digits Confusion Matrix:\n", confusion_matrix(yd_test, pred_d))
print("Digits Classification Report:\n", classification_report(yd_test, pred_d))

# 5) Quick C sensitivity
for c_val in [1.0, 2.0, 5.0, 10.0]:
    m = SVC(kernel="rbf", C=c_val, gamma="scale", random_state=42)
    m.fit(Xd_train_scaled, yd_train)
    p = m.predict(Xd_test_scaled)
    print(f"Digits C={c_val}: accuracy={accuracy_score(yd_test, p):.4f}")

This single block keeps the same SVM workflow but on image-derived multiclass data (digits 0-9), which is a strong practical use case. Check the confusion matrix carefully to see which digits are commonly confused, even when overall accuracy is high. Use the C sensitivity lines to choose a stable range before deeper tuning.
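Reading a 10x10 confusion matrix by eye is tedious, so the sketch below lists the most frequent off-diagonal entries. It retrains the same digits model so that it runs on its own; the names `cm_d`, `off_diag`, and `top_pairs` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

Xd, yd = load_digits(return_X_y=True)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    Xd, yd, test_size=0.2, random_state=42, stratify=yd
)
scaler_d = StandardScaler()
Xd_train_scaled = scaler_d.fit_transform(Xd_train)
Xd_test_scaled = scaler_d.transform(Xd_test)

svm_digits = SVC(kernel="rbf", C=5.0, gamma="scale").fit(Xd_train_scaled, yd_train)
cm_d = confusion_matrix(yd_test, svm_digits.predict(Xd_test_scaled))

# Zero the diagonal so only the mistakes remain
off_diag = cm_d.copy()
np.fill_diagonal(off_diag, 0)

# Take the three largest error cells (row = true digit, column = predicted digit)
top_pairs = []
for flat_idx in np.argsort(off_diag, axis=None)[::-1][:3]:
    true_digit, pred_digit = np.unravel_index(flat_idx, off_diag.shape)
    count = int(off_diag[true_digit, pred_digit])
    if count > 0:
        top_pairs.append((int(true_digit), int(pred_digit), count))
        print(f"True {true_digit} predicted as {pred_digit}: {count} time(s)")
```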

Graphs and Analysis

Graph 1: Linear vs RBF Decision Boundary

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X2, y2 = make_moons(n_samples=250, noise=0.20, random_state=42)

def plot_boundary(ax, model, X, y, title):
    model.fit(X, y)
    xx, yy = np.meshgrid(
        np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 300),
        np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 300)
    )
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contourf(xx, yy, Z, alpha=0.20, cmap="coolwarm")
    ax.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm", s=22, edgecolor="k")
    ax.set_title(title)

fig, axes = plt.subplots(1, 2, figsize=(12, 5))
lin = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
rbf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))

plot_boundary(axes[0], lin, X2, y2, "Linear SVM")
plot_boundary(axes[1], rbf, X2, y2, "RBF SVM")
plt.tight_layout()
plt.show()

This graph shows why kernels matter: a linear SVM draws a straight boundary, while an RBF SVM can curve to follow the non-linear class shapes.
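The visual difference can also be put into numbers. The sketch below scores both pipelines on the same moons data with 5-fold cross-validation; the fold count and the names `lin_cv` and `rbf_cv` are assumptions for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X2, y2 = make_moons(n_samples=250, noise=0.20, random_state=42)

lin = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
rbf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))

# Mean accuracy over 5 folds for each kernel
lin_cv = cross_val_score(lin, X2, y2, cv=5).mean()
rbf_cv = cross_val_score(rbf, X2, y2, cv=5).mean()

print(f"Linear SVM mean CV accuracy: {lin_cv:.4f}")
print(f"RBF SVM mean CV accuracy:    {rbf_cv:.4f}")
```

On interleaving shapes like these, the RBF kernel would normally be expected to score higher, since a straight line cannot separate the two moons.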

Graph 2: Maximum Margin and Support Vectors

from sklearn.datasets import make_blobs
from sklearn.svm import SVC

Xb, yb = make_blobs(n_samples=120, centers=2, cluster_std=1.2, random_state=7)
svc_lin = SVC(kernel="linear", C=1.0)
svc_lin.fit(Xb, yb)

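# For a linear kernel the decision function is w[0]*x + w[1]*y + b
w = svc_lin.coef_[0]
b = svc_lin.intercept_[0]
xx = np.linspace(Xb[:, 0].min() - 1, Xb[:, 0].max() + 1, 200)
yy = -(w[0] * xx + b) / w[1]           # decision function = 0 (boundary)
yy_down = -(w[0] * xx + b - 1) / w[1]  # decision function = +1 (margin)
yy_up = -(w[0] * xx + b + 1) / w[1]    # decision function = -1 (margin)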

plt.figure(figsize=(7, 5))
plt.scatter(Xb[:, 0], Xb[:, 1], c=yb, cmap="coolwarm", s=28, edgecolor="k")
plt.plot(xx, yy, "k-", label="Decision boundary")
plt.plot(xx, yy_down, "k--", alpha=0.75, label="Margin")
plt.plot(xx, yy_up, "k--", alpha=0.75)
plt.scatter(
    svc_lin.support_vectors_[:, 0],
    svc_lin.support_vectors_[:, 1],
    s=140, facecolors="none", edgecolors="black", linewidth=1.8, label="Support vectors"
)
plt.title("Linear SVM: Margin and Support Vectors")
plt.legend()
plt.show()

This is the most important SVM picture: the solid line is the decision boundary, the dashed lines mark the margin, and the circled points are the support vectors, the only training points that determine where the boundary lies.
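For a linear SVM the distance between the two dashed lines is 2 / ||w||, and every support vector lies on or inside the margin. The sketch below checks both on the same blobs data. The names `margin_width` and `signed_scores`, and the numerical tolerance, are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

Xb, yb = make_blobs(n_samples=120, centers=2, cluster_std=1.2, random_state=7)
svc_lin = SVC(kernel="linear", C=1.0).fit(Xb, yb)

# Margin width: distance between the lines where the decision function is -1 and +1
w = svc_lin.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)
print("Margin width:", round(margin_width, 4))

# Map labels {0, 1} to {-1, +1} so that label * score >= 1 means "outside the margin"
sv_labels = np.where(yb[svc_lin.support_] == 1, 1.0, -1.0)
signed_scores = sv_labels * svc_lin.decision_function(svc_lin.support_vectors_)

print("Number of support vectors:", len(svc_lin.support_))
print("Largest label * score among support vectors:", round(signed_scores.max(), 4))
```

Support vectors with a signed score near 1 sit on a dashed line; those below 1 are inside the margin or on the wrong side of the boundary.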

Graph 3: Effect of C (Underfit vs Overfit Behavior)

# Reuses X2, y2 and the imports (make_pipeline, StandardScaler, SVC) from Graph 1
c_values = [0.1, 1, 50]
fig, axes = plt.subplots(1, 3, figsize=(15, 4.5))

for i, cval in enumerate(c_values):
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=cval, gamma="scale"))
    model.fit(X2, y2)

    xx, yy = np.meshgrid(
        np.linspace(X2[:, 0].min() - 1, X2[:, 0].max() + 1, 250),
        np.linspace(X2[:, 1].min() - 1, X2[:, 1].max() + 1, 250)
    )
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    axes[i].contourf(xx, yy, Z, alpha=0.20, cmap="coolwarm")
    axes[i].scatter(X2[:, 0], X2[:, 1], c=y2, cmap="coolwarm", s=22, edgecolor="k")
    axes[i].set_title(f"RBF SVM, C={cval}")

plt.tight_layout()
plt.show()

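A lower C gives a smoother boundary because more margin violations are tolerated. A very high C penalizes every violation heavily, so the boundary bends to fit individual points and can overfit. Seeing the three panels side by side makes C tuning more intuitive.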
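The same effect shows up in the number of support vectors and in the training accuracy. The sketch below records both for the three C values from the graph; the names `sv_counts` and `train_accs` are illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X2, y2 = make_moons(n_samples=250, noise=0.20, random_state=42)

sv_counts = {}
train_accs = {}
for cval in [0.1, 1, 50]:
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=cval, gamma="scale"))
    model.fit(X2, y2)

    # make_pipeline names each step after its lowercased class name
    svc_step = model.named_steps["svc"]
    sv_counts[cval] = int(svc_step.n_support_.sum())
    train_accs[cval] = model.score(X2, y2)

    print(f"C={cval}: support vectors={sv_counts[cval]}, "
          f"train accuracy={train_accs[cval]:.4f}")
```

A small C typically leaves many points inside the wide margin, so more of them become support vectors. Training accuracy alone does not reveal overfitting; compare it with a held-out or cross-validated score.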

Graph 4: Effect of Gamma (Local vs Global Influence)

# Reuses X2, y2 and the imports (make_pipeline, StandardScaler, SVC) from Graph 1
gamma_values = [0.1, 1, 10]
fig, axes = plt.subplots(1, 3, figsize=(15, 4.5))

for i, gval in enumerate(gamma_values):
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma=gval))
    model.fit(X2, y2)

    xx, yy = np.meshgrid(
        np.linspace(X2[:, 0].min() - 1, X2[:, 0].max() + 1, 250),
        np.linspace(X2[:, 1].min() - 1, X2[:, 1].max() + 1, 250)
    )
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    axes[i].contourf(xx, yy, Z, alpha=0.20, cmap="coolwarm")
    axes[i].scatter(X2[:, 0], X2[:, 1], c=y2, cmap="coolwarm", s=22, edgecolor="k")
    axes[i].set_title(f"RBF SVM, gamma={gval}")

plt.tight_layout()
plt.show()

Gamma controls how far each training point's influence reaches: a small gamma gives broad, smooth influence, while a large gamma gives very local influence and a wiggly boundary.
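The reach of each point follows from the RBF kernel formula, K(a, b) = exp(-gamma * ||a - b||^2). The sketch below evaluates it for two hypothetical points one unit apart, using the three gamma values from the graph, and compares the result with scikit-learn's rbf_kernel helper. The points and the name `kernel_values` are made up for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Two hypothetical points at distance 1 from each other
a = np.array([[0.0, 0.0]])
b = np.array([[1.0, 0.0]])

kernel_values = {}
for gval in [0.1, 1, 10]:
    # Manual RBF kernel: exp(-gamma * squared Euclidean distance)
    manual = float(np.exp(-gval * np.sum((a - b) ** 2)))
    library = float(rbf_kernel(a, b, gamma=gval)[0, 0])
    kernel_values[gval] = manual
    print(f"gamma={gval}: manual={manual:.6f}, rbf_kernel={library:.6f}")
```

The similarity between the two points falls quickly as gamma grows, which is why a large gamma makes each training point matter only in its immediate neighbourhood.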

Exercises for Practice

Exercise 1: Replace RBF with kernel="linear" and compare metrics.

Exercise 2: Add class_weight="balanced" and check class-wise recall changes.

Exercise 3: Run a small grid search over C and gamma and report best parameters.

Exercise 4: Use make_classification() to generate synthetic data and visualize decision boundaries.