Classification

Train neural networks to assign discrete labels — binary and multi-class classification with PyTorch.

Goal of the lesson

By the end of this 3-hour session you should be able to:

explain the difference between regression and classification,
generate synthetic 2-D datasets and visualize their decision regions,
build a feed-forward neural network with non-linear activations,
choose the right loss for binary and multi-class problems,
track loss and accuracy during training,
recognize underfitting and overfitting visually,
handle real-world tabular data with mixed numerical and categorical features,
solve the moons dataset as a capstone.

Suggested timing

Duration	Topic
15 min	What classification is, logits vs. probabilities
25 min	Generate the blobs dataset, build the model
25 min	Training loop with accuracy, decision boundary
15 min	Binary classification with `BCEWithLogitsLoss`
55 min	Real-world example — heart-disease prediction
45 min	Capstone — moons dataset and overfitting

Regression vs. classification

Task	Output	Loss	Final layer
Regression	A real number	`MSELoss`, `L1Loss`	`Linear` (no activation)
Binary classification	One of two classes	`BCEWithLogitsLoss`	`Linear` with 1 output (logit)
Multi-class classification	One of `K` classes	`CrossEntropyLoss`	`Linear` with `K` outputs (logits)

The five-step workflow doesn’t change. We swap the dataset, the model’s output size, and the loss.
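To make the table concrete, here is a minimal sketch of the three output heads and their losses on random data. The batch size, feature count, and variable names are illustrative assumptions and not part of the lesson's `main.py`.

```python
import torch
from torch import nn

torch.manual_seed(0)
batch, features, num_classes = 8, 2, 4          # illustrative sizes
x = torch.randn(batch, features)

# Regression: one real-valued output, float targets.
reg_head = nn.Linear(features, 1)
reg_loss = nn.MSELoss()(reg_head(x), torch.randn(batch, 1))

# Binary classification: one logit per sample, float targets (0.0 or 1.0).
bin_head = nn.Linear(features, 1)
bin_logits = bin_head(x).squeeze(1)             # shape (batch,)
bin_targets = torch.randint(0, 2, (batch,)).float()
bin_loss = nn.BCEWithLogitsLoss()(bin_logits, bin_targets)
bin_probs = torch.sigmoid(bin_logits)           # logits -> probabilities

# Multi-class: K logits per sample, int64 class-index targets.
multi_head = nn.Linear(features, num_classes)
multi_logits = multi_head(x)                    # shape (batch, K)
multi_targets = torch.randint(0, num_classes, (batch,))
multi_loss = nn.CrossEntropyLoss()(multi_logits, multi_targets)
multi_probs = torch.softmax(multi_logits, dim=1)  # each row sums to 1

print(bin_logits.shape, multi_logits.shape)
print(reg_loss.item(), bin_loss.item(), multi_loss.item())
```

Both classification losses take raw logits. The `sigmoid` and `softmax` calls are only needed when you want probabilities for reporting or plotting.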

Setup

PowerShell

uv init --python 3.12 classification
cd classification
uv add torch matplotlib scikit-learn numpy

python

main.py

import matplotlib.pyplot as plt
import numpy as np
import torch
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(42)

Multi-class — the blobs dataset

`sklearn.datasets.make_blobs` generates isotropic Gaussian clusters of points. With two features the clusters can be plotted directly, which makes it easy to see what a classifier is doing.

python

main.py

NUM_CLASSES = 4
NUM_FEATURES = 2

x_np, y_np = make_blobs(
    n_samples=1000,
    n_features=NUM_FEATURES,
    centers=NUM_CLASSES,
    cluster_std=1.5,
    random_state=42,
)

x = torch.from_numpy(x_np).float()
y = torch.from_numpy(y_np).long()

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

print(x_train.shape, y_train.shape, y_train[:10])

A few details that matter:

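Features are float32 (`.float()` converts NumPy's default float64). Class-index targets for `CrossEntropyLoss` must be int64 (the dtype `.long()` produces).
Targets are class indices (0, 1, 2, 3), not one-hot vectors. `CrossEntropyLoss` uses each index to pick out the log-probability of the correct class, so no one-hot encoding is needed.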
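As a quick check of the two points above, the following sketch compares `CrossEntropyLoss` on class-index targets with the same quantity computed by hand from an explicit one-hot matrix. The logits and targets are made-up values for illustration only.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)
logits = torch.randn(5, 4)                  # 5 samples, 4 classes (illustrative)
targets = torch.tensor([0, 3, 1, 2, 3])     # class indices, int64 by default

loss = nn.CrossEntropyLoss()(logits, targets)

# The same value computed by hand with an explicit one-hot matrix:
# mean over samples of -sum(one_hot * log_softmax(logits)).
one_hot = F.one_hot(targets, num_classes=4).float()
manual = -(one_hot * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(targets.dtype)                        # torch.int64
print(torch.allclose(loss, manual))         # True
```

Passing the indices directly avoids building the one-hot matrix and is the form the rest of this lesson uses.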

Visualize:

python

main.py

plt.scatter(x[:, 0], x[:, 1], c=y, cmap=plt.cm.RdYlBu, s=8)
plt.title("blobs")
plt.show()
