Fundamentals
Foundational mental models that are essential for understanding how neural networks work.
Introduction
We’ll work our way up from the simplest possible building blocks to show that we can build complicated functions made up of a “chain” of constituent functions and, even when one of these functions is a matrix multiplication that takes in multiple inputs, compute the derivative of the functions’ outputs with respect to their inputs.
To follow along, set up a project with NumPy:

```sh
uv init math
cd math
uv add numpy
```
Functions
As with neural nets, there are several ways to describe functions, none of which individually paints a complete picture.
Here are two examples of functions, described in mathematical notation:

$$f_1(x) = x^2$$

$$f_2(x) = \max(x, 0)$$

This notation says that the functions, which we arbitrarily call $f_1$ and $f_2$, take in a number $x$ as input and transform it into either $x^2$ (in the first case) or $\max(x, 0)$ (in the second case).
You can also write these functions in Python:
```python
def f1(x):
    return x ** 2

def f2(x):
    return max(x, 0)

assert f1(2) == 4
assert f2(2) == 2
```
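One caveat worth noting: `f2` as written only works on single numbers, since Python’s built-in `max` can’t compare an array with a scalar element-wise. The derivative code later in this section operates on NumPy arrays, so array-friendly versions are useful. Here is a minimal sketch (the names `f1_np` and `f2_np` are ours, introduced for illustration):

```python
import numpy as np
from numpy import ndarray

def f1_np(x: ndarray) -> ndarray:
    # Element-wise square of every entry in x
    return np.square(x)

def f2_np(x: ndarray) -> ndarray:
    # Element-wise max(x, 0) -- the "ReLU" function
    return np.maximum(x, 0)

assert (f1_np(np.array([2, 3])) == np.array([4, 9])).all()
assert (f2_np(np.array([-2, 3])) == np.array([0, 3])).all()
```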
Diagram
One way of depicting functions is to:
- Draw an x-y plane (where x refers to the horizontal axis and y refers to the vertical axis).
- Plot a bunch of points, where the x-coordinates of the points are (usually evenly spaced) inputs of the function over some range, and the y-coordinates are the outputs of the function over that range.
- Connect these plotted points.
Following these steps for $f_1$, for example, yields the familiar parabola in a coordinate system, as sketched below.
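Here is a minimal sketch of those three steps using matplotlib (an assumption on our part; matplotlib isn’t among the dependencies installed above, so add it first with `uv add matplotlib`):

```python
import numpy as np
import matplotlib.pyplot as plt

# Steps 1-2: evenly spaced inputs over a range, and their outputs under f1
x = np.linspace(-3, 3, 100)
y = x ** 2

# Step 3: connect the plotted points
plt.plot(x, y)
plt.xlabel("x")
plt.ylabel("y")
plt.title("f1(x) = x ** 2")
plt.show()
```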
However, there is another way to depict functions that isn’t as useful when learning calculus but that will be very useful for us when thinking about deep learning models.
We can think of functions as boxes that take in numbers as input and produce numbers as output, like minifactories that have their own internal rules for what happens to the input.
This figure shows both of these functions described as general rules and how they operate on specific inputs.
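The same idea can be sketched in a few lines of code: feed specific inputs into each “box” and observe what comes out (this reuses `f1` and `f2` from above).

```python
# Each function acts like a minifactory: numbers in, transformed numbers out
for x in [-2, -1, 0, 1, 2]:
    print(f"f1: {x:>2} -> {f1(x):>2}    f2: {x:>2} -> {f2(x):>2}")
```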
Derivatives
Derivatives, like functions, are an extremely important concept for understanding deep learning, and one that many of you are probably familiar with. Also like functions, they can be depicted in multiple ways. We’ll start by simply saying at a high level that the derivative of a function at a point is the “rate of change” of the output of the function with respect to its input at that point. Let’s now walk through the same perspectives on derivatives that we covered for functions to gain a better mental model for how derivatives work.
First, we’ll get mathematically precise: we can describe this number (how much the output of $f$ changes as we change its input at a particular value $a$ of the input) as a limit:

$$\frac{df}{dx}(a) = \lim_{\Delta \to 0} \frac{f(a + \Delta) - f(a - \Delta)}{2\Delta}$$

This limit can be approximated numerically by setting a very small value for $\Delta$, such as 0.001, so we can compute the derivative as:

$$\frac{df}{dx}(a) \approx \frac{f(a + 0.001) - f(a - 0.001)}{0.002}$$
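As a quick worked check, take $f_1(x) = x^2$ at $a = 2$, where the exact derivative is $2x = 4$:

$$\frac{f_1(2.001) - f_1(1.999)}{0.002} = \frac{4.004001 - 3.996001}{0.002} = 4$$

(The match is exact here because the central difference has no error on quadratics; in general it is only an approximation.)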
While accurate, this is only one part of a full mental model of derivatives.
Finally, we can code up the approximation to the derivative that we saw previously:
```python
from typing import Callable

from numpy import ndarray

def derive(func: Callable[[ndarray], ndarray],
           input_: ndarray,
           delta: float = 0.001) -> ndarray:
    '''
    Evaluates the derivative of a function "func" at every element
    in the "input_" array.
    '''
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)
```
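To sanity-check `derive`, here is a short usage sketch: since the derivative of $x^2$ is $2x$, evaluating at a few points should roughly double them.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
# d/dx x**2 = 2x, so we expect approximately [0., 2., 4., 6.]
print(derive(np.square, x))
```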