Fundamentals
Foundational mental models that are essential for understanding how neural networks work.
Introduction
We’ll work our way up from the simplest possible building blocks to show that we can build complicated functions made up of a “chain” of constituent functions and, even when one of these functions is a matrix multiplication that takes in multiple inputs, compute the derivative of the functions’ outputs with respect to their inputs.
To follow along, set up a project with NumPy:

```sh
uv init math
cd math
uv add numpy
```
Functions
As with neural nets, there are several ways to describe functions, none of which individually paints a complete picture.
Here are two examples of functions, described in mathematical notation:

$$f_1(x) = x^2$$

$$f_2(x) = \max(x, 0)$$

This notation says that the functions, which we arbitrarily call $f_1$ and $f_2$, take in a number $x$ as input and transform it into either $x^2$ (in the first case) or $\max(x, 0)$ (in the second case).
You can also write these functions in Python:
```python
def f1(x):
    return x ** 2

def f2(x):
    return max(x, 0)

assert f1(2) == 4
assert f2(2) == 2
```
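One caveat worth noting: `f2` as written only works on single numbers, since Python’s built-in `max` can’t compare an array with a scalar element-wise. The derivative code later in this section operates on NumPy arrays, so array-friendly versions are useful. Here is a minimal sketch (the names `f1_np` and `f2_np` are ours, introduced for illustration):

```python
import numpy as np
from numpy import ndarray

def f1_np(x: ndarray) -> ndarray:
    # Element-wise square of every entry in x
    return np.square(x)

def f2_np(x: ndarray) -> ndarray:
    # Element-wise max(x, 0) -- the "ReLU" function
    return np.maximum(x, 0)

assert (f1_np(np.array([2, 3])) == np.array([4, 9])).all()
assert (f2_np(np.array([-2, 3])) == np.array([0, 3])).all()
```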
Diagram
One way of depicting functions is to:
- Draw an x-y plane (where x refers to the horizontal axis and y refers to the vertical axis).
- Plot a bunch of points, where the x-coordinates of the points are (usually evenly spaced) inputs of the function over some range, and the y-coordinates are the outputs of the function over that range.
- Connect these plotted points.
Following these steps for $f_1$, for example, yields the familiar parabola in a coordinate system, as sketched below.
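Here is a minimal sketch of those three steps using matplotlib (an assumption on our part; matplotlib isn’t among the dependencies installed above, so add it first with `uv add matplotlib`):

```python
import numpy as np
import matplotlib.pyplot as plt

# Steps 1-2: evenly spaced inputs over a range, and their outputs under f1
x = np.linspace(-3, 3, 100)
y = x ** 2

# Step 3: connect the plotted points
plt.plot(x, y)
plt.xlabel("x")
plt.ylabel("y")
plt.title("f1(x) = x ** 2")
plt.show()
```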
However, there is another way to depict functions that isn’t as useful when learning calculus but that will be very useful for us when thinking about deep learning models.
We can think of functions as boxes that take in numbers as input and produce numbers as output, like minifactories that have their own internal rules for what happens to the input.
This figure shows both of these functions described as general rules and how they operate on specific inputs.
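The same idea can be sketched in a few lines of code: feed specific inputs into each “box” and observe what comes out (this reuses `f1` and `f2` from above).

```python
# Each function acts like a minifactory: numbers in, transformed numbers out
for x in [-2, -1, 0, 1, 2]:
    print(f"f1: {x:>2} -> {f1(x):>2}    f2: {x:>2} -> {f2(x):>2}")
```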
Derivatives
Derivatives, like functions, are an extremely important concept for understanding deep learning, and one that many of you are probably familiar with. Also like functions, they can be depicted in multiple ways. We’ll start by simply saying at a high level that the derivative of a function at a point is the “rate of change” of the output of the function with respect to its input at that point. Let’s now walk through the same perspectives on derivatives that we covered for functions to gain a better mental model for how derivatives work.
First, we’ll get mathematically precise: we can describe this number (how much the output of $f$ changes as we change its input at a particular value $a$ of the input) as a limit:

$$\frac{df}{dx}(a) = \lim_{\Delta \to 0} \frac{f(a + \Delta) - f(a - \Delta)}{2\Delta}$$

This limit can be approximated numerically by setting a very small value for $\Delta$, such as 0.001, so we can compute the derivative as:

$$\frac{df}{dx}(a) \approx \frac{f(a + 0.001) - f(a - 0.001)}{0.002}$$
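As a quick worked check, take $f_1(x) = x^2$ at $a = 2$, where the exact derivative is $2x = 4$:

$$\frac{f_1(2.001) - f_1(1.999)}{0.002} = \frac{4.004001 - 3.996001}{0.002} = 4$$

(The match is exact here because the central difference has no error on quadratics; in general it is only an approximation.)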
While accurate, this is only one part of a full mental model of derivatives.
Finally, we can code up the approximation to the derivative that we saw previously:
```python
from typing import Callable

from numpy import ndarray

def derive(func: Callable[[ndarray], ndarray],
           input_: ndarray,
           delta: float = 0.001) -> ndarray:
    '''
    Evaluates the derivative of a function "func" at every element
    in the "input_" array.
    '''
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)
```
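To sanity-check `derive`, here is a short usage sketch: since the derivative of $x^2$ is $2x$, evaluating at a few points should roughly double them.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
# d/dx x**2 = 2x, so we expect approximately [0., 2., 4., 6.]
print(derive(np.square, x))
```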