Pydantic

Pydantic is a data validation library built on Python type hints: {% link “/python/typing” %}

Install the pydantic library:

$ poetry add pydantic

A model is a class that inherits from BaseModel and declares its attributes with type annotations.

Models are very similar to a @dataclass, except that they are designed for:

  1. Validating and serializing JSON data.
  2. Generating JSON schemas.
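The difference from a dataclass can be sketched as follows. This is a minimal illustration; the PointData and PointModel classes are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ValidationError


@dataclass
class PointData:  # hypothetical example class
    x: int


class PointModel(BaseModel):  # hypothetical example class
    x: int


# A dataclass does not check the annotation at runtime.
plain = PointData(x="not a number")

# A Pydantic model validates the input when the object is created.
try:
    PointModel(x="not a number")
    rejected = False
except ValidationError:
    rejected = True

# A model can also describe itself as a JSON Schema (returned as a dict).
schema = PointModel.model_json_schema()
print(rejected)
print(schema["properties"]["x"]["type"])
```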

To parse JSON data, Pydantic uses jiter, a library written in Rust.

Below is an example of a User class that inherits from BaseModel and defines its fields as annotated attributes:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str | None = None

The model can then be instantiated:

user: User = User(id=1, name="David")

Initialization of the object will perform all parsing and validation.

If no ValidationError exception is raised, you know the resulting model instance is valid:

assert user.id == 1
assert user.name == "David"

But if you write this code, mypy will report it as an error:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str | None = None

david: User = User(name="David")

And pydantic raises an error at runtime:

> python test.py
...
pydantic_core._pydantic_core.ValidationError: 1 validation error for User
id
  Field required [type=missing, input_value={'name': 'David'}, input_type=dict]

A single exception will be raised regardless of the number of errors found, and that validation error will contain information about all of the errors and how they happened.
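To inspect the individual errors in code, ValidationError offers the errors() method, which returns one dict per problem. A minimal sketch, using a hypothetical Account model:

```python
from pydantic import BaseModel, ValidationError


class Account(BaseModel):  # hypothetical model for illustration
    id: int
    name: str


errors = []
try:
    # Both fields are invalid, but only one exception is raised.
    Account(id="abc", name=123)
except ValidationError as e:
    errors = e.errors()

# Each entry says where the error happened, its type and a message.
for error in errors:
    print(error["loc"], error["type"], error["msg"])
```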

By default, models are mutable and field values can be changed through attribute assignment:

user.id = 321
assert user.id == 321
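Note that, by default, values assigned after creation are not validated. If you want assignments to be checked too, the validate_assignment setting can be enabled. A sketch with two hypothetical models, LooseUser and CheckedUser:

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LooseUser(BaseModel):  # hypothetical: default behaviour
    id: int


class CheckedUser(BaseModel):  # hypothetical: assignments are validated
    model_config = ConfigDict(validate_assignment=True)
    id: int


loose = LooseUser(id=1)
loose.id = "not an int"  # accepted: assignment is not validated by default

checked = CheckedUser(id=1)
try:
    checked.id = "not an int"
    assignment_rejected = False
except ValidationError:
    assignment_rejected = True

print(assignment_rejected)
```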

Pydantic uses a dict to store the data, so you can unpack a dict with ** to create a User.

If you create objects from data that comes from external systems, there is no guarantee that the data is correct:

from pydantic import BaseModel
from typing import Any

class User(BaseModel):
    id: int
    name: str | None = None

data: Any = {"id": 1, "name": "David"}
User(**data)

data = {"name": "apple", "price": 3}
User(**data)  # Validation error: the required field id is missing

Pydantic provides three methods on model classes for parsing data:

1.- model_validate()

This is very similar to the __init__ method of the model, except it takes a dictionary or an object rather than keyword arguments. If the object passed cannot be validated, or if it’s not a dictionary or instance of the model in question, a ValidationError will be raised.

from datetime import datetime
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name: str = 'John Doe'
    signup_ts: datetime | None = None

user = User.model_validate({'id': 123, 'name': 'James'})
print(user)
#> id=123 name='James' signup_ts=None

try:
    User.model_validate(['not', 'a', 'dict'])
except ValidationError as e:
    print(e)
    """
    1 validation error for User
      Input should be a valid dictionary or instance of User [type=model_type, input_value=['not', 'a', 'dict'], input_type=list]
    """

2.- model_validate_json()

This validates the provided data as a JSON string or bytes object. If your incoming data is a JSON payload, this is generally faster than parsing the JSON into a dictionary yourself and then validating that dictionary.

user = User.model_validate_json('{"id": 123, "name": "James"}')
print(user)
#> id=123 name='James' signup_ts=None

try:
    user = User.model_validate_json('{"id": 123, "name": 123}')
except ValidationError as e:
    print(e)
    """
    1 validation error for User
    name
      Input should be a valid string [type=string_type, input_value=123, input_type=int]
    """

try:
    user = User.model_validate_json('invalid JSON')
except ValidationError as e:
    print(e)
    """
    1 validation error for User
      Invalid JSON: expected value at line 1 column 1 [type=json_invalid, input_value='invalid JSON', input_type=str]
    """

3.- model_validate_strings()

This takes a dictionary (can be nested) with string keys and values and validates the data in JSON mode so that said strings can be coerced into the correct types.

user = User.model_validate_strings({'id': '123', 'name': 'James'})
print(user)
#> id=123 name='James' signup_ts=None

user = User.model_validate_strings(
    {'id': '123', 'name': 'James', 'signup_ts': '2024-04-01T12:00:00'}
)
print(user)
#> id=123 name='James' signup_ts=datetime.datetime(2024, 4, 1, 12, 0)

try:
    user = User.model_validate_strings(
        {'id': '123', 'name': 'James', 'signup_ts': '2024-04-01'}, strict=True
    )
except ValidationError as e:
    print(e)
    """
    1 validation error for User
    signup_ts
      Input should be a valid datetime, invalid datetime separator, expected `T`, `t`, `_` or space [type=datetime_parsing, input_value='2024-04-01', input_type=str]
    """

The model instance can be serialized to a dict using the model_dump method. For the instance User(id=1, name="David") created at the start:

assert user.model_dump() == {'id': 1, 'name': 'David'}

The .model_dump_json() method serializes a model directly to a JSON-encoded string that is equivalent to the result produced by .model_dump().

from datetime import datetime
from pydantic import BaseModel

class BarModel(BaseModel):
    whatever: int

class FooBarModel(BaseModel):
    foo: datetime
    bar: BarModel

m = FooBarModel(foo=datetime(2032, 6, 1, 12, 13, 14), bar={'whatever': 123})
print(m.model_dump_json())
#> {"foo":"2032-06-01T12:13:14","bar":{"whatever":123}}
print(m.model_dump_json(indent=2))
"""
{
  "foo": "2032-06-01T12:13:14",
  "bar": {
    "whatever": 123
  }
}
"""

A model can use other models as field types.

Given this diagram:

classDiagram
direction LR

class Order {
    id: int
}

class Client {
    id: int
    name: str
}

Order --> Client

You can write this code:

from pydantic import BaseModel

class Client(BaseModel):
    id: int
    name: str

class Order(BaseModel):
    id: int
    client: Client

data = {"id": 1, "client": {"id": 45, "name": "David"}}
order: Order = Order.model_validate(data)
assert order.client.id == 45

Write the classes that correspond to this diagram:

classDiagram
direction LR

class Order {
    id: int
}

class Client {
    id: int
    name: str
}

class Product {
    id: int
    name: str
    price: float

}

class OrderItem {
    quantity: int
}

Order --> Client
Order --> "1..*" OrderItem
OrderItem --> Product

{% sol %} TODO {% endsol %}

Create an Order object from a dict:

{% sol %} TODO {% endsol %}
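One possible approach to both exercises is sketched below. It is not the official solution: the diagram does not name the associations, so the field names client, items and product are assumptions, and the multiplicity on the Order to OrderItem association is read as "one or more" (expressed here with Field(min_length=1)).

```python
from pydantic import BaseModel, Field


class Client(BaseModel):
    id: int
    name: str


class Product(BaseModel):
    id: int
    name: str
    price: float


class OrderItem(BaseModel):
    quantity: int
    product: Product  # assumed field name for OrderItem --> Product


class Order(BaseModel):
    id: int
    client: Client
    # Assumed field name; min_length=1 requires at least one item.
    items: list[OrderItem] = Field(min_length=1)


# Nested dicts and lists are validated into nested models.
data = {
    "id": 1,
    "client": {"id": 45, "name": "David"},
    "items": [
        {"quantity": 2, "product": {"id": 7, "name": "apple", "price": 0.5}},
    ],
}
order = Order.model_validate(data)
print(order.items[0].product.name)
```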

The Field function is used to customize and add metadata to fields of models.

There are some keyword arguments that can be used to constrain numeric values:

  • gt - greater than
  • lt - less than
  • ge - greater than or equal to
  • le - less than or equal to
  • multiple_of - a multiple of the given number
  • allow_inf_nan - allow 'inf', '-inf', 'nan' values

Here’s an example:

from pydantic import BaseModel, Field

class Foo(BaseModel):
    positive: int = Field(gt=0)
    non_negative: int = Field(ge=0)
    negative: int = Field(lt=0)
    non_positive: int = Field(le=0)
    even: int = Field(multiple_of=2)
    love_for_pydantic: float = Field(allow_inf_nan=True)

foo = Foo(
    positive=1,
    non_negative=0,
    negative=-1,
    non_positive=0,
    even=2,
    love_for_pydantic=float('inf'),
)
print(foo)
"""
positive=1 non_negative=0 negative=-1 non_positive=0 even=2 love_for_pydantic=inf
"""

There are keyword arguments that can be used to constrain strings:

  • min_length: Minimum length of the string.
  • max_length: Maximum length of the string.
  • pattern: A regular expression that the string must match.

Here’s an example:

from pydantic import BaseModel, Field

class Foo(BaseModel):
    short: str = Field(min_length=3)
    long: str = Field(max_length=10)
    regex: str = Field(pattern=r'^\d*$')

foo = Foo(short='foo', long='foobarbaz', regex='123')
print(foo)
#> short='foo' long='foobarbaz' regex='123'
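When a string constraint is violated, each failing field is reported with its own error type. A sketch with a hypothetical Signup model (the field names and the five-digit pattern are assumptions for illustration):

```python
from pydantic import BaseModel, Field, ValidationError


class Signup(BaseModel):  # hypothetical model for illustration
    username: str = Field(min_length=3, max_length=10)
    zip_code: str = Field(pattern=r'^\d{5}$')


error_types = {}
try:
    # 'ab' is too short and 'abc' does not match the pattern.
    Signup(username='ab', zip_code='abc')
except ValidationError as e:
    error_types = {err['loc'][0]: err['type'] for err in e.errors()}

print(error_types)
```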

The parameter frozen is used to emulate the frozen dataclass behaviour. It is used to prevent the field from being assigned a new value after the model is created (immutability).

from pydantic import BaseModel, Field, ValidationError

class User(BaseModel):
    name: str = Field(frozen=True)
    age: int

user = User(name='John', age=42)

try:
    user.name = 'Jane'
except ValidationError as e:
    print(e)
    """
    1 validation error for User
    name
      Field is frozen [type=frozen_field, input_value='Jane', input_type=str]
    """

More information at https://docs.pydantic.dev/latest/concepts/fields/

With pydantic you can consume JSON data: Pydantic provides built-in JSON parsing.

In this example, you request strict validation:

from datetime import date
from typing import Tuple
from pydantic import BaseModel, ConfigDict, ValidationError

class Event(BaseModel):
    model_config = ConfigDict(strict=True)
    when: date
    where: Tuple[int, int]

data: str = '{"when": "1987-01-28", "where": [51, -1]}'
event: Event = Event.model_validate_json(data)
assert event.where[0] == 51
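Strict mode behaves differently for JSON input and for Python input, because JSON has no date or tuple type. The sketch below reuses the Event model (written with the built-in tuple type) to compare the two:

```python
from datetime import date

from pydantic import BaseModel, ConfigDict, ValidationError


class Event(BaseModel):
    model_config = ConfigDict(strict=True)
    when: date
    where: tuple[int, int]


# Strict JSON validation still accepts an ISO date string and an array.
event = Event.model_validate_json('{"when": "1987-01-28", "where": [51, -1]}')

# The same values as Python objects are rejected: a str is not a date
# and a list is not a tuple.
try:
    Event.model_validate({"when": "1987-01-28", "where": [51, -1]})
    python_rejected = False
except ValidationError:
    python_rejected = True

# Strict mode does not convert the JSON string "51" to an int.
try:
    Event.model_validate_json('{"when": "1987-01-28", "where": ["51", -1]}')
    json_rejected = False
except ValidationError:
    json_rejected = True

print(python_rejected, json_rejected)
```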

More information at https://docs.pydantic.dev/latest/concepts/json/


The content of this website is licensed under CC BY-NC-ND 4.0.

©2022-2025 xtec.dev