Pydantic automatically validates and serializes the JSON data you consume or produce.
Introduction
Pydantic is a data validation library built on Python type hints (the typing module).
Install the pydantic library:
$ poetry add pydantic
Models
A model is a class that inherits from BaseModel and annotates its attributes with types.
Models are very similar to a @dataclass, except that they are designed for:
- Validation and serialization of JSON data.
- Generation of JSON schemas.
To parse JSON, Pydantic uses jiter, a library written in Rust.
Here is an example of a User class that inherits from BaseModel and defines its fields as annotated attributes:
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str | None = None
The model can then be instantiated:
user: User = User(id=1, name="David")
Initialization of the object will perform all parsing and validation.
If no ValidationError exception is raised, you know the resulting model instance is valid:
assert user.id == 1
assert user.name == "David"
But if you write this code, mypy reports it as an error:
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str | None = None

david: User = User(name="David")
And Pydantic raises an error at runtime:
> python test.py
...
pydantic_core._pydantic_core.ValidationError: 1 validation error for User
id
  Field required [type=missing, input_value={'name': 'David'}, input_type=dict]
A single exception will be raised regardless of the number of errors found, and that validation error will contain information about all of the errors and how they happened.
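As a minimal sketch of that behaviour, the following uses a hypothetical Product model (not part of the examples above) with two invalid values. It assumes the errors() and error_count() methods of ValidationError to inspect each individual problem:

```python
from pydantic import BaseModel, ValidationError


class Product(BaseModel):  # hypothetical model, for illustration only
    id: int
    price: float


try:
    # Both values are invalid: neither string can be parsed as a number.
    Product.model_validate({"id": "abc", "price": "cheap"})
except ValidationError as e:
    errors = e.errors()  # one dict per error, all carried by a single exception
    print(e.error_count())
    for error in errors:
        # "loc" is the path to the offending field, "type" is the error code
        print(error["loc"], error["type"], error["msg"])
```

Each entry describes one failure, so you can report every problem to the caller in one pass instead of fixing them one at a time.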
By default, models are mutable and field values can be changed through attribute assignment:
user.id = 321
assert user.id == 321
Validating data
Pydantic keeps a model's data in a dict, so you can create a User directly from a dict by unpacking it as keyword arguments.
If you create objects from data that comes from external systems, there is no guarantee that the data is correct:
from typing import Any

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str | None = None

data: Any = {"id": 1, "name": "David"}
User(**data)

data = {"name": "apple", "price": 3}
User(**data)  # Raises ValidationError: the required field id is missing
Pydantic provides three methods on model classes for parsing data:
1.- model_validate()
This is very similar to the __init__ method of the model, except that it takes a dictionary or an object rather than keyword arguments. If the object passed cannot be validated, or if it is not a dictionary or an instance of the model in question, a ValidationError is raised.
from datetime import datetime

from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name: str = 'John Doe'
    signup_ts: datetime | None = None

user = User.model_validate({'id': 123, 'name': 'James'})
print(user)
#> id=123 name='James' signup_ts=None

try:
    User.model_validate(['not', 'a', 'dict'])
except ValidationError as e:
    print(e)
    """
    1 validation error for User
      Input should be a valid dictionary or instance of User [type=model_type, input_value=['not', 'a', 'dict'], input_type=list]
    """
2.- model_validate_json()
This validates the provided data as a JSON string or bytes object. If your incoming data is a JSON payload, this is generally considered faster than manually parsing the data into a dictionary first.
user = User.model_validate_json('{"id": 123, "name": "James"}')
print(user)
#> id=123 name='James' signup_ts=None

try:
    user = User.model_validate_json('{"id": 123, "name": 123}')
except ValidationError as e:
    print(e)
    """
    1 validation error for User
    name
      Input should be a valid string [type=string_type, input_value=123, input_type=int]
    """

try:
    user = User.model_validate_json('invalid JSON')
except ValidationError as e:
    print(e)
    """
    1 validation error for User
      Invalid JSON: expected value at line 1 column 1 [type=json_invalid, input_value='invalid JSON', input_type=str]
    """
3.- model_validate_strings()
This takes a dictionary (can be nested) with string keys and values and validates the data in JSON mode so that said strings can be coerced into the correct types.
user = User.model_validate_strings({'id': '123', 'name': 'James'})
print(user)
#> id=123 name='James' signup_ts=None

user = User.model_validate_strings(
    {'id': '123', 'name': 'James', 'signup_ts': '2024-04-01T12:00:00'}
)
print(user)
#> id=123 name='James' signup_ts=datetime.datetime(2024, 4, 1, 12, 0)

try:
    user = User.model_validate_strings(
        {'id': '123', 'name': 'James', 'signup_ts': '2024-04-01'}, strict=True
    )
except ValidationError as e:
    print(e)
    """
    1 validation error for User
    signup_ts
      Input should be a valid datetime, invalid datetime separator, expected `T`, `t`, `_` or space [type=datetime_parsing, input_value='2024-04-01', input_type=str]
    """
Serialization
The model instance can be serialized using the model_dump method:
assert user.model_dump() == {'id': 1, 'name': 'David'}
The .model_dump_json() method serializes a model directly to a JSON-encoded string that is equivalent to the result produced by .model_dump().
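A minimal sketch of how the two methods relate, using a hypothetical Invoice model (an assumption, not part of this tutorial). It assumes the mode="json" argument of model_dump(), which converts values such as datetime into JSON-compatible types:

```python
import json
from datetime import datetime

from pydantic import BaseModel


class Invoice(BaseModel):  # hypothetical model, for illustration only
    id: int
    issued: datetime


invoice = Invoice(id=7, issued=datetime(2024, 4, 1, 12, 0))

as_python = invoice.model_dump()                  # keeps Python objects (datetime)
as_json_ready = invoice.model_dump(mode="json")   # only JSON-compatible values
as_json_text = invoice.model_dump_json()          # the JSON string itself

# The JSON string decodes to the same data as the JSON-mode dump,
# and validating it again rebuilds an equal model instance.
restored = Invoice.model_validate_json(as_json_text)
print(json.loads(as_json_text) == as_json_ready, restored == invoice)
```

Note that the plain model_dump() result still contains a datetime object, so it is the JSON-mode dump that matches the string output exactly.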
from datetime import datetime

from pydantic import BaseModel

class BarModel(BaseModel):
    whatever: int

class FooBarModel(BaseModel):
    foo: datetime
    bar: BarModel

m = FooBarModel(foo=datetime(2032, 6, 1, 12, 13, 14), bar={'whatever': 123})
print(m.model_dump_json())
#> {"foo":"2032-06-01T12:13:14","bar":{"whatever":123}}
print(m.model_dump_json(indent=2))
"""
{
  "foo": "2032-06-01T12:13:14",
  "bar": {
    "whatever": 123
  }
}
"""
Nested models
A model can use other models.
If you have this diagram:
classDiagram
    direction LR
    class Order {
        id: int
    }
    class Client {
        id: int
        name: str
    }
    Order --> Client
You can write this code:
from pydantic import BaseModel

class Client(BaseModel):
    id: int
    name: str

class Order(BaseModel):
    id: int
    client: Client

data = {"id": 1, "client": {"id": 45, "name": "David"}}
order: Order = Order.model_validate(data)
assert order.client.id == 45
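Validation also runs on the nested model. As a minimal sketch with the same Client and Order classes, an invalid value inside the nested dict is reported with the full path to the field (this assumes the "loc" entry returned by ValidationError.errors()):

```python
from pydantic import BaseModel, ValidationError


class Client(BaseModel):
    id: int
    name: str


class Order(BaseModel):
    id: int
    client: Client


try:
    # The client id cannot be parsed as an integer.
    Order.model_validate({"id": 1, "client": {"id": "abc", "name": "David"}})
except ValidationError as e:
    errors = e.errors()
    # "loc" walks from the outer model down to the failing field.
    print(errors[0]["loc"], errors[0]["type"])
```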
Activity
Generate the classes that correspond to this diagram:
classDiagram
    direction LR
    class Order {
        id: int
    }
    class Client {
        id: int
        name: str
    }
    class Product {
        id: int
        name: str
        price: float
    }
    class OrderItem {
        quantity: int
    }
    Order --> Client
    Order --> "1..*" OrderItem
    OrderItem --> Product
TODO
Create an Order object from a dict:
TODO
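One possible way to approach the activity, as a sketch rather than the reference solution. The diagram does not name the relationship attributes, so client, items and product are assumed names, and the sample data is invented for illustration:

```python
from pydantic import BaseModel


class Client(BaseModel):
    id: int
    name: str


class Product(BaseModel):
    id: int
    name: str
    price: float


class OrderItem(BaseModel):
    quantity: int
    product: Product  # assumed attribute name for OrderItem --> Product


class Order(BaseModel):
    id: int
    client: Client          # assumed attribute name for Order --> Client
    items: list[OrderItem]  # assumed name; an order holds one or more items


# Invented sample data, shaped like the nested models above.
data = {
    "id": 1,
    "client": {"id": 45, "name": "David"},
    "items": [
        {"quantity": 2, "product": {"id": 10, "name": "apple", "price": 1.5}},
        {"quantity": 1, "product": {"id": 11, "name": "melon", "price": 4.0}},
    ],
}

order = Order.model_validate(data)
total = sum(item.quantity * item.product.price for item in order.items)
print(order.client.name, len(order.items), total)
```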
Field
The Field function is used to customize and add metadata to fields of models.
Numeric Constraints
There are some keyword arguments that can be used to constrain numeric values:
- gt: greater than
- lt: less than
- ge: greater than or equal to
- le: less than or equal to
- multiple_of: a multiple of the given number
- allow_inf_nan: allow 'inf', '-inf', 'nan' values
Here's an example:
from pydantic import BaseModel, Field

class Foo(BaseModel):
    positive: int = Field(gt=0)
    non_negative: int = Field(ge=0)
    negative: int = Field(lt=0)
    non_positive: int = Field(le=0)
    even: int = Field(multiple_of=2)
    love_for_pydantic: float = Field(allow_inf_nan=True)

foo = Foo(
    positive=1,
    non_negative=0,
    negative=-1,
    non_positive=0,
    even=2,
    love_for_pydantic=float('inf'),
)
print(foo)
"""
positive=1 non_negative=0 negative=-1 non_positive=0 even=2 love_for_pydantic=inf
"""
String Constraints
There are keyword arguments that can be used to constrain strings:
- min_length: Minimum length of the string.
- max_length: Maximum length of the string.
- pattern: A regular expression that the string must match.
Here's an example:
from pydantic import BaseModel, Field

class Foo(BaseModel):
    short: str = Field(min_length=3)
    long: str = Field(max_length=10)
    regex: str = Field(pattern=r'^\d*$')

foo = Foo(short='foo', long='foobarbaz', regex='123')
print(foo)
#> short='foo' long='foobarbaz' regex='123'
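The examples in this section only show values that pass. As a minimal sketch of the failing case, the hypothetical Item model below (not from the tutorial) combines numeric and string constraints and is given values that break each one. The error codes are read from ValidationError.errors():

```python
from pydantic import BaseModel, Field, ValidationError


class Item(BaseModel):  # hypothetical model, for illustration only
    quantity: int = Field(gt=0)
    even: int = Field(multiple_of=2)
    short: str = Field(min_length=3)
    regex: str = Field(pattern=r"^\d*$")


try:
    # Every value violates the constraint declared on its field.
    Item(quantity=0, even=3, short="fo", regex="12a")
except ValidationError as e:
    # Map each field name to the error code reported for it.
    error_types = {err["loc"][0]: err["type"] for err in e.errors()}
    for field_name, error_type in error_types.items():
        print(field_name, error_type)
```

All four violations arrive in one ValidationError, each with its own error code, so a caller can tell a too-short string apart from a pattern mismatch.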
Immutability
The frozen parameter is used to emulate the frozen dataclass behaviour. It prevents the field from being assigned a new value after the model is created (immutability).
from pydantic import BaseModel, Field, ValidationError

class User(BaseModel):
    name: str = Field(frozen=True)
    age: int

user = User(name='John', age=42)

try:
    user.name = 'Jane'
except ValidationError as e:
    print(e)
    """
    1 validation error for User
    name
      Field is frozen [type=frozen_field, input_value='Jane', input_type=str]
    """
More information at https://docs.pydantic.dev/latest/concepts/fields/
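Immutability can also be applied to a whole model instead of a single field. A minimal sketch, assuming the model-level setting ConfigDict(frozen=True) and the model_copy() method to obtain a modified copy:

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class User(BaseModel):
    # Freeze every field of the model, not just one.
    model_config = ConfigDict(frozen=True)

    name: str
    age: int


user = User(name="John", age=42)

try:
    user.age = 43
except ValidationError as e:
    error_type = e.errors()[0]["type"]
    print(error_type)

# The original stays untouched; changes go into a new instance.
older = user.model_copy(update={"age": 43})
print(user.age, older.age)
```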
JSON
Parsing
With Pydantic you can consume JSON data through its built-in JSON parsing.
In this example, you request strict validation:
from datetime import date
from typing import Tuple

from pydantic import BaseModel, ConfigDict, ValidationError

class Event(BaseModel):
    model_config = ConfigDict(strict=True)
    when: date
    where: Tuple[int, int]

data: str = '{"when": "1987-01-28", "where": [51, -1]}'
event: Event = Event.model_validate_json(data)
assert event.where[0] == 51
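Strict mode behaves differently for JSON input and for Python objects, because JSON has no date or tuple type. A minimal sketch with the same Event model (written here with the built-in tuple type): the JSON payload is accepted, while the equivalent Python dict is rejected.

```python
from datetime import date

from pydantic import BaseModel, ConfigDict, ValidationError


class Event(BaseModel):
    model_config = ConfigDict(strict=True)

    when: date
    where: tuple[int, int]


# From JSON: the date string and the array are converted even in strict mode.
event = Event.model_validate_json('{"when": "1987-01-28", "where": [51, -1]}')
print(event.when, event.where)

try:
    # From Python objects: a str is not a date and a list is not a tuple.
    Event.model_validate({"when": "1987-01-28", "where": [51, -1]})
except ValidationError as e:
    strict_error_types = {err["type"] for err in e.errors()}
    print(sorted(strict_error_types))
```

In practice this means a strict model should be fed raw JSON through model_validate_json() rather than a dict produced by json.loads().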
More information at https://docs.pydantic.dev/latest/concepts/json/