Model subclass instances #492

samuelcolvin · 2023-03-29T11:49:04Z

TODO:

make sure revalidate works without from_attributes
do we need to remove warnings from model serialization on extra fields? - turns out I already thought about this!
implement the same logic on dataclasses
I think this now means we're calling __instancecheck__ in Python on every validation, which is going to be super slow. We might need our own fast isinstance check, which in turn will need to work with the nascent PydanticGenericAlias - this only seems to add ~10%, which I think is fine
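For illustration only, here is a pure-Python sketch of what a "fast isinstance check" could look like. It is not the pydantic-core implementation (which lives in Rust); `CountingMeta` and `fast_isinstance` are hypothetical names. The idea is to look for the class in the value's MRO first, and to fall back to the regular `isinstance` (and therefore `__instancecheck__`) only when that fails, so virtual subclasses registered on an ABC would still be handled.

```python
class CountingMeta(type):
    """Hypothetical metaclass that counts how often __instancecheck__ runs."""

    calls = 0

    def __instancecheck__(cls, instance):
        CountingMeta.calls += 1
        return super().__instancecheck__(instance)


class Base(metaclass=CountingMeta):
    pass


class Child(Base):
    pass


def fast_isinstance(value, cls):
    # Cheap path: a real subclass always has `cls` in its MRO.
    # Slow path: defer to isinstance, which may call __instancecheck__.
    return cls in type(value).__mro__ or isinstance(value, cls)


child = Child()

plain_result = isinstance(child, Base)       # goes through __instancecheck__
calls_after_plain = CountingMeta.calls

fast_result = fast_isinstance(child, Base)   # answered from the MRO
calls_after_fast = CountingMeta.calls

miss_result = fast_isinstance(object(), Base)  # MRO miss, falls back
calls_after_miss = CountingMeta.calls
```

The counters make the trade-off visible: the MRO hit does not touch `__instancecheck__`, while a miss still pays for the fallback.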

This allows both instances of the model, and instances of subclasses of the model to be validated.

In strict mode, only instances of the exact model type are allowed - this is up for discussion.

In both strict and lax mode, regardless of whether we have an exact instance or subclass instance, the validate_instances config flag (can also be set on the validator directly) forces fields to be revalidated.

When re-validating, __fields_set__ from the original instance is reused.
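A minimal sketch of the behaviour described above, written against the released Pydantic V2 API rather than this branch. It assumes the `revalidate_instances` config key from released V2, which may not match the flag name used in this PR; the model names are made up for illustration.

```python
from pydantic import BaseModel, ConfigDict


class User(BaseModel):
    name: str


class AdminUser(User):
    secret: str


class Holder(BaseModel):
    # Default config: an instance of User (or a subclass) is accepted as-is.
    user: User


class CheckedUser(BaseModel):
    # Assumed released-V2 config key; forces fields to be revalidated.
    model_config = ConfigDict(revalidate_instances='always')
    name: str


class CheckedAdmin(CheckedUser):
    secret: str


class CheckedHolder(BaseModel):
    user: CheckedUser


kept = Holder(user=AdminUser(name='John', secret='s3cret'))
revalidated = CheckedHolder(user=CheckedAdmin(name='John', secret='s3cret'))

print(type(kept.user).__name__)
print(type(revalidated.user).__name__)
```

Without revalidation the subclass instance passes through unchanged; with revalidation the value is rebuilt from the field's declared type.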

Leaking private data and "the FastAPI problem"

A big problem FastAPI had was that allowing subclasses meant private information on subclasses was being leaked during serialisation.

E.g.:

from pydantic import BaseModel


class DataModel(BaseModel):
    name: str
    age: int


class ResponseModel(BaseModel):
    data: DataModel


class PrivateDataModel(DataModel):
    massive_secret: str


def my_view():
    data = PrivateDataModel(
        name='John',
        age=30,
        massive_secret=(
            'generative AI is good at inferring meaning and bad at '
            'calculating an answer LLMs are over-hyped'
        ),
    )
    response = ResponseModel(data=data)
    return response.json(indent=2)


print(my_view())
# output includes "massive_secret" in V1, but not in V2

However, this is not a problem with Pydantic V2, since the serializer built for ResponseModel uses DataModel to build the serialization logic for .data, and only fields of DataModel are included.

You can run the above on pydantic main now and it works correctly (you'll need to install pydantic-core from this branch and change .json to .model_dump_json, but otherwise it'll work).
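To make the difference concrete without depending on any pydantic version, here is a plain-Python sketch using stdlib dataclasses. It is only an analogy for "serialize by the declared field type", not how pydantic-core is implemented; `dump_as` is a hypothetical helper.

```python
from dataclasses import dataclass, fields


@dataclass
class DataModel:
    name: str
    age: int


@dataclass
class PrivateDataModel(DataModel):
    massive_secret: str


def dump_as(declared_type, value):
    """Serialize `value` using only the fields declared on `declared_type`."""
    return {f.name: getattr(value, f.name) for f in fields(declared_type)}


data = PrivateDataModel(name='John', age=30, massive_secret='hidden')

# Driven by the declared type (the V2 approach described above).
by_declared_type = dump_as(DataModel, data)

# Driven by the runtime type (how the leak happens).
by_runtime_type = dump_as(type(data), data)

print(by_declared_type)
print(by_runtime_type)
```

Because the field list comes from the annotation rather than from the object, extra attributes on a subclass never reach the output.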

@tiangolo please confirm you're happy with this approach.

codspeed-hq · 2023-03-29T11:53:43Z

CodSpeed Performance Report

Merging #492 model-subclass-instances (865d1fb) will not alter performance.

Summary

🔥 0 improvements
❌ 0 regressions
✅ 93 untouched benchmarks

🆕 2 new benchmarks
⁉️ 0 dropped benchmarks

Benchmarks breakdown

	Benchmark	`main`	`model-subclass-instances`	Change
🆕	`test_model_instance`	N/A	47.2 µs	N/A
🆕	`test_model_instance_abc`	N/A	48.3 µs	N/A

codecov-commenter · 2023-03-29T12:39:40Z

Codecov Report

Merging #492 (64d2efb) into main (12f596d) will decrease coverage by 0.09%.
The diff coverage is 87.77%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #492      +/-   ##
==========================================
- Coverage   94.73%   94.65%   -0.09%     
==========================================
  Files          93       93              
  Lines       11665    11686      +21     
  Branches       25       25              
==========================================
+ Hits        11051    11061      +10     
- Misses        607      618      +11     
  Partials        7        7

Impacted Files	Coverage Δ
src/input/input_abstract.rs	`87.62% <33.33%> (ø)`
src/validators/dataclass.rs	`95.69% <81.81%> (-1.46%)`	⬇️
src/validators/model.rs	`98.61% <97.56%> (+1.55%)`	⬆️
pydantic_core/core_schema.py	`96.72% <100.00%> (+<0.01%)`	⬆️
src/input/input_python.rs	`98.26% <100.00%> (-0.01%)`	⬇️

... and 1 file with indirect coverage changes

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12f596d...64d2efb. Read the comment docs.

tiangolo · 2023-03-29T20:02:08Z

I'm happy with this approach! Thanks for pinging and tagging me! 🙇

More details:

I was having a small doubt I wanted to confirm on the laptop, thinking about non-model fields. E.g. list[X], and how serialization would work in that case for a non-model.

But I realized, first, I would probably take whatever is declared in the FastAPI path operation function as a Pydantic field, and I'm gonna be the one that creates ResponseModel with that field behind the scenes (which is what I already do in v1).

And then I double checked/confirmed it worked as expected by slightly modifying the example to use a list[DataModel] and it worked, so I think this is perfect. This will allow me to remove a lot of (potentially fragile) workaround cruft in FastAPI and definitely improve performance on that side. 🚀 🎉

Thanks again for doing this! 🙇

Modified example with list[DataModel] for completeness:

from pydantic import BaseModel


class DataModel(BaseModel):
    name: str
    age: int


class ResponseModel(BaseModel):
    data: list[DataModel]


class PrivateDataModel(DataModel):
    massive_secret: str


def my_view():
    data = PrivateDataModel(
        name='John',
        age=30,
        massive_secret=(
            'generative AI is good at inferring meaning and bad at '
            'calculating an answer LLMs are over-hyped'
        ),
    )
    response = ResponseModel(data=[data])
    return response.model_dump_json(indent=2)


print(my_view())
# output includes "massive_secret" in V1, but not in V2

dmontagu · 2023-03-30T00:28:22Z

> In both strict and lax mode, regardless of whether we have an exact instance or subclass instance, the validate_instances config flag (can also be set on the validator directly) forces fields to be revalidated.

As discussed elsewhere (but want to make sure it doesn't get lost) — I think that it is more likely to be useful to have strict mode allow subclass instances. (And I think it will definitely be an early feature request if we don't do that now.)

The reason I say this is that, over the years, many people requested that FastAPI be stricter about what is accepted during parsing (i.e., not coercing strings to int, etc.). But I think that is largely orthogonal to the behavior of how handling subclasses works, since that is really a pure-python-API consideration (you don't have class information while parsing, of course).

Considering that even with the strictest possible settings, type checkers such as mypy and pyright will not produce errors if you pass proper subclass instances to fields, I expect that, at least among the users it affects, it will be much more popular to allow subclass instances than not. (I could be convinced otherwise if someone could share a compelling use case for this, I haven't seen one though; you can use Final to make it a type error to subclass a class, and it just seems pretty contrived to me that you'd want to allow a class to be subclassed, but disallow subclass instances from being used in your pydantic models.)

I know I personally would have preferred both to have strict parsing of primitives and to allow subclass instances to pass validation from the python API (i.e., being allowed as inputs to __init__) in all of the FastAPI application codebases I have built over the past years.
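A small sketch of that combination, assuming released Pydantic V2, where the follow-up mentioned in this thread (#498) allowed subclass instances in strict mode. The model names are made up for illustration.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Owner(BaseModel):
    name: str


class AdminOwner(Owner):
    level: int


class Account(BaseModel):
    # Strict parsing of primitives for this model's own fields.
    model_config = ConfigDict(strict=True)
    owner: Owner
    balance: int


# A subclass instance is accepted for the `owner` field.
account = Account(owner=AdminOwner(name='Ann', level=1), balance=10)

# A string is not coerced to int in strict mode.
try:
    Account(owner=Owner(name='Ann'), balance='10')
except ValidationError as exc:
    strict_error_count = exc.error_count()
else:
    strict_error_count = 0

print(type(account.owner).__name__, strict_error_count)
```

The two concerns stay independent here: strictness governs coercion of primitives, while the instance check follows normal Python subclassing.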

I'll just add a brief note about invariance, to head off any potential attempts to draw an analogy with how mypy will give a type error for List[Subclass] if you try to pass it to a field annotated as type List[Class]:

The invariance of List means that List[Subclass] is not considered a subclass of List[Class]. It's still the case that subclasses are always allowed as instances of fields, it's just that instances of MyGeneric[Subclass] are allowed to be used as instances of MyGeneric[Class] precisely when MyGeneric is covariant (i.e., the TypeVar used to define it is marked as covariant). And List is not covariant.
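The distinction can be sketched as follows. `Box`, `read_box` and `read_list` are hypothetical names; the comments describe what a type checker such as mypy or pyright is expected to report, while at runtime both calls simply succeed.

```python
from typing import Generic, List, TypeVar


class Base:
    pass


class Sub(Base):
    pass


T_co = TypeVar('T_co', covariant=True)


class Box(Generic[T_co]):
    """A read-only container, so marking its TypeVar covariant is sound."""

    def __init__(self, item: T_co) -> None:
        self._item = item

    def get(self) -> T_co:
        return self._item


def read_box(box: Box[Base]) -> Base:
    return box.get()


def read_list(items: List[Base]) -> Base:
    return items[0]


sub_box: Box[Sub] = Box(Sub())
sub_list: List[Sub] = [Sub()]

# Accepted statically: Box is covariant, so Box[Sub] is usable as Box[Base].
from_box = read_box(sub_box)

# Rejected statically: List is invariant, so List[Sub] is not a List[Base].
# It still runs, because Python does not enforce annotations at runtime.
from_list = read_list(sub_list)
```

In both cases the element itself is a `Sub` being used where a `Base` is expected, which is always allowed; only the container's variance differs.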

samuelcolvin · 2023-03-30T11:30:45Z

Merged this by mistake.

@dmontagu your logic matches the "wisdom" of the twitter crowd, I'll change to match that behaviour.

samuelcolvin requested review from adriangb, dmontagu and tiangolo March 29, 2023 11:53

samuelcolvin mentioned this pull request Mar 30, 2023

remove calls to __instancecheck__ on ModelValidator #497

Open

samuelcolvin added 5 commits March 30, 2023 11:08

Support model subclass instances

4d44183

fix models to JSON

c373384

support dataclasses

655824b

no need for from_attributes on model subclass

8755513

add abc to model_instance tests

f6ea4c9

samuelcolvin force-pushed the model-subclass-instances branch from 082ccdd to f6ea4c9 Compare March 30, 2023 10:11

samuelcolvin added 3 commits March 30, 2023 11:26

simplify input get_attr logic

2f02ab5

tweak

64d2efb

tweak serialization

865d1fb

samuelcolvin enabled auto-merge (squash) March 30, 2023 11:23

samuelcolvin merged commit 956a235 into main Mar 30, 2023

samuelcolvin deleted the model-subclass-instances branch March 30, 2023 11:27

samuelcolvin mentioned this pull request Mar 30, 2023

Allow instances of subclasses in strict mode #498

Merged
