Skip to content

[MINOR] Annotated decimal type #58

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Nov 17, 2023
Merged

[MINOR] Annotated decimal type #58

merged 15 commits into from
Nov 17, 2023

Conversation

faph
Copy link
Contributor

@faph faph commented Nov 5, 2023

Would fix #57

@faph faph self-assigned this Nov 5, 2023
@faph faph marked this pull request as draft November 5, 2023 22:02
@faph faph changed the title [WIP] Annotated decimal type [MINOR] Annotated decimal type Nov 6, 2023
@faph faph marked this pull request as ready for review November 6, 2023 00:45
@faph faph marked this pull request as draft November 6, 2023 01:44
def test_annotated_decimal_tuple_wrong_type():
py_type = Annotated[decimal.Decimal, ("a", 1)]
with pytest.raises(pas.TypeNotSupportedError):
assert_schema(py_type, {})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have tests with all supported data classes to validate the type works?

So one for dataclasses and one for pydantic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would only be necessary for the default value. The type itself can be tested without a data structure around it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm looks like it doesn't work like I though it would. this throws an error

from decimal import Decimal
from typing import Annotated

import py_avro_schema as pas
from pydantic import BeforeValidator, BaseModel


class Foo(BaseModel):
    bar: Annotated[Decimal, (3, 2), BeforeValidator(lambda x: x)]


if __name__ == "__main__":
    print(Foo(bar=Decimal("1.23")))
    print(pas.generate(Foo))

Pydantic works, but pas.generate throws

py_avro_schema._schemas.TypeNotSupportedError: Cannot generate Avro schema for Python type <class 'decimal.Decimal'>

What am I doing wrong?

Copy link

@dada-engineer dada-engineer Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@faph this still throws an error with the new DecimalMeta class used. Looks like py_type is decimal, and origin is none: what do i do wrong? 🤔

from decimal import Decimal
from typing import Annotated

import py_avro_schema as pas
from pydantic import Field, BaseModel


class Foo(BaseModel):
    bar: Annotated[
        Decimal, pas.DecimalMeta(3, 2), Field(max_digits=3, decimal_places=2)
    ]


if __name__ == "__main__":
    print(Foo(bar=Decimal("1.23")))
    print(pas.generate(Foo))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because of this PydanticSchema code

# Pydantic 2 resolves forward references for us. To avoid infinite recursion, we check if the unresolved raw
        # annoation is a forward reference. If so, we use that instead of Pydantic's resolved type hint. There might be
        # a better way to un-resolve the forward reference...
        # This does not support forward references from base model classes. If required we might need to traverse class
        # hierarchy to get the raw annotations?
        if isinstance(self.raw_annotations.get(name), (str, ForwardRef)):
            py_type = self.raw_annotations[name]
        else:
            py_type = py_field.annotation

py_field.annotation is only decimal. The raw_annotations contains the fully annotated type definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed here @dada-engineer : f8ced26

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering whether this hashable requirement is not documented here: https://docs.python.org/3/library/typing.html?highlight=typing#typing.Annotated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works for my Usecase @faph

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised the documentation issue here: python/cpython#112281

"""

@classmethod
def handles_type(cls, py_type: Type) -> bool:
"""Whether this schema class can represent a given Python class"""
try:
cls._size(py_type) # If we can find precision and scale, we are good

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the avro spec really require this? Can't one use decimals without precision / scale in avro? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, let's check.

I was going on the basis that to fully define a decimal, mathematically, you will need a value for scale and precision. But maybe there are some implicit defaults, e.g. zero scale.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://avro.apache.org/docs/1.11.1/specification/#decimal

"scale, a JSON integer representing the scale (optional). If not specified the scale is 0."

"Precision must be a positive integer greater than zero"

There are some more restrictions.

The question though is whether there is a clean way to omit the scale from the Python (annotated) type hint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Options:

  1. Annotated[decimal.Decimal, DecimalMeta(precision=4, scale=2) where DecimalMeta is a dataclass which defaults scale to zero if not specified.
  2. Annotated[decimal.Decimal, (4, 2)], using Annotated[decimal.Decimal, (4, None)] to default scale to zero.

Other options?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like the first option as it would allow to set it as any arg and not strictly the second arg of Annotated. It's easier to iterate through and find exactly the arg needed. Seems reasonable as more and more libs use this mechanism.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, good point about the arg position

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only downside is that it will again require a py-avro-schema specific "type". That might simply the price to pay for a robust type hint. I couldnt find anyone else having defined a decimal meta type struct...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An option would be to get this from field info of the class itself using dataclasse.Field.metadata (py-avro-schema can choose how this metadata looks it is specifically for third party extension) and in pydantic from FieldInfo of a decimal. If there is non set some sane default?

@faph faph marked this pull request as ready for review November 16, 2023 12:20
@faph faph requested a review from jcameron73 November 16, 2023 12:21
@faph faph merged commit 04a88e5 into main Nov 17, 2023
@faph faph deleted the feature/annotated-decimal branch November 17, 2023 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[QUESTION]: DecimalType assignment type linter error
3 participants