-
-
Notifications
You must be signed in to change notification settings - Fork 77
bugfix: fixed incorrect bytestring encoding PlutusData #269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
9492b5c
c92ecfb
6dbdd4f
b663a3b
364ba67
5a90438
c71dd82
945e52a
81abc6a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -56,6 +56,13 @@ def __init__(self, li: Primitive): # type: ignore | |
super().__init__(li) # type: ignore | ||
|
||
|
||
class MetadataIndefiniteList(UserList): | ||
"""Dummy class to catch special requirements for PlutusData encoding.""" | ||
|
||
def __init__(self, li: Primitive): # type: ignore | ||
super().__init__(li) # type: ignore | ||
|
||
|
||
class IndefiniteFrozenList(FrozenList, IndefiniteList): # type: ignore | ||
pass | ||
|
||
|
@@ -166,17 +173,31 @@ def default_encoder( | |
FrozenList, | ||
IndefiniteFrozenList, | ||
frozendict, | ||
MetadataIndefiniteList, | ||
), | ||
), ( | ||
f"Type of input value is not CBORSerializable, " f"got {type(value)} instead." | ||
) | ||
if isinstance(value, (IndefiniteList, IndefiniteFrozenList)): | ||
if isinstance( | ||
value, (IndefiniteList, IndefiniteFrozenList, MetadataIndefiniteList) | ||
): | ||
# Currently, cbor2 doesn't support indefinite list, therefore we need special | ||
# handling here to explicitly write header (b'\x9f'), each body item, and footer (b'\xff') to | ||
# the output bytestring. | ||
encoder.write(b"\x9f") | ||
for item in value: | ||
encoder.encode(item) | ||
if ( | ||
isinstance(value, MetadataIndefiniteList) | ||
and isinstance(item, bytes) | ||
and len(item) > 64 | ||
): | ||
encoder.write(b"\x5f") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is only activated when an item is inside a indefinite list. Do we need to break byte strings that are not part of indefinite list? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. AFAIK we need to break all bytes that are longer than 64 bytes There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I may have misunderstood, but it seemed to me that this was the best place to put it since all If I pull it out of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess you (correctly) noticed that all PlutusData fields are part of an indefinite list. However plutusdata can also contain bytes without being part of PlutusData (i.e. pure bytes or bytes that are keys in dictionaries) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So is the final answer to pull it outside of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In this documentation it seems that yes, we need dummy classes. But not for lists, for bytes! :) I am also wondering if there are cases where integers are incorrectly encoded (when they exceed 64 bytes size) since I implemented a special case for this here: https://github.com/OpShin/uplc/blob/448f634cc1225de6dd7390b670b01396d2e71156/uplc/ast.py#L430 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess I am seeing more and more the intuition behind all the custom classes in OpShin. I realize it's a bigger lift, but is there any reason why we wouldn't just take OpShin's implementation and pull it over to here? Then, just rely on pycardano rather than duplicating efforts across repos? I apologize if I'm speaking out of ignorance and there are things I'm not considering, but this seems like it might be the more lasting implementation. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No worries at all. The code I wrote for OpShin/UPLC was created after pycardano was written, hence there might be a point in copying it over. Then again, the UPLC implementation is really only catered towards PlutusData, while PyCardano also handles serialization of all other kinds of things - not sure if anything will break. Long story short: The only reason that there are two different implementations is that no one yet tried to unify them. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, I would like to have this done sooner rather than later. Can I just create a dummy class for bytes to patch this and open a more general issue about syncing datum handling between OpShin and pycardano? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes sounds good to me! Would also prefer to get this resolved over any big open stale PR :) |
||
for i in range(0, len(item), 64): | ||
imax = min(i + 64, len(item)) | ||
encoder.encode(item[i:imax]) | ||
encoder.write(b"\xff") | ||
else: | ||
encoder.encode(item) | ||
encoder.write(b"\xff") | ||
elif isinstance(value, RawCBOR): | ||
encoder.write(value.cbor) | ||
|
@@ -511,6 +532,13 @@ def _restore_typed_primitive( | |
return IndefiniteList(v) | ||
except TypeError: | ||
raise DeserializeException(f"Can not initialize IndefiniteList from {v}") | ||
elif isclass(t) and issubclass(t, MetadataIndefiniteList): | ||
try: | ||
return MetadataIndefiniteList(v) | ||
except TypeError: | ||
raise DeserializeException( | ||
f"Can not initialize MetadataIndefiniteList from {v}" | ||
) | ||
elif hasattr(t, "__origin__") and (t.__origin__ is dict): | ||
t_args = t.__args__ | ||
if len(t_args) != 2: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why the encoding as MetadatIndefiniteList is necessary, can we not simply encode all bytes longer than 64 bytes as a list?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. I'm still new to Cardano. I guess these were my thoughts and how I would push back on encoding all bytes the same way.
If you feel these are non-issues, I think it's easy enough to remove the dummy class and encode everything the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in retrospect your comments on the issue I raised make more sense now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC the cardano ledger generally specifies that the cbor encoding of bytestrings should be at most 64 bytes long each piece. This would prevent OOM attacks when reading very long bytestrings. However the ledger does not enforce this, leading to different implementations being abound on chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But isn't OOM attack prevented my maximum transaction size anyway? Again, just playing devils advocate here. I'm still relatively new to Cardano.
I can revert changes back to the original without the dummy class for
PlutusData
. Just give me a yay or nay. With these changes I was able to successfully submit to smart contracts, so I know these changes work correctly.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to bump this so I can finish it off :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer this without the dummy class - maybe you can revert to that and see if both the test cases and your submission pass?
then Jerry or I can create a test case for exactly this datum submission