Add rich tuple decoder #1353

Closed
wants to merge 23 commits into from

banteg
Contributor

@banteg banteg commented May 15, 2019

What was wrong?

Rich objects can already be passed where tuples are expected as input, but there was no way to decode outputs as named objects even though their component names are usually available in the ABI.

Related to Issue #1267

How was it fixed?

There is a new decode option available when instantiating a contract, which applies to both calls and the contract caller. When enabled, all tuples/structs are decoded as namedtuples. They remain compatible with the current API and should work everywhere a tuple is expected.

This works via web3._utils.abi.named_tree(abi, data), which decodes function inputs/outputs given their ABIs. Internally it works with dicts, because namedtuples have several restrictions on how their fields can be named.
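
A minimal sketch of how such a decoder can walk an ABI, assuming the usual ABI JSON layout (`name`, `type`, `components`). The names `named_tree_sketch` and `_name_value` are hypothetical and only illustrate the idea; this is not the code from this PR:

```python
def named_tree_sketch(abi, data):
    """Pair each positional value in `data` with the name of its ABI entry."""
    return {
        entry["name"]: _name_value(entry, value)
        for entry, value in zip(abi, data)
    }


def _name_value(entry, value):
    type_str = entry["type"]
    if type_str.endswith("]"):
        # Array type: peel off the outermost dimension and recurse per element.
        inner = dict(entry, type=type_str[: type_str.rindex("[")])
        return [_name_value(inner, item) for item in value]
    if type_str == "tuple":
        # Struct: its fields are described by the nested `components` list.
        return named_tree_sketch(entry["components"], value)
    return value  # scalar values pass through unchanged


# Illustrative ABI with one struct, one scalar array and one struct array.
point = [{"name": "x", "type": "uint256"}, {"name": "y", "type": "uint256"}]
abi = [
    {"name": "t", "type": "tuple", "components": point},
    {"name": "b", "type": "uint256[]"},
    {"name": "c", "type": "tuple[]", "components": point},
]
decoded = named_tree_sketch(abi, ((11, 12), [2, 3, 4], [(5, 6), (7, 8)]))
```

Building plain dicts first sidesteps the naming restrictions of namedtuple, since any string can be a dict key.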

There are also two customized namedtuple factories:

  • foldable_namedtuple(fields), which returns a namedtuple subclass that can be instantiated from a single argument, such that type(x)(x) == x. This keeps it compatible with _align_abi_input.
  • Tuple(**kwargs), whose repr is itself a valid constructor call: Tuple(x=1, y=2) is displayed as Tuple(x=1, y=2). This is useful when copying output so it can be loaded back easily.
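
To make the two properties concrete, here is a small sketch based on the constructors posted later in this thread. The `Plain` type and the variable names are illustrative assumptions:

```python
from collections import namedtuple


def foldable_namedtuple(fields):
    # Subclass whose constructor takes ONE iterable instead of N positional args.
    class Tuple(namedtuple("Tuple", fields)):
        def __new__(cls, args):
            return super().__new__(cls, *args)

    return Tuple


def Tuple(**kwargs):
    # Build a one-off foldable namedtuple from keyword arguments.
    keys, values = zip(*kwargs.items())
    return foldable_namedtuple(keys)(values)


# A stock namedtuple cannot be rebuilt from itself: it wants two arguments.
Plain = namedtuple("Plain", ["x", "y"])
p = Plain(1, 2)
try:
    type(p)(p)
    plain_folds = True
except TypeError:
    plain_folds = False

t = Tuple(x=1, y=2)
folded = type(t)(t)         # the foldable variant rebuilds the same value
round_trip = eval(repr(t))  # the repr is valid Python once Tuple is in scope
```

Both `folded` and `round_trip` compare equal to `t`, and `t` still compares equal to the plain tuple `(1, 2)`.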

This pull request also fixes tuple support for the event parser.

Example

In [1]: from web3._utils.abi import named_tree, dict_to_namedtuple
In [2]: from tests.core.utilities.test_abi import TEST_FUNCTION_ABI
In [3]: from tests.core.utilities.test_abi import TEST_FUNCTION_ABI, GET_ABI_INPUTS_OUTPUT
In [4]: abi = TEST_FUNCTION_ABI['inputs']
In [5]: inputs = GET_ABI_INPUTS_OUTPUT[1]
In [6]: inputs
Out[6]: ((1, [2, 3, 4], [(5, 6), (7, 8), (9, 10)]), (11, 12), 13)
In [7]: decoded = named_tree(abi, inputs)
In [8]: decoded
Out[8]:
{'s': {'a': 1,
  'b': [2, 3, 4],
  'c': [{'x': 5, 'y': 6}, {'x': 7, 'y': 8}, {'x': 9, 'y': 10}]},
 't': {'x': 11, 'y': 12},
 'a': 13}
In [9]: dict_to_namedtuple(decoded)
Out[9]: Tuple(s=Tuple(a=1, b=[2, 3, 4], c=[Tuple(x=5, y=6), Tuple(x=7, y=8), Tuple(x=9, y=10)]), t=Tuple(x=11, y=12), a=13)
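
For readers who want to see the second step in isolation, the following is a rough sketch of a dict-to-namedtuple conversion. `dict_to_namedtuple_sketch` is a hypothetical stand-in, and it assumes every key is already a valid namedtuple field name:

```python
from collections import namedtuple


def dict_to_namedtuple_sketch(value):
    """Recursively turn dicts into namedtuples, leaving scalars untouched."""
    if isinstance(value, dict):
        converted = [dict_to_namedtuple_sketch(item) for item in value.values()]
        return namedtuple("Tuple", list(value.keys()))(*converted)
    if isinstance(value, list):
        return [dict_to_namedtuple_sketch(item) for item in value]
    return value


decoded = {
    "s": {"a": 1, "b": [2, 3, 4], "c": [{"x": 5, "y": 6}, {"x": 7, "y": 8}]},
    "t": {"x": 11, "y": 12},
    "a": 13,
}
result = dict_to_namedtuple_sketch(decoded)
```

Because a namedtuple is a tuple subclass, `result` still compares equal to the positional form while also allowing access such as `result.s.c[0].y`.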


@banteg banteg force-pushed the rich-tuple-decoder branch from ec59b89 to df21e01 Compare May 15, 2019 00:53
@banteg banteg force-pushed the rich-tuple-decoder branch from df21e01 to 7e24a3a Compare May 15, 2019 00:55
@banteg banteg force-pushed the rich-tuple-decoder branch from d3de612 to 1bdd964 Compare May 15, 2019 03:33
@banteg
Contributor Author

banteg commented May 15, 2019

I just found out that decode_function_input was broken for tuple arguments. I fixed it as a part of this PR as the fix depends on the new functionality.

@banteg banteg force-pushed the rich-tuple-decoder branch from 3a039a0 to 488a194 Compare May 15, 2019 07:56
@kclowes
Collaborator

kclowes commented May 15, 2019

Awesome @banteg! I still need to dig into this PR a little more in depth, but on a high level, I'd like to see some more testing around the named_data_tree (for example, what if a name is empty in the ABI?). Additionally, I think it makes sense to return a namedtuple instead of a dictionary since returning a dictionary will be a breaking change. Thanks for adding this!

banteg commented May 16, 2019

Thanks for the feedback. I agree this needs more testing. I initially went with dict because this format is accepted as input kwargs, so I will need to dig into it a bit more to see if namedtuples would work as args, but that change makes sense. One function's output can be another function's input, so I think it's important to keep the parsed versions compatible.

@banteg banteg changed the title Add rich tuple decoder [WIP] Add rich tuple decoder May 16, 2019

banteg commented May 16, 2019

Current progress with namedtuples:

# s = (a=1, b=[2, 3, 4], c=[(x=5, y=6), (x=7, y=8), (x=9, y=10)])
# t = (x=11, y=12)
# a = 13
inputs = [
    (1, [2, 3, 4], [(5, 6), (7, 8), (9, 10)]),
    (11, 12),
    13,
]
> result = [named_data_tree(*item) for item in zip(abi, inputs)]
> result
[(a=1, b=[2, 3, 4], c=[(x=5, y=6), (x=7, y=8), (x=9, y=10)]),
 (x=11, y=12),
 13]
> inputs == result
True

Now I need to figure out some quirks to make it work with _align_abi_input.

@banteg banteg changed the title [WIP] Add rich tuple decoder Add rich tuple decoder May 16, 2019

banteg commented May 16, 2019

I've been playing with different ways to present an anon tuple and landed on (x=1, y=2). If only Python supported literal namedtuples.

This is how the example from the first message looks now:

> market = dydx.getMarket(1)
(token='0x89d24A6b4CcB1B6fAA2625fE562bDD9a23260359', totalPar=(borrow=1284671939554912048936288, supply=1999868739240631893521211), index=(borrow=1014323761632160906, supply=1008162740160717992, lastUpdate=1558020975), priceOracle='0x787F552BDC17332c98aA360748884513e3cB401a', interestSetter='0xad91a0ddf799176a0A87a32Dafe8F3dd28479918', marginPremium=(value=0), spreadPremium=(value=0), isClosing=False)
> market.totalPar
(borrow=1284671939554912048936288, supply=1999868739240631893521211)

These tuples should now work everywhere a tuple is expected. I've also added some tests, although they are not comprehensive.

The thing I dislike is that I can't just copy this representation and paste it into another terminal. Maybe this can be alleviated with some clever factory that converts kwargs into tuples, but then we won't have an anonymous tuple representation.

decode_function_input has moved to utilities as decode_transaction_data, alongside encode_transaction_data. Now that it returns tuples, it has broken some tests. I'll take a further look tomorrow.

@banteg banteg force-pushed the rich-tuple-decoder branch from 04610a4 to ade0669 Compare May 16, 2019 17:14

banteg commented May 16, 2019

NB: Tests fail due to issues unrelated to this PR.

kclowes commented May 16, 2019

Sweet, thanks! I reran the tests, so hopefully those will fix themselves. I'll take a look at the code tomorrow!

banteg commented May 17, 2019

One idea to think about. If we define these two constructors, we can have copyable literal namedtuples which would require only one import to work. I think this feature is more important than concise anonymous representation. Also, should we call it Tuple or Struct?

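from collections import namedtuple


def foldable_namedtuple(fields):
    # namedtuple subclass whose constructor takes one iterable of values
    class Tuple(namedtuple('Tuple', fields)):
        def __new__(cls, args):
            return super().__new__(cls, *args)

    return Tuple


def Tuple(**kwargs):
    # build a one-off namedtuple from keyword arguments
    keys, values = zip(*kwargs.items())
    return foldable_namedtuple(keys)(values)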

For example:

> Tuple(a=1, b=[2, 3, 4], c=[Tuple(x=5, y=6), Tuple(x=7, y=8), Tuple(x=9, y=10)])
  Tuple(a=1, b=[2, 3, 4], c=[Tuple(x=5, y=6), Tuple(x=7, y=8), Tuple(x=9, y=10)])

banteg commented May 17, 2019

One problem with namedtuple I found: ValueError: Field names cannot start with an underscore. I think leading underscores are uncommon in Struct definitions, but they are pretty common in function inputs.
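
A short reproduction of the error, using typical Solidity-style argument names. The field names and values here are made up for illustration:

```python
from collections import namedtuple

# Hypothetical function inputs; leading underscores are a common convention.
fields = ["_to", "_value"]

try:
    namedtuple("Tuple", fields)
    error = None
except ValueError as exc:
    error = str(exc)  # namedtuple rejects names with a leading underscore

# Stripping the leading underscores yields valid field names.
Stripped = namedtuple("Tuple", [name.lstrip("_") for name in fields])
args = Stripped("0x0000000000000000000000000000000000000000", 1)
```

Stripping works here, but it can produce empty or duplicate names, which is the case discussed further down the thread.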

kclowes commented May 17, 2019

Unfortunately, since v5 is in beta now, we can't make any more breaking changes (until v6 which is a ways out). We'll need to keep decode_function_input returning the same dict.

Should we call it Tuple or Struct?

I like Tuple. I also wouldn't be opposed to something like DecodedABITuple or something more specific than just Tuple.

One problem with namedtuple I found: ValueError: Field names cannot start with an underscore

A few options that come to mind off the top of my head are:

  • skip the conversion to a namedtuple, and just return the "unnamed" tuple, like it was doing before
  • use the namedtuple rename=True option, and then people can access the underscored names by the index
  • strip it like you're doing now

It seems to me that the easiest approach might be to just revert to returning a regular tuple, for a couple of reasons: 1) it feels unexpected to me to return a different name than the one that was provided in the ABI, and 2) what happens if we strip an underscore from a name, which then becomes the same as a name that didn't originally have an underscore? On the other hand, I can see how it would also be unexpected to have a regular tuple returned if you pass in the decode flag, so I don't have a strong opinion there. I think we'll just have to be clear in the documentation.

@pipermerriam I'd like to get your input here as well!

banteg commented May 18, 2019

I think it would be enough to add this check after the stripping:

 def foldable_namedtuple(fields):
     fields = [field.lstrip('_') for field in fields]
+    if '' in fields or len(set(fields)) < len(fields):
+        return tuple

Example with fallback to tuple:

In [45]: foldable_namedtuple(['a', 'b'])([1, 2])
Out[45]: Tuple(a=1, b=2)

In [46]: foldable_namedtuple(['_a', '_b'])([1, 2])
Out[46]: Tuple(a=1, b=2)

In [47]: foldable_namedtuple(['_a', 'a'])([1, 2])
Out[47]: (1, 2)

In [48]: foldable_namedtuple(['_a', '_'])([1, 2])
Out[48]: (1, 2)

Example with rename=True option:

In [52]: foldable_namedtuple(['a', 'b'])([1, 2])
Out[52]: Tuple(a=1, b=2)

In [53]: foldable_namedtuple(['_a', '_b'])([1, 2])
Out[53]: Tuple(a=1, b=2)

In [54]: foldable_namedtuple(['_a', 'a'])([1, 2])
Out[54]: Tuple(a=1, _1=2)

In [55]: foldable_namedtuple(['_a', '_'])([1, 2])
Out[55]: Tuple(a=1, _1=2)

To me the latter example looks less predictable, so my vote is for reverting to a tuple on conflicting input.
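
The two behaviours compared above can be put side by side in one sketch. The `rename` parameter on this function is a hypothetical switch added only for the comparison; it is not part of the PR:

```python
from collections import namedtuple


def foldable_namedtuple(fields, rename=False):
    fields = [field.lstrip("_") for field in fields]
    conflict = "" in fields or len(set(fields)) < len(fields)
    if conflict and not rename:
        # Fallback variant: give up on names and return the builtin tuple type.
        return tuple

    # With rename=True, namedtuple replaces invalid or duplicate names
    # with positional ones such as _1.
    class Tuple(namedtuple("Tuple", fields, rename=rename)):
        def __new__(cls, args):
            return super().__new__(cls, *args)

    return Tuple


clean = foldable_namedtuple(["_a", "_b"])([1, 2])
fallback = foldable_namedtuple(["_a", "a"])([1, 2])
renamed = foldable_namedtuple(["_a", "a"], rename=True)([1, 2])
```

With the fallback, a caller either gets every ABI name or none of them, whereas `rename=True` mixes real names with synthetic ones.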

As for decode_function_input, the previous version didn't work with tuples. I'd say let's keep the utility function for a future update, and I'll update the contract method to return a dict with proper tuple decoding.

banteg commented May 18, 2019

I've reverted the breaking change, so decode_function_input now returns a dict as before. It uses the namedtuple decoder as an intermediate representation, so I've added an as_dict=True option to expose it.
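
One way such a conversion back to dicts could look, as a sketch only. `as_dict_sketch` is a hypothetical helper, not the function in this PR, and it assumes nested values are namedtuples, lists or plain tuples:

```python
from collections import namedtuple


def as_dict_sketch(value):
    """Recursively convert namedtuples to dicts, preserving container types."""
    if isinstance(value, tuple) and hasattr(value, "_fields"):
        return {
            name: as_dict_sketch(item)
            for name, item in zip(value._fields, value)
        }
    if isinstance(value, (list, tuple)):
        return type(value)(as_dict_sketch(item) for item in value)
    return value


Pair = namedtuple("Tuple", ["x", "y"])
Args = namedtuple("Tuple", ["s", "a"])
decoded = Args(s=[Pair(5, 6), Pair(7, 8)], a=13)
plain = as_dict_sketch(decoded)
```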

banteg commented May 18, 2019

Another issue with namedtuples: ValueError: Type names and field names cannot be a keyword: 'from'. Looks like my initial intuition about using dicts was correct. I'll try to play with inverted logic (tuple → dict → namedtuple) to see if it makes things less messy.
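
The restrictions hit so far (keywords, leading underscores, empty names) can be checked up front with the standard library. `is_valid_field` is a hypothetical helper written for this illustration:

```python
import keyword
from collections import namedtuple


def is_valid_field(name):
    """True if `name` is accepted by namedtuple without rename=True."""
    return (
        name.isidentifier()
        and not keyword.iskeyword(name)
        and not name.startswith("_")
    )


names = ["from", "to", "_value", "amount", ""]
valid = [name for name in names if is_valid_field(name)]

# A dict has no such restrictions on its keys...
transfer = {"from": "0xabc", "to": "0xdef"}

# ...while the same keys are rejected as namedtuple fields.
try:
    namedtuple("Tuple", list(transfer))
    keyword_error = None
except ValueError as exc:
    keyword_error = str(exc)
```

This is why decoding into dicts first and converting to namedtuples only when every name passes the check keeps the failure cases in one place.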

I also noticed that events completely lack tuple support. I'll try to address that too.

banteg commented Jul 9, 2019

Vyper outputs scalars as a tuple, so I addressed that to keep it compatible:

# before
In [2]: contract.caller().totalSupply()
Out[2]: Tuple(out=347980466501289403984)
# after
In [3]: uni_eth.caller().totalSupply()
Out[3]: 347980466501289403984
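
A guess at the shape of that fix, as a sketch: unwrap the top-level result when the function declares exactly one output. `unwrap_single_output` is a hypothetical name, and the real change in this PR may differ:

```python
from collections import namedtuple


def unwrap_single_output(decoded):
    """Return the lone value of a one-element top-level result, else the result."""
    if isinstance(decoded, tuple) and len(decoded) == 1:
        return decoded[0]
    return decoded


# Vyper names a single return value `out`, which decodes to a one-field tuple.
Out = namedtuple("Tuple", ["out"])
total_supply = unwrap_single_output(Out(347980466501289403984))

# Results with several outputs are left untouched.
Pair = namedtuple("Tuple", ["x", "y"])
pair = unwrap_single_output(Pair(1, 2))
```

This sketch is meant for the top-level return value only; applying it to nested structs would also flatten legitimate one-field structs.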

@banteg banteg force-pushed the rich-tuple-decoder branch from e29cd72 to e328c5b Compare July 9, 2019 12:55
@fselmo fselmo mentioned this pull request Oct 31, 2022
22 tasks
@pacrob pacrob mentioned this pull request Jan 27, 2023
1 task
@pacrob
Contributor

pacrob commented Feb 14, 2023

Closed by PR #2799. Thanks @banteg!

@pacrob pacrob closed this Feb 14, 2023

3 participants