Skip to content

traits vs. traitlets vs. any other idea #1402

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
oesteban opened this issue Mar 9, 2016 · 8 comments
Open

traits vs. traitlets vs. any other idea #1402

oesteban opened this issue Mar 9, 2016 · 8 comments

Comments

@oesteban
Copy link
Contributor

oesteban commented Mar 9, 2016

I am stuck with this problem in #1371. Particularly, in that PR I am facing the decision of 1) use more features of traits, making this dependency stronger; 2) make the move to traitlets for good; 3) consider any option somebody else can provide.

IMHO, traits/traitlets are the way to go. They solve nicely the problem of defining the inputs and outputs of the interfaces. They are intuitive (from the user point of view) and creating nipype interfaces has been proven relatively easy. So, if there is a poll on this (keep traits vs. moving to something else), I'd stick with the traits.

Then, the question of traits vs. traitlets. In my opinion, we are underexploiting some features of traits:

  • Features of both traits and traitlets we do not use:
    • Events: trait(let)s are observable and using callbacks would ease our lives a lot. Some scarce uses have been done. But, generally, only for simple tasks like validating a field or even shadowing existing features of traits like synchronization.
    • Synchronization: trait(let)s allow keeping traits synchronized
  • Features only in traits that we do not use:
    • Defining interfaces (somebody said interfaces?)
    • Mapped traits: these are convenient traits in constructs like this, in which you can append an underscore to the trait name to get the mapped name (in this case using interface.inputs.output_type_ would give you the current extension value, instead of the actual value of the trait.
    • Deferring and adaptation: I do not know why we would need this yet.

What we would get using these features:

  • For instance, the implementation of the IdentityInterface would be trivial: when creating the input and output fields, we would just need to synchronize them (using link in traitlets or sync in traits). Synchronization can be directed (i.e. from intputs to outputs only) in both trait(let)s
  • Something that interacts with @satra 's advances on the new workflow design: connections between interfaces (should) could be also implemented very easily with synchronization.
    • This would also make unnecessary some pieces of code that are making the interfaced workflow very tricky to implement. The logic under these _propagate_... functions is a bit cluttered and I have the impression that we just cover the mimimum set of situations.
    • Keeping the workflow synchronized with traits would remove this task from ourselves, and the pipeline engine will have two responsibilities: ordering the tasks and run them in that order. Which is a lot simpler.
    • Using traits.Interface makes the base code of interfaces a lot more readable and simple. In [WIP] Major refactor (nipype-1.5 ?) #1371, I proposed that interfaces, inputs and outputs are traits.Interfaces.

I am probably forgetting a lot of stuff, but I hope this is more than enough to make an informed decision. What are your thoughts?

@satra
Copy link
Member

satra commented Mar 21, 2016

@oesteban - ok i have had some time to think about this.

traits/traitlets is extremely useful in specifying the workflow and in validating inputs/outputs. other alternatives are avro/schema salad and protocol buffers.

cleanup: we did a lot of hackish things with traits, because dynamic traits did not pickle properly. it is perhaps time to revisit these elements in both the base traits class and in dynamic classes such as io.py and in mapnode.

traits.Interface: could you elaborate what this buys us? perhaps point to some specific code?

MapNode: one reason we have had to maintain a separation between interface traits and mapnode traits is that mapnodes allow multiple values for traits that can be mapped. so simply using traits will not allow us to solve this particular issue.

separation between specification and execution: trait synchronization is great for specification, but quite poor for execution, especially when the execution system uses graph submissions or runs on a cluster. i think the multiproc model allows for much more synchronization possibilities than any other plugin. so we should consider if it helps there.

workflow engine rewrite (i plan to share a draft notebook by end of this week):

  • as i rewrite the workflow engine, i will see how traits fit into the picture. we will lose the distinction between Nodes and MapNodes (although we may keep MapNodes as a special subclass.
  • i'm currently writing it with the notion that what a node runs can be a python function, nipype interface, a web service or any object that allows specification of an input, output, and how to run it.
  • i'm trying to therefore simplify what the engine takes care of. but it still needs to know things about scattering (iterables, iternode) and gathering (mapnode, joinnode).

bottom line is that from a specification/validation standpoint traits are great. i see no particular reason to drop them, but we can discuss whether going with traitlets or traits makes sense. now that traits is available in py3, it has been a lower priority in my head to walk away from them. my main concern with adopting traits fully is that it does a lot of magic under the hood. and they overwrite many standard pythonisms in implementing it, for example, super inside __init__ doesn't work in a normal order.

@oesteban
Copy link
Contributor Author

I see, thanks a lot for taking the time in elaborating this reply 👍

traits/traitlets is extremely useful in specifying the workflow and in validating inputs/outputs. other alternatives are avro/schema salad and protocol buffers.

If those are the options, I guess we stay with traits/traitlets ...

cleanup: we did a lot of hackish things with traits, because dynamic traits did not pickle properly. it is perhaps time to revisit these elements in both the base traits class and in dynamic classes such as io.py and in mapnode.

Yes, I guess that revising the use of wildcards and PrivateTraits will simplify all of this.

traits.Interface: could you elaborate what this buys us? perhaps point to some specific code?

As they say themselves this would not buy us much. But it is a very nice way to make our Interfaces more strict when checking inputs and outputs, and it is naturally integrated with traits. I have an example on how this code looks here https://github.com/oesteban/nipype/blob/enh/nipype-1.5/nipype/interfaces/base/interfaces.py#L62. The code is not very different.

The only potential new thing is when we want to use Interfaces containing interfaces. An example of this is here: https://github.com/oesteban/nipype/blob/enh/nipype-1.5/nipype/interfaces/base/interfaces.py#L140. Using that Instance we do not need the input_spec and output_spec anymore, and they are dealt as traits very conveniently (i.e. we can remove the code to make the traits accessible as attributes, because they are also traits). I think all the code about interfaces would be a lot cleaner (and easier to maintain). With no changes to the final user, btw.

MapNode: one reason we have had to maintain a separation between interface traits and mapnode traits is that mapnodes allow multiple values for traits that can be mapped. so simply using traits will not allow us to solve this particular issue.

I see. I will think a bit more about this issue because I can't give any idea right now.

separation between specification and execution: trait synchronization is great for specification, but quite poor for execution, especially when the execution system uses graph submissions or runs on a cluster. i think the multiproc model allows for much more synchronization possibilities than any other plugin. so we should consider if it helps there.

You are right, in graph submissions the trait synchronization is probably useless. For multiproc and linear it will be very useful though. I would need to take a look on how the inputs/outputs pass along under these graph plugins, but I bet that at least we should move to update callbacks on the output traits.

workflow engine rewrite (i plan to share a draft notebook by end of this week):

Could I have a sneak peek on it before?

as i rewrite the workflow engine, i will see how traits fit into the picture. we will lose the distinction between Nodes and MapNodes (although we may keep MapNodes as a special subclass.

This all sounds perfect (also keeping the MapNodes for backwards compatibility).

i'm currently writing it with the notion that what a node runs can be a python function, nipype interface, a web service or any object that allows specification of an input, output, and how to run it.

That's awesome

i'm trying to therefore simplify what the engine takes care of. but it still needs to know things about scattering (iterables, iternode) and gathering (mapnode, joinnode).

Yes, actually, I would remove these responsibilities from the interfaces. As I see this, I think it is great to have Workflows and Interfaces, but they should implement the same base interface. I clearly see the responsibilities of scattering and gathering in the running engine. I don't know how this would affect Workflows thought (since their run method is totally attached to the runner plugin).

@oesteban oesteban reopened this Mar 21, 2016
@oesteban
Copy link
Contributor Author

Sorry I didn't mean to close it.

@mwaskom
Copy link
Member

mwaskom commented Nov 13, 2017

Apparently the Jupyter perspective discourages other packages from using traitlets: https://twitter.com/minrk/status/927580767361912833

@satra
Copy link
Member

satra commented Nov 13, 2017

@mwaskom - that's interesting and weird given how much the jupyter widgets depend on that library. i'll have to chat with @minrk about it's future. we have expended a fair bit of effort to transition over to traitlets. @djarecka has gone through one round of conversion, learned the lessons, and starting again, so this would be a good time to figure this out.

@djarecka
Copy link
Collaborator

@mwaskom thanks for sharing, it feels a bit like it's a semi-private project. @satra - I can read about the other libraries mentioned in the post and we can talk about it soon

@minrk
Copy link

minrk commented Nov 14, 2017

My tweet may have been a bit strong. I wouldn't necessarily discourage anyone from using traitlets, but for a new project I also wouldn't advocate it, as there are coming newer tools that offer more modern approaches without the burden of traitlets' history. If you are using it and it doesn't cause problems for you, there's nothing wrong with sticking with it. What I would strongly discourage, and is the source of the "semi-private" comment, is using traitlets if serves your case pretty well, but needs some changes/improvements to really be a good fit.

In particular, the challenge of working with traitlets is its immense amount of inertia, due to the combination of underpinning all of IPython and Jupyter (widgets included), combined with being a super complicated array of metaclasses and descriptors, where it seems almost any change breaks something. For this reason, anyone interested in one small fix/improvement to traitlets will likely be frustrated, either because it's very difficult to get the fix in in the first place without breaking anything, or because there is a long wait for a release (it has been over a year since the last traitlets release).

@satra
Copy link
Member

satra commented Nov 14, 2017

@Mirk - fair enough point. when you say:

newer tools that offer more modern approaches

what sort of packages are you referring to.

we considered using protobuf, but we would still have to build a lot of functionality on top of it. in our case, we are moving from traits, which is deeply embedded all through our package and still has issues that have not been resolved in years. but it works - we have hacked around it - our primary reason to move to traitlets was its pure python code base, and ipython/jupyter's dependence on it. we thought it would be a better choice.

thus far, we investigated/coded traitlets into our code base to see where it would map to traits. after this work we have decided to create a shim layer on top of traitlets that would map to current traits api. but if you have thoughts on other tools that we should consider, we can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants