Skip to content

Cleanup the pure Python pickle implementation #60755

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
serhiy-storchaka opened this issue Nov 25, 2012 · 8 comments
Closed

Cleanup the pure Python pickle implementation #60755

serhiy-storchaka opened this issue Nov 25, 2012 · 8 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 16551
Nosy @jcea, @pitrou, @avassalotti, @serhiy-storchaka
Files
  • pickle_cleanup.patch
  • bench.py: Microbenchmark for function lookup optimization
  • pickle_cleanup_globals.patch
  • pickle_cleanup_3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2013-04-14.10:38:06.683>
    created_at = <Date 2012-11-25.12:17:36.123>
    labels = ['type-feature', 'library']
    title = 'Cleanup the pure Python pickle implementation'
    updated_at = <Date 2013-04-14.10:38:06.660>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2013-04-14.10:38:06.660>
    actor = 'python-dev'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2013-04-14.10:38:06.683>
    closer = 'python-dev'
    components = ['Library (Lib)']
    creation = <Date 2012-11-25.12:17:36.123>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['28111', '29715', '29732', '29797']
    hgrepos = []
    issue_num = 16551
    keywords = ['patch']
    message_count = 8.0
    messages = ['176345', '182286', '186113', '186205', '186238', '186304', '186716', '186906']
    nosy_count = 5.0
    nosy_names = ['jcea', 'pitrou', 'alexandre.vassalotti', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue16551'
    versions = ['Python 3.4']

    @serhiy-storchaka
    Copy link
    Member Author

    This issue inspired by bpo-12848. The proposed patch get rid of the marshal module (the struct module used instead), removes some redundant code, and changes the code to use more modern idioms.

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Nov 25, 2012
    @serhiy-storchaka serhiy-storchaka self-assigned this Dec 27, 2012
    @serhiy-storchaka
    Copy link
    Member Author

    Ping.

    @pitrou
    Copy link
    Member

    pitrou commented Apr 5, 2013

    Patch looks fine to me. Do you want to go ahead?

    @serhiy-storchaka
    Copy link
    Member Author

    In response to Alexandre's comment on Rietveld. Access to a local variable is faster than to a global one and the current implementation uses this for struct.pack. I just use same trick for struct.unpack. Here is a microbenchmark which demonstrate some effect of this optimization. I got 0.6491418619989417, 0.6947214259998873, and 0.5394902769985492 for optimized, non-optimized and advanced optimized functions.

    Of course, we can achieve even better effect if we will cache not only struct.pack, but struct.Struct('<i').pack, struct.Struct('B').pack, etc. I were considered this as a reason for other patch, but we can do it in this issue.

    @avassalotti
    Copy link
    Member

    My point is I would prefer that we keep all optimizations to only the _pickle C module and keep the Python implementation as simple as possible.

    Also, I doubt the slight speedup shown by your microbenchmark will actually result in any significant benefits on the overall performance of pickle.py. Furthermore, we can't use CPython to measure the benefits of such optimization because CPython will always use _pickle over the Python implementation.

    @serhiy-storchaka
    Copy link
    Member Author

    Here is an updated patch which get rid of trick with attribute caching. Globals used instead. Also added some minor style changes (dropped redundant semicolons and wrapped too long lines).

    @serhiy-storchaka
    Copy link
    Member Author

    Patch updated in response to Alexandre's comments. In additional to his suggestions some other minor things simplified. _batch_appends and _batch_setitems now use islice instead range. Some bugs found and new issues created (bpo-17710, bpo-17711).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 14, 2013

    New changeset 3dff836cedef by Serhiy Storchaka in branch 'default':
    Closes bpo-16551. Cleanup pickle.py.
    http://hg.python.org/cpython/rev/3dff836cedef

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants