ENH: xz compression_args df.to_pickle(path, compression={'method':'xz','pr… #53443

Merged
merged 12 commits into from
Jun 2, 2023

segatrade
Contributor

@segatrade segatrade commented May 29, 2023

…eset': level})
level = 9
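To illustrate what the `preset` key in the title controls, here is a minimal sketch that uses only the standard-library `lzma` module. The helper `xz_compress` and the sample payload are hypothetical and are not code from this PR; the sketch assumes, as the mypy output later in this thread suggests, that the extra dictionary keys end up as keyword arguments to `lzma.LZMAFile`.

```python
import io
import lzma

# Repetitive sample payload so that compression has something to work with.
payload = b"pandas,xz,preset\n" * 50_000


def xz_compress(data: bytes, **lzma_kwargs) -> bytes:
    """Hypothetical helper: write data through lzma.LZMAFile, forwarding kwargs."""
    buffer = io.BytesIO()
    with lzma.LZMAFile(buffer, mode="wb", **lzma_kwargs) as handle:
        handle.write(data)
    # Closing the LZMAFile does not close the underlying BytesIO object.
    return buffer.getvalue()


fast = xz_compress(payload, preset=0)   # lowest preset
small = xz_compress(payload, preset=9)  # highest preset, as in "level = 9"

# Both streams decode to the same bytes; the preset only changes the
# speed, memory, and size trade-off of the compressor.
assert lzma.decompress(fast) == payload
assert lzma.decompress(small) == payload
print(len(payload), len(fast), len(small))
```

With the change proposed here, the equivalent pandas call would be the one in the title, `df.to_pickle(path, compression={'method': 'xz', 'preset': level})`.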

segatrade and others added 3 commits May 29, 2023 22:09
pandas/io/common.py:829:89: E501 Line too long (103 > 88 characters)
@segatrade segatrade changed the title xz compression_args df.to_pickle(path, compression={'method':'xz','pr… ENH: xz compression_args df.to_pickle(path, compression={'method':'xz','pr… May 29, 2023
@twoertwein
Member

Thank you for the PR. You definitely will need the following:

  • tests
  • update the documentation
    ``tarfile.TarFile``, respectively.
    (and for decompression below that)
  • and a whatsnew entry: doc/source/whatsnew/v2.1.0.rst

segatrade added 3 commits May 30, 2023 09:51
pandas/io/common.py:830: error: Argument 1 to "LZMAFile" has incompatible type "Union[str, BaseBuffer]"; expected "Optional[Union[Union[str, bytes, PathLike[str], PathLike[bytes]], IO[bytes]]]"  [arg-type]
pandas/io/common.py:831: error: Unused "type: ignore" comment
@segatrade
Contributor Author

segatrade commented May 30, 2023

Thank you for the advice and for maintaining pandas! The checks and docs seem to be done. Do I need to add any new tests, or is it ready?

@twoertwein
Member

twoertwein commented May 30, 2023

Thank you for the advice and for maintaining pandas! The checks and docs seem to be done. Do I need to add any new tests, or is it ready?

Yes, you will need a test, probably in https://github.com/pandas-dev/pandas/blob/main/pandas/tests/io/test_compression.py. I'm not sure how best to test that the keywords are being forwarded: maybe one test with either to_pickle/read_csv that just passes a valid argument, and then another test that passes an invalid keyword that raises an error in lzma, to ensure the keyword is forwarded to it.

edit: should probably test both to_pickle and read_csv, just to cover one write method and one read method.
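The two-test idea above can be sketched without pandas, using only the standard library. The helper `open_xz` is hypothetical and stands in for whatever code splits the compression dictionary and forwards the remaining keys; it is not the pandas implementation, and these are not the tests that were merged.

```python
import io
import lzma


def open_xz(buffer, mode, compression):
    """Hypothetical helper: split a compression dict and forward the rest to lzma."""
    options = dict(compression)  # copy so the caller's dict is not mutated
    method = options.pop("method")
    if method != "xz":
        raise ValueError(f"unsupported method: {method!r}")
    return lzma.LZMAFile(buffer, mode=mode, **options)


def test_valid_keyword_is_forwarded():
    # A valid lzma keyword must be accepted and the data must round-trip.
    data = b"a,b\n1,2\n"
    buffer = io.BytesIO()
    with open_xz(buffer, "wb", {"method": "xz", "preset": 1}) as handle:
        handle.write(data)
    assert lzma.decompress(buffer.getvalue()) == data


def test_invalid_keyword_reaches_lzma():
    # An unknown keyword can only fail inside lzma.LZMAFile if it was forwarded.
    try:
        open_xz(io.BytesIO(), "wb", {"method": "xz", "not_a_real_option": 1})
    except TypeError:
        pass
    else:
        raise AssertionError("expected lzma.LZMAFile to reject the unknown keyword")


test_valid_keyword_is_forwarded()
test_invalid_keyword_reaches_lzma()
```

In a real pandas test the same pattern would go through `to_pickle` and `read_csv` with a temporary path rather than an in-memory buffer.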

@twoertwein twoertwein added the "IO Data" label (IO issues that don't fit into a more specific label) May 30, 2023
segatrade and others added 5 commits June 2, 2023 13:38
@twoertwein twoertwein requested a review from phofl June 2, 2023 13:08
@segatrade
Contributor Author

Thank you! Done. It seems everything is now ready for review.

@mroeschke mroeschke added this to the 2.1 milestone Jun 2, 2023
@mroeschke mroeschke merged commit 14d7b96 into pandas-dev:main Jun 2, 2023
@mroeschke
Member

Thanks @segatrade

topper-123 pushed a commit to topper-123/pandas that referenced this pull request Jun 5, 2023
…','pr… (pandas-dev#53443)

* xz compression_args df.to_pickle(path, compression={'method':'xz','preset': 9})

* fix Line too long

pandas/io/common.py:829:89: E501 Line too long (103 > 88 characters)

* fix black

* xz compression_args docs

* ruff 458:89: E501 Line too long

* fix mypy Failed
pandas/io/common.py:830: error: Argument 1 to "LZMAFile" has incompatible type "Union[str, BaseBuffer]"; expected "Optional[Union[Union[str, bytes, PathLike[str], PathLike[bytes]], IO[bytes]]]"  [arg-type]
pandas/io/common.py:831: error: Unused "type: ignore" comment

* xz compression tests
docs ench update

* Sphinx lint & black fix

* additional assert in test
sort whatsnew entries alphabetically & black fix

* fix test obj

* fix path test

* removed test_xz_compression_invalid_args , fix read_csv(compression

---------

Co-authored-by: Se <--global>
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
Labels
IO Data: IO issues that don't fit into a more specific label
Development

Successfully merging this pull request may close these issues.

ENH: df.to_pickle xz lzma preset=level for compression compresslevel
3 participants