Skip to content

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented Apr 26, 2017

Previously, we were requiring that all file-like objects had "read," "write," "seek," and "tell" methods,
but that was too strict (e.g. read-only buffers). This commit relaxes those requirements to having EITHER "read" or "write" as attributes.

Closes #16135.

@gfyoung gfyoung force-pushed the is-file-like-relax branch from e2e510b to 57bc69d Compare April 26, 2017 19:25
@codecov
Copy link

codecov bot commented Apr 26, 2017

Codecov Report

Merging #16150 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16150      +/-   ##
==========================================
+ Coverage   90.83%   90.84%   +<.01%     
==========================================
  Files         159      159              
  Lines       50796    50794       -2     
==========================================
- Hits        46143    46142       -1     
+ Misses       4653     4652       -1
Flag Coverage Δ
#multiple 88.62% <100%> (ø) ⬆️
#single 40.31% <50%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/dtypes/inference.py 98.33% <100%> (-0.06%) ⬇️
pandas/core/common.py 91.03% <0%> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b80ed3...57bc69d. Read the comment docs.

@codecov
Copy link

codecov bot commented Apr 26, 2017

Codecov Report

Merging #16150 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16150      +/-   ##
==========================================
+ Coverage   90.83%   90.84%   +<.01%     
==========================================
  Files         159      159              
  Lines       50796    50794       -2     
==========================================
- Hits        46143    46142       -1     
+ Misses       4653     4652       -1
Flag Coverage Δ
#multiple 88.62% <100%> (ø) ⬆️
#single 40.31% <50%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/dtypes/inference.py 98.33% <100%> (-0.06%) ⬇️
pandas/core/common.py 91.03% <0%> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b80ed3...57bc69d. Read the comment docs.

@codecov
Copy link

codecov bot commented Apr 26, 2017

Codecov Report

Merging #16150 into master will increase coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16150      +/-   ##
==========================================
+ Coverage   90.85%   90.88%   +0.02%     
==========================================
  Files         159      159              
  Lines       50785    50783       -2     
==========================================
+ Hits        46141    46152      +11     
+ Misses       4644     4631      -13
Flag Coverage Δ
#multiple 88.66% <100%> (+0.02%) ⬆️
#single 40.32% <50%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/dtypes/inference.py 98.33% <100%> (-0.06%) ⬇️
pandas/core/indexes/datetimes.py 95.43% <0%> (+0.09%) ⬆️
pandas/plotting/_converter.py 65.35% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d9909c...d2efe18. Read the comment docs.

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

my_s3_object = boto3.client('s3').get_object(Bucket=<S3 bucket>, Key=<S3 path>) pd.read_csv(my_s3_object['Body'])

should work with one of the s3 buckets (.e.g. s3://pandas-test/tips.csv), can you add a test as well (should fail prior to this PR, and work with).

@jreback jreback added the IO CSV read_csv, to_csv label Apr 27, 2017
@jreback jreback added this to the 0.20.0 milestone Apr 27, 2017
@gfyoung
Copy link
Member Author

gfyoung commented Apr 27, 2017

@jreback : Just to make sure I got it right, it's:

Bucket = "pandas-test"
Key = "pandas-test/tips.csv" (or is it just "/tips.csv", don't remember)

Also, do we install boto3 during testing? I don't see it explicitly in the codebase.

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

we already have s3 tests just not streaming ones.

(pandas) bash-3.2$ grep -R s3 pandas/tests/io/parser/
Binary file pandas/tests/io/parser//__pycache__/test_network.cpython-36-PYTEST.pyc matches
pandas/tests/io/parser//test_network.py:            import s3fs  # noqa
pandas/tests/io/parser//test_network.py:            pytest.skip("s3fs not installed")
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' +
pandas/tests/io/parser//test_network.py:        df = read_csv('s3://cant_get_it/tips.csv')
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3n_bucket(self):
pandas/tests/io/parser//test_network.py:        # Read from AWS s3 as "s3n" URL
pandas/tests/io/parser//test_network.py:        df = read_csv('s3n://pandas-test/tips.csv', nrows=10)
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3a_bucket(self):
pandas/tests/io/parser//test_network.py:        # Read from AWS s3 as "s3a" URL
pandas/tests/io/parser//test_network.py:        df = read_csv('s3a://pandas-test/tips.csv', nrows=10)
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_nrows(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' +
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_chunked(self):
pandas/tests/io/parser//test_network.py:            df_reader = read_csv('s3://pandas-test/tips.csv' + ext,
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_chunked_python(self):
pandas/tests/io/parser//test_network.py:            df_reader = read_csv('s3://pandas-test/tips.csv' + ext,
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_python(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' + ext, engine='python',
pandas/tests/io/parser//test_network.py:    def test_infer_s3_compression(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' + ext,
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_nrows_python(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' + ext, engine='python',
pandas/tests/io/parser//test_network.py:    def test_s3_fails(self):
pandas/tests/io/parser//test_network.py:            read_csv('s3://nyqpug/asdf.csv')
pandas/tests/io/parser//test_network.py:            read_csv('s3://cant_get_it/')

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

you can import boto3 its a dependency of s3fs which makes a nicer layer on top.

@gfyoung
Copy link
Member Author

gfyoung commented Apr 27, 2017

I was asking about the proper syntax for calling get_object (i.e. what are the bucket and key names exactly just to make sure).

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

bucket is pandas-test, path is \tip.csv I think.

@gfyoung gfyoung force-pushed the is-file-like-relax branch from 57bc69d to cd68e91 Compare April 27, 2017 03:04
@gfyoung
Copy link
Member Author

gfyoung commented Apr 27, 2017

@jreback : boto test has been added, and everything is green.

Previously, we were requiring that
all file-like objects had "read,"
"write," "seek," and "tell" methods,
but that was too strict (e.g. read-only
buffers). This commit relaxes those
requirements to having EITHER "read"
or "write" as attributes.

Closes pandas-devgh-16135.
@gfyoung gfyoung force-pushed the is-file-like-relax branch from cd68e91 to d2efe18 Compare April 27, 2017 07:11
# see gh-16135

# boto3 is a dependency of s3fs
import boto3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a skip if s3fs is not installed?

Copy link
Member Author

@gfyoung gfyoung Apr 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. There's a setUp method in this test class which skips on that exact condition.

@jreback jreback merged commit a16fc8d into pandas-dev:master Apr 27, 2017
@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

ok then, thanks!

@gfyoung gfyoung deleted the is-file-like-relax branch April 27, 2017 13:41
pcluo pushed a commit to pcluo/pandas that referenced this pull request May 22, 2017
Previously, we were requiring that
all file-like objects had "read,"
"write," "seek," and "tell" methods,
but that was too strict (e.g. read-only
buffers). This commit relaxes those
requirements to having EITHER "read"
or "write" as attributes.

Closes pandas-devgh-16135.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants