Skip to content

Fix correct warning with c engine when skipping lines #16455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

pankajp
Copy link
Contributor

@pankajp pankajp commented May 23, 2017

Fixed bug where c engine would not print warnings for lines
it skipped in case the skipped line had an inline comment.
Also, its accounting of number of fields in such lines would
be off by one.

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Can you also add a release note in doc/source/whatsnew/0.20.2.txt in the bug fixes section.

@@ -116,3 +118,29 @@ def test_commment_first_line(self):
expected = DataFrame({0: ['a', '1'], 1: ['b', '2'], 2: ['c', '3']})
result = self.read_csv(StringIO(data), comment='#', header=None)
tm.assert_frame_equal(result, expected)

def test_comment_whitespace_delimited(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use the tm.capture_stderr decorator from pandas/util/testing.py so you don't have to do the patching yourself. See

def test_skip_bad_lines(self):
for an example.

@jreback
Copy link
Contributor

jreback commented May 23, 2017

does this have an issue number?

@jreback jreback added the IO CSV read_csv, to_csv label May 23, 2017
@pankajp
Copy link
Contributor Author

pankajp commented May 24, 2017

@TomAugspurger I have added the whatsnew bug entry and replaced sys.stderr patching with tm.capture_stderr, and also moved the test to CParserTests since it is only applicable to the c engine.
The CI tests seem to fail on some pyarrow/feather code.

@jreback Added an issue for this: #16472

@jreback
Copy link
Contributor

jreback commented May 24, 2017

can you rebase on master, CI has been fixed

Pankaj Pandey added 4 commits May 24, 2017 16:12
Fixed bug where c engine would not print warnings for lines
it skipped in case the skipped line had an inline comment.
Also, its accounting of number of fields in such lines would
be off by one.
The behavior is only applicable on the `c` engine.
@pankajp pankajp force-pushed the fix-c-engine-comment-inline-warning branch from 468d68b to 44f4f00 Compare May 24, 2017 10:43
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, minor comment.

@@ -53,6 +53,7 @@ Indexing
I/O
^^^

- Bug in skipping error lines with inline comments in space delimited text files (:issue:`16472`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like this: Bug in pd.read_csv() when comment is passed in space deliminted text files.

@jreback
Copy link
Contributor

jreback commented May 24, 2017

@gfyoung if you can have a look

@jreback jreback added this to the 0.20.2 milestone May 24, 2017
@jreback jreback added the Bug label May 24, 2017
@codecov
Copy link

codecov bot commented May 24, 2017

Codecov Report

Merging #16455 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #16455   +/-   ##
=======================================
  Coverage   90.42%   90.42%           
=======================================
  Files         161      161           
  Lines       51029    51029           
=======================================
  Hits        46144    46144           
  Misses       4885     4885
Flag Coverage Δ
#multiple 88.26% <ø> (ø) ⬆️
#single 40.17% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6cbd558...6fd269d. Read the comment docs.

@codecov
Copy link

codecov bot commented May 24, 2017

Codecov Report

Merging #16455 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #16455   +/-   ##
=======================================
  Coverage   90.42%   90.42%           
=======================================
  Files         161      161           
  Lines       51029    51029           
=======================================
  Hits        46144    46144           
  Misses       4885     4885
Flag Coverage Δ
#multiple 88.26% <ø> (ø) ⬆️
#single 40.17% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6cbd558...6fd269d. Read the comment docs.

@gfyoung
Copy link
Member

gfyoung commented May 24, 2017

LGTM!

@TomAugspurger TomAugspurger merged commit 97ad3fb into pandas-dev:master May 24, 2017
@TomAugspurger
Copy link
Contributor

Thanks @pankajp!

@pankajp pankajp deleted the fix-c-engine-comment-inline-warning branch May 25, 2017 06:17
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request May 29, 2017
…as-dev#16455)

* Fix correct warning with c engine when skipping lines

Fixed bug where c engine would not print warnings for lines
it skipped in case the skipped line had an inline comment.
Also, its accounting of number of fields in such lines would
be off by one.

* Use `tm.capture_stderr` to capture stderr

* Add bug fix note in `whatsnew/v0.20.3.txt`

* Move test to CParserTests

The behavior is only applicable on the `c` engine.

* Update whatsnew bug entry as per review

(cherry picked from commit 97ad3fb)
TomAugspurger pushed a commit that referenced this pull request May 30, 2017
* Fix correct warning with c engine when skipping lines

Fixed bug where c engine would not print warnings for lines
it skipped in case the skipped line had an inline comment.
Also, its accounting of number of fields in such lines would
be off by one.

* Use `tm.capture_stderr` to capture stderr

* Add bug fix note in `whatsnew/v0.20.3.txt`

* Move test to CParserTests

The behavior is only applicable on the `c` engine.

* Update whatsnew bug entry as per review

(cherry picked from commit 97ad3fb)
stangirala pushed a commit to stangirala/pandas that referenced this pull request Jun 11, 2017
…as-dev#16455)

* Fix correct warning with c engine when skipping lines

Fixed bug where c engine would not print warnings for lines
it skipped in case the skipped line had an inline comment.
Also, its accounting of number of fields in such lines would
be off by one.

* Use `tm.capture_stderr` to capture stderr

* Add bug fix note in `whatsnew/v0.20.3.txt`

* Move test to CParserTests

The behavior is only applicable on the `c` engine.

* Update whatsnew bug entry as per review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect skipping of lines with inline comments and printing warnings
4 participants