
Conversation

jblackburne (Contributor):

Slight change to the logic in the tokenize_delimited() and tokenize_delim_customterm() functions of the C parser.

Fixes #10022.

I believe the new logic is correct, but perhaps someone with more familiarity can double-check.
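A minimal sketch of the failure mode, using the same input as the unit test this PR adds ('\n hello\nworld\n'); the exact reproduction attached to #10022 may differ:

from io import StringIO

import pandas as pd

data = '\n hello\nworld\n'
df = pd.read_csv(StringIO(data), header=None, engine='c')
print(df)
# Before the fix, the C tokenizer could emit a spurious all-NaN row when a
# chunk of input begins with the line terminator; after the fix the frame
# should hold exactly the two data rows.
assert len(df) == 2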

jreback (Contributor) commented Apr 29, 2015:

You need a test; as it stands, I'm not really sure what you are fixing.

jblackburne (Contributor, Author):

Ok, test forthcoming.

jblackburne (Contributor, Author):

There is a small self-contained test in the comments on issue #10022. Would it be desirable to make it into a unit test? It takes about a second to run on my machine.

jreback (Contributor) commented Apr 30, 2015:

Yep. The idea would be to add the test; it fails on master, and after your fix everything passes.

jreback added the IO CSV (read_csv, to_csv) label on Apr 30, 2015
jblackburne (Contributor, Author):

Unit test added. Any further comments?

@@ -359,6 +359,11 @@ def test_empty_field_eof(self):
                         names=list('abcd'), engine='c')
        assert_frame_equal(df, c)

    def test_chunk_begins_with_newline_whitespace(self):
        data = '\n hello\nworld\n'
Review comment (Contributor):

add the issue number as a comment here

jreback (Contributor) commented May 7, 2015:

Even though the fix is only in the C parser, IIRC this should work in the Python parser as well. Hence, please move the tests to test_parser.py, which tests with all parsers.
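A hedged sketch of an engine-agnostic check (illustrative only; the real pandas suite drives this through the shared test classes in test_parser.py, and the expected frame assumes the default skip_blank_lines and skipinitialspace behaviour):

from io import StringIO

import pandas as pd
from pandas.testing import assert_frame_equal

# see gh-10022: a chunk beginning with a newline must not produce an all-NaN row
data = '\n hello\nworld\n'
expected = pd.DataFrame([[' hello'], ['world']])

for engine in ('c', 'python'):
    result = pd.read_csv(StringIO(data), header=None, engine=engine)
    assert_frame_equal(result, expected)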

jreback added this to the 0.17.0 milestone on May 7, 2015
jreback (Contributor) commented May 7, 2015:

@jblackburne looks good.

  • Please add a release note in v0.16.1.
  • Please squash.
  • Ping when green; see here for detailed git instructions if you need them.

jreback modified the milestones: 0.16.1, 0.17.0 on May 7, 2015
jblackburne (Contributor, Author):

Squash all into a single commit?

shoyer (Member) commented May 7, 2015:

@jblackburne Yes, that's what @jreback is asking for.

…that start with newline.

Changed a condition in tokenize_delim_customterm to account for data chunks that start with terminator.

Added a unit test that fails in master and passes in this branch.

Moved new unit test in order to test all parser engines. Added GH issue number.

Added release note.
jblackburne force-pushed the read_csv-newline-chunk branch from 8f37413 to e693c3a on May 7, 2015 18:46
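The squashed commit above notes that a condition in tokenize_delim_customterm was changed to account for data chunks that start with the terminator. Purely as an illustration of that class of bug (a toy sketch, not the pandas C tokenizer; all names here are invented):

def tokenize(chunks, terminator='\n'):
    records = []
    current = None  # None means no record has been opened yet
    for chunk in chunks:
        for ch in chunk:
            if ch == terminator:
                if current is not None:
                    records.append(current)  # terminator closes an open record
                    current = None
                # else: a terminator at the start of a record (e.g. the chunk
                # begins with '\n') must emit nothing, otherwise a spurious
                # empty/NaN row appears
            else:
                current = (current or '') + ch
    if current is not None:
        records.append(current)
    return records

# The test data from this PR, split so the second chunk begins with the
# terminator; expect [' hello', 'world'], not an extra empty record.
print(tokenize(['\n hello', '\nworld\n']))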
jreback (Contributor) commented May 7, 2015:

Please take a look.

cc @evanpw
cc @mdmueller
cc @selasley

evanpw (Contributor) commented May 7, 2015:

The logic looks right to me.

jreback (Contributor) commented May 8, 2015:

@jblackburne please ping when this is green.

jblackburne (Contributor, Author):

Ok, green.

jreback added a commit that referenced this pull request on May 8, 2015
jreback merged commit 2840bea into pandas-dev:master on May 8, 2015
jblackburne deleted the read_csv-newline-chunk branch on May 9, 2015 00:42
Labels: IO CSV (read_csv, to_csv)
Linked issue (may be closed by merging this pull request): read_csv(engine='c') can insert spurious rows full of NaNs (#10022)
4 participants