Skip to content

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented Jan 4, 2017

Raise ValueError or issue ParserWarning (as we do with other unsupported features) when a multi-char quotechar is passed in, and the C engine is used.

Closes #11592.

@gfyoung gfyoung force-pushed the multilen-quotechar-c-engine branch from c9e624d to a06b68b Compare January 4, 2017 06:24
@codecov-io
Copy link

codecov-io commented Jan 4, 2017

Current coverage is 84.76% (diff: 100%)

Merging #15050 into master will increase coverage by <.01%

@@             master     #15050   diff @@
==========================================
  Files           145        145          
  Lines         51146      51151     +5   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43350      43356     +6   
+ Misses         7796       7795     -1   
  Partials          0          0          

Powered by Codecov. Last update de09c98...68d69fe

@gfyoung gfyoung force-pushed the multilen-quotechar-c-engine branch from a06b68b to 61babcc Compare January 4, 2017 08:35
@gfyoung
Copy link
Member Author

gfyoung commented Jan 5, 2017

@jreback : Any comments on this PR? Otherwise, everything is green and ready to go.

@sinhrks sinhrks added Bug IO CSV read_csv, to_csv labels Jan 5, 2017
@@ -246,6 +246,7 @@ Other API Changes
- ``DataFrame.applymap()`` with an empty ``DataFrame`` will return a copy of the empty ``DataFrame`` instead of a ``Series`` (:issue:`8222`)

- ``pd.read_csv()`` will now issue a ``ParserWarning`` whenever there are conflicting values provided by the ``dialect`` parameter and the user (:issue:`14898`)
- ``pd.read_csv()`` will now raise a ``ValueError`` for the C engine if the quote character has a length greater than one (:issue:`11592`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First I misread as multiple characters issue. Should it say single byte character?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds more logical indeed. Done.

engine not in ('python', 'python-fwf')):
fallback_reason = ("the quotechar is > 1 char long, and "
"the 'c' engine does not support such "
"quotechars".format(encoding=encoding))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encoding is not needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's nice to the reason is more clearer, like "quotechar must be a single byte character, which ord(quotechar) <= 127". pls correct my English:)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch fixed.

Raise ValueError or issue ParserWarning
when a multi-char quotechar is passed in,
and the C engine is used.

Closes pandas-devgh-11592.
@gfyoung gfyoung force-pushed the multilen-quotechar-c-engine branch from 61babcc to 68d69fe Compare January 5, 2017 10:54
@gfyoung
Copy link
Member Author

gfyoung commented Jan 5, 2017

@sinhrks : Addressed all of your comments, and everything is still green.

@jorisvandenbossche jorisvandenbossche added this to the 0.20.0 milestone Jan 5, 2017
@jorisvandenbossche jorisvandenbossche merged commit db08da2 into pandas-dev:master Jan 5, 2017
@gfyoung gfyoung deleted the multilen-quotechar-c-engine branch January 5, 2017 23:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants