Undesired behaviour using to_csv with QUOTE_NONE #10783

Closed
pronojitsaha opened this issue Aug 10, 2015 · 4 comments
Labels
IO CSV read_csv, to_csv Usage Question

Comments

@pronojitsaha

Hi,
My data frame contains unicode strings. When I call to_csv without any quoting parameter, I get a CSV file in which certain fields (2 out of 11, to be exact) are wrapped in double quotes. For example:
28,"May 8, 2013 10:22 AM","76th St, Yates Blvd",41.75717,-87.56633,3,-,1,4,-
When I pass quoting=csv.QUOTE_NONE instead, I get the following error:
Error: need to escape, but no escapechar set
So I pass quoting=csv.QUOTE_NONE together with escapechar='"', and I get this:
28,May 8", 2013 10:22 AM,76th St", Yates Blvd,41.75717,-87.56633,3,-,1,4,-
Now a stray quote (") appears before each comma in those 2 fields. How can I get around this problem? Thanks.

Sample code is below:

import csv
import pandas as pd

# One-row sample frame; every value is a unicode string.
id = [u'1']
date = [u'Jan 12, 2013 08:30 AM']
location = [u'Ashland Avenue, Polk Street']
vehicles = [u'4']
drunken_persons = [u'-']
fatalities = [u'3']
persons = [u'4']
pedestrians = [u'-']

data = pd.DataFrame({'#':id, 'Date':date, 'Vehicles':vehicles, 'Drunken Persons':drunken_persons, 'Fatalities':fatalities, 'Persons':persons,  'Pedestrians':pedestrians, 'Location':location})

# Fix the column order, cast to str and write without quoting.
# Passing '"' as escapechar is what puts a quote before each embedded comma.
data[['#','Date','Location','Vehicles','Drunken Persons','Fatalities','Persons', 'Pedestrians']].astype(str).to_csv('Test.csv',  sep=',', index = False, quoting=csv.QUOTE_NONE, escapechar='"')

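For reference, the first two behaviours described in this report can be reproduced without pandas. This is a minimal sketch that assumes to_csv forwards quoting and escapechar to the writer in Python's csv module; the helper write_row and the shortened row are illustrative names, not part of the original report.

```python
import csv
import io

# Shortened version of the sample row; two fields contain the delimiter.
row = ["1", "Jan 12, 2013 08:30 AM", "Ashland Avenue, Polk Street", "4"]


def write_row(fields, **fmt):
    """Write one row with csv.writer and return the resulting text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **fmt).writerow(fields)
    return buf.getvalue()


# Default quoting (csv.QUOTE_MINIMAL) wraps only the fields that contain a comma.
default_out = write_row(row)

# QUOTE_NONE forbids quoting, so the writer has to escape the embedded commas
# and fails because no escapechar is configured.
try:
    write_row(row, quoting=csv.QUOTE_NONE)
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

print(default_out, end="")
print(error_message)
```

The quotes in the default output are therefore not a defect: they are the only thing that lets a reader tell a comma inside a field from a comma between fields.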
@kawochen
Contributor

Maybe consider not using commas in your data or using something other than commas (like pipes) to delimit your file?
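A minimal sketch of the pipe suggestion, written against the standard-library csv module on the assumption that to_csv(sep='|', quoting=csv.QUOTE_NONE) behaves the same way. The row values are taken from the sample code in the question; the variable names are illustrative.

```python
import csv
import io

row = ["1", "Jan 12, 2013 08:30 AM", "-", "3",
       "Ashland Avenue, Polk Street", "-", "4", "4"]

# With '|' as the delimiter, the commas inside the fields are ordinary
# characters, so QUOTE_NONE needs neither quoting nor an escapechar.
buf = io.StringIO()
csv.writer(buf, delimiter="|", quoting=csv.QUOTE_NONE,
           lineterminator="\n").writerow(row)
pipe_out = buf.getvalue()
print(pipe_out, end="")

# The same limitation returns as soon as a field contains the new delimiter.
try:
    csv.writer(io.StringIO(), delimiter="|",
               quoting=csv.QUOTE_NONE).writerow(["a|b"])
    pipe_in_data_fails = False
except csv.Error:
    pipe_in_data_fails = True
```

Changing the delimiter only moves the problem: it works here because no field in the sample data contains a pipe.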

@jorisvandenbossche
Member

This is the expected behaviour, since you explicitly told to_csv to use the quote character to escape the commas by passing escapechar='"' (docstring explanation of escapechar: "One-character string used to escape delimiter when quoting is QUOTE_NONE.").

E.g. using a backslash to escape the comma:

In [123]: print(data.to_csv(quoting=csv.QUOTE_NONE, escapechar='\\'))
,#,Date,Drunken Persons,Fatalities,Location,Pedestrians,Persons,Vehicles
0,1,Jan 12\, 2013 08:30 AM,-,3,Ashland Avenue\, Polk Street,-,4,4

What exactly do you want to obtain?
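Backslash-escaped output like the above is only useful if the consumer undoes the escaping. A minimal round-trip sketch using the standard-library csv module, assuming the reader is configured with the same quoting and escapechar as the writer; the names fmt, escaped_out and parsed are illustrative.

```python
import csv
import io

row = ["1", "Jan 12, 2013 08:30 AM", "-", "3",
       "Ashland Avenue, Polk Street", "-", "4", "4"]

# Writer and reader must agree on these settings for the round trip to work.
fmt = dict(quoting=csv.QUOTE_NONE, escapechar="\\")

# Write: each embedded comma is prefixed with a backslash instead of quoting.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n", **fmt).writerow(row)
escaped_out = buf.getvalue()
print(escaped_out, end="")

# Read: the backslash is consumed and the comma is kept as field data.
parsed = next(csv.reader(io.StringIO(escaped_out), **fmt))
```

A reader that is not told about the escape character would split those two fields at the comma and keep the backslash as data.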

@jorisvandenbossche jorisvandenbossche added this to the No action milestone Aug 11, 2015
@jorisvandenbossche
Member

I am closing this issue as it is not a bug, but we can certainly discuss further what you are looking for.

@pronojitsaha
Author

@kawochen using '|' as delimiter solved the problem. I got the following output:

1|Jan 12, 2013 08:30 AM|-|3|Ashland Avenue, Polk Street|-|4|4

@jorisvandenbossche I understand my mistake now: using ',' as the delimiter when one of the fields already contained ',' in its data is what caused the difficulty with csv.QUOTE_NONE.

Thanks guys.
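The conclusion of the thread can be turned into a check that runs before writing: pick a delimiter that does not occur anywhere in the data. The function pick_delimiter and its candidate list below are hypothetical, a sketch rather than anything pandas provides.

```python
def pick_delimiter(rows, candidates=(",", "|", "\t", ";")):
    """Return the first candidate that appears in no field of any row."""
    for delimiter in candidates:
        if not any(delimiter in str(field) for row in rows for field in row):
            return delimiter
    raise ValueError("every candidate delimiter occurs in the data")


rows = [["1", "Jan 12, 2013 08:30 AM", "Ashland Avenue, Polk Street", "4"]]

# ',' is rejected because two fields contain it, so '|' is chosen.
chosen = pick_delimiter(rows)
print(repr(chosen))
```

The result could then be passed as sep to to_csv together with quoting=csv.QUOTE_NONE. This sketch only checks for the delimiter; as far as I know, fields containing the quote character or a line break can still require an escapechar under QUOTE_NONE.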
