Commit ab3291d

Sereger13 authored and jreback committed
BUG: Bug in .read_csv with dtype specified on empty data producing an error
closes #12048
1 parent f4f74f9 commit ab3291d

3 files changed, +19 -2 lines changed

doc/source/whatsnew/v0.18.0.txt

Lines changed: 2 additions & 2 deletions
@@ -531,12 +531,12 @@ of columns didn't match the number of series provided (:issue:`12039`).
 
 
 - Bug in ``.groupby`` where a ``KeyError`` was not raised for a wrong column if there was only one row in the dataframe (:issue:`11741`)
+- Bug in ``.read_csv`` with dtype specified on empty data producing an error (:issue:`12048`)
+- Bug in building *pandas* with debugging symbols (:issue:`12123`)
 
 
 - Removed ``millisecond`` property of ``DatetimeIndex``. This would always raise a ``ValueError`` (:issue:`12019`).
 - Bug in ``Series`` constructor with read-only data (:issue:`11502`)
 
 - Bug in ``.loc`` setitem indexer preventing the use of a TZ-aware DatetimeIndex (:issue:`12050`)
 - Big in ``.style`` indexes and multi-indexes not appearing (:issue:`11655`)
-
-- Bug in building Pandas with debugging symbols (:issue:`12123`)

pandas/io/parsers.py

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,7 @@
 from __future__ import print_function
 from pandas.compat import range, lrange, StringIO, lzip, zip, string_types, map
 from pandas import compat
+from collections import defaultdict
 import re
 import csv
 import warnings
@@ -2264,6 +2265,8 @@ def _get_empty_meta(columns, index_col, index_names, dtype=None):
     if dtype is None:
         dtype = {}
     else:
+        if not isinstance(dtype, dict):
+            dtype = defaultdict(lambda: dtype)
         # Convert column indexes to column names.
         dtype = dict((columns[k] if com.is_integer(k) else k, v)
                      for k, v in compat.iteritems(dtype))
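
The hunk above wraps a non-dict `dtype` in a `defaultdict` so that any column name resolves to the same dtype. The following is a minimal standalone sketch of that idea, not the pandas implementation: `broadcast_dtype` is a hypothetical helper, and `isinstance(k, int)` stands in for `com.is_integer`. Because Python closures look up names when they are called, the sketch keeps the scalar under its own name (`scalar`) instead of reusing the variable that is rebound to the `defaultdict`.

```python
from collections import defaultdict


def broadcast_dtype(dtype, columns):
    """Hypothetical helper (not part of pandas): map column names to dtypes."""
    if dtype is None:
        return {}
    if not isinstance(dtype, dict):
        # Keep the scalar under its own name so the lambda below
        # does not end up returning the defaultdict itself.
        scalar = dtype
        per_column = defaultdict(lambda: scalar)
        return {col: per_column[col] for col in columns}
    # Convert integer column positions to column names.
    return {(columns[k] if isinstance(k, int) else k): v
            for k, v in dtype.items()}


result_scalar = broadcast_dtype(str, ["A", "B"])
result_mapping = broadcast_dtype({0: "float64", "B": "int64"}, ["A", "B"])
```

With a scalar, every listed column receives that dtype; with a mapping, integer keys are translated to the column names at those positions and string keys pass through unchanged.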

pandas/io/tests/test_parsers.py

Lines changed: 14 additions & 0 deletions
@@ -695,6 +695,14 @@ def test_passing_dtype(self):
                               dtype={'A': 'timedelta64', 'B': 'float64'},
                               index_col=0)
 
+        with tm.assertRaisesRegexp(ValueError,
+                                   "The 'dtype' option is not supported"):
+
+            # empty frame
+            # GH12048
+            self.read_csv(StringIO('A,B'), dtype=str)
+
+
     def test_quoting(self):
         bad_line_small = """printer\tresult\tvariant_name
 Klosterdruckerei\tKlosterdruckerei <Salem> (1611-1804)\tMuller, Jacob
@@ -3588,6 +3596,12 @@ def test_passing_dtype(self):
             self.assertRaises(TypeError, self.read_csv, path, dtype={'A': 'timedelta64', 'B': 'float64'},
                               index_col=0)
 
+        # empty frame
+        # GH12048
+        actual = self.read_csv(StringIO('A,B'), dtype=str)
+        expected = DataFrame({'A': [], 'B': []}, index=[], dtype=str)
+        tm.assert_frame_equal(actual, expected)
+
     def test_dtype_and_names_error(self):
 
         # GH 8833
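
The second added test reads a header-only CSV with a scalar `dtype`. A rough way to reproduce that case outside the test suite is sketched below, assuming a pandas release that contains this change and the default parser engine. The variable names are illustrative only.

```python
import io

import pandas as pd

# Header-only input: two column names and no data rows.
header_only = io.StringIO("A,B")

# The scalar dtype applies to every column, even though there are no rows to convert.
frame = pd.read_csv(header_only, dtype=str)

column_names = list(frame.columns)
row_count = len(frame)
```

The result should be an empty frame that still carries both column names, which is what the commit's `assert_frame_equal` check compares against.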
