Description
Problem description
As a result of the discussion in issue #27394, read_csv was changed so that it raises an error when both header and prefix are different from None. A user had misunderstood how to (not) use header and prefix together. I think that the usage of names and prefix can be misunderstood in a similar way.
It could also be that a user accidentally provides both arguments and expects prefix to have effect. Right now, it seems like prefix is silently ignored when names is provided.
Describe the solution you'd like
Raise a ValueError when the read_csv arguments prefix and names both differ from None, in accordance with issue #27394 and pull request #31383.
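A minimal sketch of the kind of check I have in mind. The function name validate_prefix_and_names and the exact error message are assumptions made for illustration; they are not taken from the pandas source, and the real check would presumably live next to the existing header/prefix check in the parser.

```python
def validate_prefix_and_names(prefix=None, names=None):
    """Hypothetical helper: reject calls that set both prefix and names.

    Mirrors the existing header/prefix check in spirit; the message
    wording below is an assumption, not the pandas message.
    """
    if prefix is not None and names is not None:
        raise ValueError(
            "Argument prefix must be None if argument names is not None"
        )


# Either argument on its own is accepted.
validate_prefix_and_names(prefix="XZ")
validate_prefix_and_names(names=["a", "b", "c"])

# Both together raise, instead of prefix being silently ignored.
try:
    validate_prefix_and_names(prefix="XZ", names=["a", "b", "c"])
except ValueError as err:
    print(err)
```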
API breaking implications
This will "break" code that passes values (!=None) for both prefix and names, but since it was an accepted solution for issue #27394, I think it could be used here as well.
Describe alternatives you've considered
Another possibility is to issue a warning instead of an error, but that would be inconsistent with the behavior when prefix and header are both not None.
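For comparison, the warning alternative could look roughly like this. It is only a sketch: the function name warn_if_prefix_ignored, the warning category and the message are all assumptions, not anything that exists in pandas.

```python
import warnings


def warn_if_prefix_ignored(prefix=None, names=None):
    """Hypothetical helper: warn, rather than raise, when prefix has no effect."""
    if prefix is not None and names is not None:
        warnings.warn(
            "Argument prefix is ignored because names was provided",
            UserWarning,
            stacklevel=2,
        )


# Parsing would continue with the columns from names; the caller only
# gets a warning that prefix had no effect.
warn_if_prefix_ignored(prefix="XZ", names=["a", "b", "c"])
```

With this variant, existing code keeps running, but the names/prefix case would behave differently from the header/prefix case, which already raises.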
Additional context
Examples below are run in pandas version 1.3.0.dev0+210.g9f1a41dee.
When I run
pandas.read_csv("my_data.csv", prefix="XZ", header=0)
I get the following output:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 605, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 457, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 814, in __init__
    self._engine = self._make_engine(self.engine)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1045, in _make_engine
    return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1853, in __init__
    ParserBase.__init__(self, kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1334, in __init__
    raise ValueError(
ValueError: Argument prefix must be None if argument header is not None
but running
pandas.read_csv("my_data.csv", prefix="XZ", names=["a", "b", "c"])
works fine, except that prefix is silently ignored.