add data_columns to doc string #13065
Not sure what you mean by this. pandas has tests for each option, of course. Doc-strings in the past were not audited as much as they are currently. The doc-string should match for the same keyword args. Please change to match.
@@ -1084,6 +1084,9 @@ def to_hdf(self, path_or_buf, key, **kwargs):
/ selecting subsets of the data
append : boolean, default False
For Table formats, append the input data to the existing
data_columns : list of columns to create as data columns, or True to
use all columns. This will create additional indexed columns for
on-disk queries, by default only 'index' and 'columns' are indexed.
What is 'columns'?
The same 'columns' as written in your docs? See here: http://pandas.pydata.org/pandas-docs/stable/io.html#querying-a-table
By default the axes of the object are indexed (e.g. index for Series, index and columns for DataFrame).
I see, it's just not really clear from the docstring (but it would also take too much space there to explain).
maybe it should be improved in the io docs?
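For readers following the thread, here is a minimal sketch of what `data_columns` changes in practice. It assumes pandas with PyTables installed; the function name, the file name `demo.h5`, and the frame contents are made up for illustration and are not taken from the PR.

```python
import os
import tempfile

import pandas as pd


def query_by_data_column():
    """Write a small frame in table format and select rows on disk by column 'B'."""
    # Illustrative data only; not from the PR.
    df = pd.DataFrame({"A": [1, 2, 3], "B": [0.5, 1.5, 2.5]})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.h5")
        # format="table" is needed for on-disk queries; data_columns=["B"]
        # makes 'B' usable in a `where` expression in addition to the index.
        df.to_hdf(path, key="df", format="table", data_columns=["B"])
        # The filter is applied while reading, not after loading everything.
        return pd.read_hdf(path, key="df", where="B > 1")
```

Calling `query_by_data_column()` should return only the rows where 'B' exceeds 1. Column 'A' was not declared as a data column here, so it is stored but is not available for this kind of `where` selection.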
you can expand the |
I just noticed that columns in HDFStore.append() is not supported anymore:

    if columns is not None:
        raise TypeError("columns is not a supported keyword in append, "
                        "try data_columns")

Should the |
@michaelaye what does |
That's what it looks like, yes. But it's still there in the method signature; see above, where you copied it in yourself.
Current coverage is 84.14%
@@ master #13065 diff @@
==========================================
Files 138 137 -1
Lines 50587 50261 -326
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
- Hits 42593 42288 -305
+ Misses 7994 7973 -21
Partials 0 0
@@ -1084,6 +1084,9 @@ def to_hdf(self, path_or_buf, key, **kwargs):
/ selecting subsets of the data
append : boolean, default False
For Table formats, append the input data to the existing
data_columns : list of columns to create as data columns, or True to
data_columns : list of columns, or True, default None
...explanation on next line
What about the obsolete "columns" keyword in the Store.append() call? Different problem I guess, but I don't really understand why it is kept. It can't be for backwards compatibility, because it raises a TypeError now, and it would do the same if the keyword were removed and somebody still tried to use it.
It's to provide a custom message with a hint, meaning
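The pattern being discussed, keeping a removed keyword in the signature only so that callers get a pointed error, can be sketched roughly as below. The function and its return value are hypothetical stand-ins, not pandas' implementation; only the error message is copied from the snippet quoted in this thread.

```python
def append(key, value, columns=None, data_columns=None):
    """Hypothetical stand-in for an append method; not pandas' real code."""
    # `columns` stays in the signature only so that old callers get a
    # specific hint instead of a generic "unexpected keyword argument" error.
    if columns is not None:
        raise TypeError("columns is not a supported keyword in append, "
                        "try data_columns")
    # Placeholder result so the sketch has something observable to return.
    return {"key": key, "rows": len(value), "data_columns": data_columns}
```

If the keyword were dropped from the signature instead, Python would still raise a TypeError, but its message would not point the caller to `data_columns`.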
This looks good to me. @jreback ? |
data_columns : list of columns, or True, default None
This will create additional indexed columns for on-disk queries,
by default only 'index' and 'columns' are indexed. True will index
all columns.
This is only true for a DataFrame (and not true for other objects); "by default the axes are indexed, e.g. index and columns" might be better.
I'm sorry, it's not clear exactly what you want changed, and how.
This is a generic doc-string that applies when you are looking at Series, DataFrame, or Panel, so it should either be somewhat generic or specific to that type (which is harder and not worth it). You are saying 'index' and 'columns', which don't apply to Series/Panel.
How can we make this specific to Series, DataFrame and Panel? Considering the complexity of pandas, I find it highly important to have the docstrings as useful as possible.
I think in this case it might not be easy to do this, because when someone is calling `HDFStore.append` there isn't any object at that point. So you can just give an example, e.g. `Series -> 'index'`, `DataFrame -> 'index' and 'columns'`, `Panel -> 'items', 'major_axis', 'minor_axis'` (put them in bullet points).
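One way to picture the suggested bullet points is to generate them from a plain mapping. This is only a sketch: the constant and the helper name are hypothetical, and the axis names are the ones listed in the comment above.

```python
# Default indexed axes per object type, as listed in the comment above.
DEFAULT_INDEXED_AXES = {
    "Series": ["index"],
    "DataFrame": ["index", "columns"],
    "Panel": ["items", "major_axis", "minor_axis"],
}


def axes_bullets(axes_by_type):
    """Render the mapping as docstring bullet points (hypothetical helper)."""
    lines = []
    for name, axes in axes_by_type.items():
        quoted = ", ".join("'{}'".format(axis) for axis in axes)
        lines.append("- {}: {}".format(name, quoted))
    return "\n".join(lines)


print(axes_bullets(DEFAULT_INDEXED_AXES))
```

The printed lines could be pasted into a generic docstring so that it reads correctly regardless of which object type the reader has in hand.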
So, for a Panel I actually can not add columns to the index?
If that's true, how about then I add the docstring like so:
data_columns : only applicable to DataFrames, see examples
and then I add the usage examples with the suggestions you made above?
can you rebase / update? |
I see the problem now, maybe this is the reason why the docstring was left out in the first place? What should I do? I guess the |
can you update |
I repeat my previous question from May 25 (under the code discussion at the change overview), for which I am still waiting for guidance: "So, for a Panel I actually can not add columns to the index?
If that's true, how about then I add the docstring like so:
data_columns : only applicable to DataFrames, see examples
and then I add the usage examples with the suggestions you made above?"
@michaelaye I don't think you can use a shared doc here, as this is a doc-string on the class (and not on an object yet). So a more general doc-string should be enough (e.g. give an example for Series, DataFrame, and Panel showing which axes you can use).
@michaelaye I cherry-picked your commit to combine it with the changes of the other merged PR |
-> rebased in #14046 |
git diff upstream/master | flake8 --diff
How about this for #13061 ?
I don't think pandas has tests for each documented field, correct?
Do you want a whatsnew entry for this? If yes, how does that work? The guidelines don't talk about that.