ENH: Basis for a StringDtype using Arrow #35259
jorisvandenbossche merged 91 commits into pandas-dev:master from xhochy:arrow-string-array on Nov 20, 2020
Commits
4c2e37a
Implement BaseDtypeTests for ArrowStringDtype
xhochy d477ee7
Implement getitem
xhochy 206f493
Add basic copy implementation
xhochy d58dba6
Implement getitem for iterables
xhochy 7a9e2c3
Remove commented code
xhochy ffc4c0f
Implement more Setitem/Getitem variants
xhochy c1305ab
Review comments by @jorisvandenbossche
xhochy 13a42f7
Add Arrow issue numbers
xhochy decd022
Adapt to kernel renamings
xhochy 3145e44
Handle take(indices<0, allow_fill=False)
xhochy e22b348
Handle fill_value better
xhochy 4b8108c
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 2446562
fix doctest
simonjayhawkins a0dcc85
Revert "fix doctest"
simonjayhawkins 5c42173
change version for versionadded
simonjayhawkins 28c3ef2
code checks
simonjayhawkins 4044d4c
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 1740524
skip tests for pyarrow<1.0
simonjayhawkins e9bb36f
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 8ad120b
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 34bf57d
raise ImportError in constructors on pyarrow < 1.0.0. or not installed
simonjayhawkins f92241e
remove size, shape and ndim
simonjayhawkins c09382d
activate all extension array tests
simonjayhawkins bac64c1
string array tests
simonjayhawkins 0956147
Update pandas/core/arrays/string_arrow.py
simonjayhawkins 963e1cf
add a to_numpy() method and use from __array__
simonjayhawkins 87b8e67
mypy fixup
simonjayhawkins 1ed0585
remove workaround for ARROW-9407 and ci test on pyarrow=1.0.0
simonjayhawkins fa954f7
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 82b84bf
add _dtype class attribute
simonjayhawkins b1a3032
remove redundant integer indexing OOB and negative indexing checks in…
simonjayhawkins 08d34f4
check pyarrow array is string type in constructor
simonjayhawkins ae49807
basic _from_factorized pending discussion on performant factorisation
simonjayhawkins 2e5d4c7
update constructor error message and move test
simonjayhawkins c8318cc
add _concat_same_type classmethod
simonjayhawkins 1a200a2
_as_pandas_scalar to method
simonjayhawkins e10be80
copy/paste fillna from fletcher as baseline (29 failed)
simonjayhawkins c1d3087
minor cleanup of fillna (29 failed)
simonjayhawkins 34f563d
correct mistake in previous commit (25 failed)
simonjayhawkins f5fc4fd
add OpsMixin (23 failed)
simonjayhawkins a5a7c85
add binops (18 failed)
simonjayhawkins f651563
return Boolean array for comparison ops (12 failed)
simonjayhawkins f5419b9
fix ValueError: zero-size array to reduction operation maximum which …
simonjayhawkins 3af5ce0
copy/paste value_counts from fletcher as baseline (5 failed)
simonjayhawkins bdf4ad2
tidy imports
simonjayhawkins e044c7f
fix test_take_non_na_fill_value (4 failed)
simonjayhawkins c5625a8
fix test_take_pandas_style_negative_raises (3 failed)
simonjayhawkins 50889fb
parametrize string extension tests (3 failed)
simonjayhawkins 0e1773b
xfail other 2 tests expecting views (1 failed)
simonjayhawkins 7bb9574
add ensure_string_array to _from_sequence (1 failed)
simonjayhawkins fc45ef7
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 51d7d0a
Apply suggestions from code review
simonjayhawkins bd76a75
Merge branch 'arrow-string-array' of github.com:xhochy/pandas into ar…
simonjayhawkins 3cf5c91
return NotImplemented in comparisons (7 failed)
simonjayhawkins 07239a0
move arrow function lookup dict to module scope (7 failed)
simonjayhawkins 9a7cfc5
remove isinstance(other, (ABCSeries, ABCDataFrame, ABCIndex)) check
simonjayhawkins 2ba0dcd
remove na_value=cls._dtype.na_value from ensure_string_array call (7 …
simonjayhawkins 97c56e2
colocate _from_sequence_of_strings with _from_sequence (7 failed)
simonjayhawkins d6d3543
revert change to extra_compile_args in setup.py
simonjayhawkins ab40dce
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins d71a895
sync fillna docstring with base
simonjayhawkins f342b62
Apply suggestions from code review
simonjayhawkins 3d05c89
Merge branch 'arrow-string-array' of github.com:xhochy/pandas into ar…
simonjayhawkins b3c6347
other base.Base*Tests -> super()
simonjayhawkins 26bca25
len(item) == 0 -> not len(item)
simonjayhawkins 9579444
update copy docstring and return type
simonjayhawkins 88094a7
test_constructor_not_string_type_raises with np.ndarray
simonjayhawkins ba0cee8
update test_from_sequence_no_mutate (7 failed)
simonjayhawkins 6709ac3
change xfail message for base extension array tests (7 failed)
simonjayhawkins 11388b4
change xfail reason message in test_value_counts_na
simonjayhawkins eb284e7
skip test_memory_usage for ArrowStringArray
simonjayhawkins 27ce19a
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 9b70709
part implementation of na_value in to_numpy
simonjayhawkins 6757feb
remove is_array_like in __getitem__
simonjayhawkins 460ea38
Revert "remove is_array_like in __getitem__"
simonjayhawkins 7bee5e2
remove just is_array_like in __getitem__
simonjayhawkins 91f3763
Update pandas/core/arrays/string_arrow.py
simonjayhawkins 36b662a
Apply suggestions from code review
simonjayhawkins 7a9ef9c
lint fixup
simonjayhawkins 5db8788
xfail test_astype_roundtrip
simonjayhawkins c76c39f
update expected in test_arrow_array
simonjayhawkins 87b7863
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 24a782d
add fallback for scalar comparison ops
simonjayhawkins 353bff9
dispatch to pyarrow for comparison with np.ndarray (1 failed)
simonjayhawkins be93947
fix test_reindex_non_na_fill_value
simonjayhawkins 11eb08f
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins 52440a7
use fill_mask in pa indices_array
simonjayhawkins bd05c2c
add comment to __getitem__
simonjayhawkins 27c8de5
add comment on pyarrow compute
simonjayhawkins b6713e9
privatize `data`
simonjayhawkins 125cb6f
Merge remote-tracking branch 'upstream/master' into arrow-string-array
simonjayhawkins