-
Notifications
You must be signed in to change notification settings - Fork 11.7k
Fix Collection unique() to handle numeric strings correctly #57480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ad50a54 to
05e9a48
Compare
|
Shouldn't we drop Based on this warning:
Reference: https://www.php.net/manual/en/function.sort.php#refsect1-function.sort-notes This note is referenced from a comment within the bug report you mentioned: From my understanding of the PR's code, if any of the array's elements is not a scalar and not And if all of them are scalars, the But the note says:
So why keep it when the collection has any non-scalar values? |
|
I considered, but I don't know of an acceptable alternative for complex types. Honest truth is PHP needs to fix this in the source. But we have the benefit of a framework here :-) I found this issue also exists for |
…correctly Fixes bug where SORT_REGULAR incorrectly deduplicates numeric strings when a '+' prefix is present. Collections like ['+19495551234', '9495551234', '19495551234'] now correctly return all three items instead of incorrectly removing the 11-digit format. The fix checks if the collection contains complex types (arrays/objects) that require SORT_REGULAR, or scalar values where default array_unique() is faster and more correct. This works around a documented PHP limitation with SORT_REGULAR where numeric strings are compared as numbers in certain contexts, causing unexpected deduplication behavior. See: php/doc-en#1463 Related to: 48a53be
05e9a48 to
e9ae5e9
Compare
|
Would it make sense, to allow to pass the flags to |
I would argue that it shouldn't be necessary to pass the flag -- wouldn't feel like the "Laravel way." I think an argument can be made that I have to admit that even though my fix improves performance and passes all tests -- it breaks from the accepted behavior of An alternative could be this, which achieves an overall improvement of ~3x for public function unique($key = null, $strict = false)
{
if ($key === null && $strict === false) {
return new static(array_unique($this->items, SORT_REGULAR));
}elseif ($key === null && $strict === true) {
$hasComplexItems = false;
foreach ($this->items as $item) {
if (! is_scalar($item) && $item !== null) {
$hasComplexItems = true;
break;
}
}
if(! $hasComplexItems) {
return new static(array_unique($this->items, SORT_STRING));
}
}
$callback = $this->valueRetriever($key);
$exists = [];
return $this->reject(function ($item, $key) use ($callback, $strict, &$exists) {
if (in_array($id = $callback($item, $key), $exists, $strict)) {
return true;
}
$exists[] = $id;
});
} |
|
I want to put this one to rest. I think |
|
I doubt we change the behavior of unique() in any way on a patch release. |
Fixes bug where
SORT_REGULARcorrectly deduplicates numeric strings when a+prefix is present, causing mass confusion and hysteria, in a world where "unique" is meant to mean "unlike anything else" haha.The Bug
You can see the wild behavior of
SORT_REGULARhere: https://3v4l.org/UTnXH#v8.4.14This affects phone numbers in E.164 format -- and likely other important things forever lost in the universe.
Taylor, if you've read this far, thank you for being an amazing human being!
The PHP docs note the default sort behavior of
array_uniqueis effectively strict eg.(string) $elem1 === (string) $elem2. The PHP docs also note that "array_unique() is not intended to work on multidimensional arrays." However,SORT_REGULARbreaks from this documented behavior, and essentially provides a loose comparison that can be used with multi-dimensional arrays.Ultimately, I feel
Collection::unique()should effectively align with the default behavior ofarray_unique()when the array passed to theCollectionis not multidimensional. This is my attempt to do just that, in a BC and performant manner. I'll leave the proper multidimensional array unique function to the PHP gods.The Fix
Checks if the collection contains complex types (arrays/objects) requiring
SORT_REGULAR. If the Collection is one of scalar values, it will use the faster and more accuratearray_unique().Performance
Benchmark: https://gist.github.com/jmarble/3e7960efba8fef8814686d0f0db07bbf
References