@jmarble jmarble commented Oct 21, 2025

Fixes a bug where SORT_REGULAR incorrectly deduplicates numeric strings when a + prefix is present, causing mass confusion and hysteria in a world where "unique" is meant to mean "unlike anything else" haha.

The Bug

You can see the wild behavior of SORT_REGULAR here: https://3v4l.org/UTnXH#v8.4.14

This affects phone numbers in E.164 format -- and likely other important things forever lost in the universe.
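For readers who do not want to follow the link, here is a minimal standalone sketch of the behavior, using the three phone-number variants from this PR's commit message. The variable names are only illustrative.

```php
<?php

// The three phone-number variants from the PR's commit message.
$numbers = ['+19495551234', '9495551234', '19495551234'];

// Both operands are numeric strings, so loose comparison ignores the "+".
var_dump('+19495551234' == '19495551234');
var_dump('+19495551234' === '19495551234');

// SORT_REGULAR compares items loosely, so the 11-digit entry is dropped.
$loose = array_unique($numbers, SORT_REGULAR);

// The default flag (SORT_STRING) compares string representations,
// so all three entries survive.
$default = array_unique($numbers);

printf("SORT_REGULAR: %d items, default: %d items\n", count($loose), count($default));
```

With SORT_REGULAR the result should contain two items, while the default flag keeps all three.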

Taylor, if you've read this far, thank you for being an amazing human being!

The PHP docs note that the default comparison in array_unique() is effectively strict on string representations, i.e. (string) $elem1 === (string) $elem2. The PHP docs also note that "array_unique() is not intended to work on multidimensional arrays." However, SORT_REGULAR departs from this documented behavior and essentially provides a loose comparison that can be used with multidimensional arrays.
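A small sketch of the two comparison modes described above. The first input is the example from the array_unique() manual page; the nested array is an illustrative assumption.

```php
<?php

// Default flag (SORT_STRING): items are equal when their string casts are identical.
// The first occurrence of each value is kept, with its original key.
$scalars = array_unique([4, '4', '3', 4, 3, '3']);

// SORT_REGULAR: items are compared loosely, which also works for nested arrays.
// The duplicate [1, 2] row is removed here.
$nested = array_unique([[1, 2], [1, 2], [3]], SORT_REGULAR);

var_dump($scalars, $nested);
```

The first call leaves int 4 at key 0 and string '3' at key 2; the second leaves two rows.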

Ultimately, I feel Collection::unique() should effectively align with the default behavior of array_unique() when the array passed to the Collection is not multidimensional. This is my attempt to do just that, in a BC and performant manner. I'll leave the proper multidimensional array unique function to the PHP gods.

The Fix

Checks whether the collection contains complex types (arrays/objects) that require SORT_REGULAR. If the Collection holds only scalar values, it uses the default array_unique(), which is faster and more accurate.
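The check can be sketched roughly as follows. This is not the PR's actual diff: uniqueValues() is a hypothetical standalone helper, and it operates on a plain array rather than a Collection.

```php
<?php

/**
 * Hypothetical helper (not part of Laravel): choose the array_unique()
 * flag based on whether any item is an array or object.
 */
function uniqueValues(array $items): array
{
    foreach ($items as $item) {
        // Anything that is neither scalar nor null counts as "complex".
        if (! is_scalar($item) && $item !== null) {
            // Complex items need the loose comparison, so keep SORT_REGULAR.
            return array_unique($items, SORT_REGULAR);
        }
    }

    // Only scalars and nulls: use the default string comparison.
    return array_unique($items);
}

$phones = uniqueValues(['+19495551234', '9495551234', '19495551234']);
$rows = uniqueValues([['id' => 1], ['id' => 1], ['id' => 2]]);

printf("%d phone numbers, %d rows\n", count($phones), count($rows));
```

Under these assumptions the phone numbers all survive, while the duplicate row is still removed through SORT_REGULAR.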

Performance

  • ~1.4x overall speedup (19,000 iterations)
  • ~5.9x faster for numeric strings
  • Same speed for arrays/objects

Benchmark: https://gist.github.com/jmarble/3e7960efba8fef8814686d0f0db07bbf

References

@jmarble jmarble force-pushed the fix-collection-unique-sort-regular branch from ad50a54 to 05e9a48 Compare October 21, 2025 23:06
@rodrigopedra
Contributor

Shouldn't we drop SORT_REGULAR altogether?

Based on this warning:

Warning Be careful when sorting arrays with mixed types values because sort() can produce unexpected results, if flags is SORT_REGULAR.

Reference: https://www.php.net/manual/en/function.sort.php#refsect1-function.sort-notes

This note is referenced from a comment within the bug report you mentioned:

php/doc-en#1463 (comment)

From my understanding of the PR's code, if any of the array's elements is not a scalar and not NULL, the SORT_REGULAR is kept as before.

And if all of them are scalars, the SORT_REGULAR is dropped, right?

But the note says:

sorting arrays with mixed types values ... can produce unexpected results

So why keep it when the collection has any non-scalar values?

@jmarble
Author

jmarble commented Oct 21, 2025

I considered it, but I don't know of an acceptable alternative for complex types. The honest truth is that PHP needs to fix this in the source. But we have the benefit of a framework here :-)

I found this issue also exists for LazyCollection so I'm going to force push a fix for that and expand the test coverage for the scenario that led me here.

…correctly

Fixes bug where SORT_REGULAR incorrectly deduplicates numeric strings
when a '+' prefix is present. Collections like ['+19495551234',
'9495551234', '19495551234'] now correctly return all three items
instead of incorrectly removing the 11-digit format.

The fix checks if the collection contains complex types (arrays/objects)
that require SORT_REGULAR, or scalar values where default array_unique()
is faster and more correct.

This works around a documented PHP limitation with SORT_REGULAR where
numeric strings are compared as numbers in certain contexts, causing
unexpected deduplication behavior.

See: php/doc-en#1463
Related to: 48a53be
@jmarble jmarble force-pushed the fix-collection-unique-sort-regular branch from 05e9a48 to e9ae5e9 Compare October 22, 2025 00:29
@shaedrich
Contributor

Would it make sense to allow passing the flags to Collection::unique() to override the default behavior?

@jmarble
Author

jmarble commented Oct 22, 2025

Would it make sense to allow passing the flags to Collection::unique() to override the default behavior?

I would argue that it shouldn't be necessary to pass the flag -- wouldn't feel like the "Laravel way."

I think an argument can be made that Collection::unique() should be strict by default (I'm sure @nunomaduro would agree haha). Yet, Collection::unique() aligns more closely with the expected behavior of in_array(), including the optional strict parameter.

I have to admit that even though my fix improves performance and passes all tests -- it breaks from the accepted behavior of Collection::unique().

An alternative could be this, which achieves an overall improvement of ~3x for strict and keeps the SORT_REGULAR for default:

    public function unique($key = null, $strict = false)
    {
        if ($key === null && $strict === false) {
            // Default (loose) mode: keep the existing SORT_REGULAR behavior.
            return new static(array_unique($this->items, SORT_REGULAR));
        } elseif ($key === null && $strict === true) {
            // Strict mode: look for any array/object item.
            $hasComplexItems = false;
            foreach ($this->items as $item) {
                if (! is_scalar($item) && $item !== null) {
                    $hasComplexItems = true;
                    break;
                }
            }

            // Scalars and nulls only: take the fast array_unique() path.
            if (! $hasComplexItems) {
                return new static(array_unique($this->items, SORT_STRING));
            }
        }

        // Fallback: a key was given, or strict mode with complex items.
        $callback = $this->valueRetriever($key);
        $exists = [];

        return $this->reject(function ($item, $key) use ($callback, $strict, &$exists) {
            if (in_array($id = $callback($item, $key), $exists, $strict)) {
                return true;
            }

            $exists[] = $id;
        });
    }
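One thing that may be worth checking in the strict branch above: SORT_STRING compares string casts, whereas the in_array(..., true) path compares with ===, so the two can disagree on mixed scalar types. A standalone sketch, using plain arrays and array_filter() as a stand-in for the Collection's reject():

```php
<?php

$items = [1, '1', 2];

// SORT_STRING compares (string) casts, so int 1 and string '1' collapse.
$byString = array_unique($items, SORT_STRING);

// A strict in_array() check compares with ===, so both are kept.
$seen = [];
$byIdentity = array_filter($items, function ($item) use (&$seen) {
    if (in_array($item, $seen, true)) {
        return false;
    }

    $seen[] = $item;

    return true;
});

printf("SORT_STRING: %d items, strict in_array: %d items\n", count($byString), count($byIdentity));
```

For collections that hold only strings, such as the phone numbers in this PR, both approaches should give the same result.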

@jmarble
Author

jmarble commented Oct 22, 2025

I want to put this one to rest. I think SORT_REGULAR needs to be taken behind the woodshed. Anyone willing to help me find a performant solution for multidimensional arrays?

@taylorotwell
Member

I doubt we change the behavior of unique() in any way on a patch release.

@jmarble jmarble deleted the fix-collection-unique-sort-regular branch October 22, 2025 22:16