Skip to content

Further address performance regression in search #1749

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 21, 2019

Conversation

sgrif
Copy link
Contributor

@sgrif sgrif commented May 20, 2019

Even after #1746, we're still seeing a performance issue with search in
production. Now it's limited to searches that are a single letter, or
short 2 letter words like 'do'. It's caused by any search that would
cause PG to warn that the query contains only stopwords. It appears the
code path taken when plainto_tsquery returns an empty query is
substantially slower than it would be otherwise, even if the query
contains stopwords.

The reason this has started causing problems now is that #1560 caused
the query to change from performing a nested loop join to a hash join.
Due to what appears to be a bug in PG, plainto_tsquery is getting
called once per row when a hash join is performed. When the query is
passed as the only argument, the function is declared as STABLE,
meaning that within a single statement it will always return the same
result for the same arguments, so PG should only be calling it once (or
at least only a handful of times).

There's a second form available where you explicitly pass the language
as an argument. This form is marked as IMMUTABLE, so the query planner
will just replace the call to the function with its results.

Unfortunately, PG is picky about how we pass the language. It doesn't
consider a cast from text to regconfig to be IMMUTABLE, only
STABLE (which is valid, since it's based on a pg_catalog lookup --
The fact that it accepts a string literal as IMMUTABLE actually seems
wrong). The actual value is the OID of the row in pg_ts_config, which
is not stable. Since regconfig('english'::text) is not considered
IMMUTABLE, we just need to embed it as a string literal instead.

sgrif added 3 commits May 20, 2019 14:32
Even after rust-lang#1746, we're still seeing a performance issue with search in
production. Now it's limited to searches that are a single letter, or
short 2 letter words like 'do'. It's caused by any search that would
cause PG to warn that the query contains only stopwords. It appears the
code path taken when `plainto_tsquery` returns an empty query is
substantially slower than it would be otherwise, even if the query
contains stopwords.

The reason this has started causing problems now is that rust-lang#1560 caused
the query to change from performing a nested loop join to a hash join.
Due to what appears to be a bug in PG, `plainto_tsquery` is getting
called once per row when a hash join is performed. When the query is
passed as the only argument, the function is declared as `STABLE`,
meaning that within a single statement it will always return the same
result for the same arguments, so PG should only be calling it once (or
at least only a handful of times).

There's a second form available where you explicitly pass the language
as an argument. This form is marked as `IMMUTABLE`, so the query planner
will just replace the call to the function with its results.

Unfortunately, PG is picky about how we pass the language. It doesn't
consider a cast from `text` to `regconfig` to be `IMMUTABLE`, only
`STABLE` (which is valid, since it's based on a `pg_catalog` lookup --
The fact that it accepts a string literal as `IMMUTABLE` actually seems
wrong). The actual value is the OID of the row in `pg_ts_config`, which
is *not* stable. Since `regconfig('english'::text)` is not considered
`IMMUTABLE`, we just need to embed it as a string literal instead.
@pietroalbini
Copy link
Member

Seems fine to me. Not really familiar with that part of PostgreSQL but the code makes sense.

@sgrif
Copy link
Contributor Author

sgrif commented May 21, 2019

@bors r=pietroalbini

@bors
Copy link
Contributor

bors commented May 21, 2019

📌 Commit 8f14232 has been approved by pietroalbini

bors added a commit that referenced this pull request May 21, 2019
Further address performance regression in search

Even after #1746, we're still seeing a performance issue with search in
production. Now it's limited to searches that are a single letter, or
short 2 letter words like 'do'. It's caused by any search that would
cause PG to warn that the query contains only stopwords. It appears the
code path taken when `plainto_tsquery` returns an empty query is
substantially slower than it would be otherwise, even if the query
contains stopwords.

The reason this has started causing problems now is that #1560 caused
the query to change from performing a nested loop join to a hash join.
Due to what appears to be a bug in PG, `plainto_tsquery` is getting
called once per row when a hash join is performed. When the query is
passed as the only argument, the function is declared as `STABLE`,
meaning that within a single statement it will always return the same
result for the same arguments, so PG should only be calling it once (or
at least only a handful of times).

There's a second form available where you explicitly pass the language
as an argument. This form is marked as `IMMUTABLE`, so the query planner
will just replace the call to the function with its results.

Unfortunately, PG is picky about how we pass the language. It doesn't
consider a cast from `text` to `regconfig` to be `IMMUTABLE`, only
`STABLE` (which is valid, since it's based on a `pg_catalog` lookup --
The fact that it accepts a string literal as `IMMUTABLE` actually seems
wrong). The actual value is the OID of the row in `pg_ts_config`, which
is *not* stable. Since `regconfig('english'::text)` is not considered
`IMMUTABLE`, we just need to embed it as a string literal instead.
@bors
Copy link
Contributor

bors commented May 21, 2019

⌛ Testing commit 8f14232 with merge 16e8d71...

@bors
Copy link
Contributor

bors commented May 21, 2019

☀️ Test successful - checks-travis
Approved by: pietroalbini
Pushing 16e8d71 to master...

@bors bors merged commit 8f14232 into rust-lang:master May 21, 2019
@sgrif sgrif deleted the sg-fix-slow-query-more branch August 14, 2019 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants