Search should split crate names on Underscore when indexing #1549
Comments
@sgrif @FreeMasen Wouldn't this be covered by #1560?
#1560 doesn't really affect this, as it just affects what's included in the results at all --
Splitting by underscores also won't do anything here; PG already handles underscores the way you'd expect. Sorting by relevance uses the PG full text search ranking functions with the weights we provided (name is A, keywords is B, description is C, readme is D). We don't override the numerical weights used, so those will get weighted as 1.0, 0.4, 0.2, 0.1. You can read the details of how the matching works at https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-RANKING, but the short version is that it will rank based on how frequently the search term appears, and how far apart the appearances of those terms are.
The issue here isn't that we're doing anything to make the
Compare this to the first result, which has this in its index:
We could play with the normalization options, but most of them just push
Sorting by "popularity" is a completely different topic, and a much more difficult one to solve. I won't go into all the ideas/problems again here, but ultimately the "relevance" sorting seems to be working as intended. At the very least there's nothing to be gained by splitting on underscores, so I'm going to close this.
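To make the weighting described in this comment a bit more concrete, here is a toy Rust sketch. It is not PostgreSQL's `ts_rank` formula: it ignores stemming, term proximity, and normalization, and simply adds up the label weight (1.0, 0.4, 0.2, 0.1 for A to D) for every occurrence of the term. The names `Weight` and `toy_score` are hypothetical and do not come from crates.io or PostgreSQL.

```rust
// Toy model only: this is NOT PostgreSQL's ts_rank implementation.
// It shows how per-field weight labels turn term hits into a score.

/// Weight labels, mirroring the A..D labels mentioned in the comment.
#[derive(Clone, Copy)]
enum Weight {
    A,
    B,
    C,
    D,
}

impl Weight {
    /// Numeric weights as quoted in the comment: 1.0, 0.4, 0.2, 0.1.
    fn value(self) -> f32 {
        match self {
            Weight::A => 1.0,
            Weight::B => 0.4,
            Weight::C => 0.2,
            Weight::D => 0.1,
        }
    }
}

/// Hypothetical scoring function: for each field, count exact
/// (case-insensitive) token matches and multiply by the field's weight.
/// Tokens are split on any non-alphanumeric character, so `lazy_static`
/// yields `lazy` and `static`.
fn toy_score(term: &str, fields: &[(Weight, &str)]) -> f32 {
    fields
        .iter()
        .map(|(weight, text)| {
            let hits = text
                .split(|c: char| !c.is_alphanumeric())
                .filter(|tok| tok.eq_ignore_ascii_case(term))
                .count();
            weight.value() * hits as f32
        })
        .sum()
}
```

In this simplified model, a single hit in the name (label A) contributes as much as ten hits in a field labelled D. The real ranking functions also take the distance between matched terms into account, so a sketch like this can only hint at why a name match alone does not guarantee the top spot.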
Looking at the search results for `static` (https://crates.io/search?q=static), you can see that the term is not matching `lazy_static`'s crate name, down-ranking the crate to the end of the page, although it's the most popular static-related crate by far.

Looking at the search results for `lazy_static` (https://crates.io/search?q=lazy_static), it looks like we already split the search terms on underscores.

I think we should do the same when indexing crate names, splitting on underscores, so that a `static` query better matches `lazy_static`.

WDYT?
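For illustration only, the proposal could be sketched in Rust roughly as follows. The function `name_index_tokens` is hypothetical and is not part of the crates.io codebase; the actual index is built by PostgreSQL full text search, which, per the maintainer's reply in the comments, already handles underscores.

```rust
/// Hypothetical helper: the tokens that would be indexed for a crate name
/// if names were split on underscores. The full name is kept as the first
/// token so that exact-name queries still match.
fn name_index_tokens(name: &str) -> Vec<String> {
    let full = name.to_lowercase();
    let mut tokens = vec![full.clone()];
    // Add each non-empty underscore-separated part, skipping duplicates
    // (for example, a name without underscores equals its only part).
    for part in full.split('_').filter(|p| !p.is_empty()) {
        if !tokens.iter().any(|t| t.as_str() == part) {
            tokens.push(part.to_string());
        }
    }
    tokens
}
```

With this sketch, `lazy_static` would be indexed under `lazy_static`, `lazy`, and `static`, so a query for `static` would hit the name field directly.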