vector search endpoint #1827
learning_resources_search/views.py
Outdated
    )
    @extend_schema(summary="Vector Search")
    def get(self, request):
        request_data = LearningResourcesSearchRequestSerializer(data=request.GET)
This should have a separate serializer. Otherwise the OpenAPI spec shows this endpoint supporting a bunch of options that are either OpenSearch-specific, such as search mode, or just not implemented yet.
Also, can this be a V0 API for now? The V1 APIs are supposed to be stable, well documented, and usable by an outside project. It can be moved to V1 once we build it out more.
Good point. I moved this to the V0 API and created a separate serializer for vector results that only exposes the "q", "limit", and "offset" params.
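For readers following along, here is a rough sketch of what a request surface limited to those three params could look like. It is written in plain Python rather than as the Django REST Framework serializer the PR actually adds, and the names `VectorSearchParams` and `parse_vector_search_params`, the default values, and the choice to reject unknown params are all assumptions made for illustration.

```python
from dataclasses import dataclass

# The three params named in the reply above; everything else is rejected here.
ALLOWED_PARAMS = {"q", "limit", "offset"}


@dataclass(frozen=True)
class VectorSearchParams:
    """Hypothetical container for the validated query parameters."""

    q: str = ""
    limit: int = 10  # assumed default, not taken from the PR
    offset: int = 0  # assumed default, not taken from the PR


def parse_vector_search_params(query: dict) -> VectorSearchParams:
    """Validate a flat query dict such as one built from request.GET."""
    unknown = set(query) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    limit = int(query.get("limit", 10))
    offset = int(query.get("offset", 0))
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset must be >= 0")
    return VectorSearchParams(q=str(query.get("q", "")), limit=limit, offset=offset)
```

Rejecting unknown params is stricter than a typical serializer, which usually ignores fields it does not declare. The point of the sketch is only that options such as search mode never appear in the accepted set.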
learning_resources_search/api.py
Outdated
        hits = [hit.metadata for hit in search_result]
    else:
        results = LearningResource.objects.for_search_serialization().all()
        hits = serialize_bulk_learning_resources([resource.id for resource in results])
While we are testing this, maybe it would be nicer to return an abbreviated resource response so that it's easier to scan through the results and evaluate their quality. Something like returning just the title, description, and platform for each resource.
For now I have reduced it to return just the id, title, description, resource_type, platform, and readable_id.
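As an illustration of that trimming step, a minimal sketch follows. It assumes each hit is available as a plain dict and uses a hypothetical helper name, `abbreviate_resource`. The PR itself does this with a dedicated serializer, so treat this as the idea rather than the implementation.

```python
# Field names are the ones listed in the reply above.
ABBREVIATED_FIELDS = (
    "id",
    "title",
    "description",
    "resource_type",
    "platform",
    "readable_id",
)


def abbreviate_resource(resource: dict) -> dict:
    """Keep only the fields that make results easy to scan; missing keys become None."""
    return {field: resource.get(field) for field in ABBREVIATED_FIELDS}


def abbreviate_hits(hits: list) -> list:
    """Apply the trimming to every hit in a result page."""
    return [abbreviate_resource(hit) for hit in hits]
```

Anything not in `ABBREVIATED_FIELDS`, such as run or topic details, is dropped, which keeps each result short enough to judge relevance at a glance.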
@@ -469,6 +469,11 @@ class LearningResourcesSearchRequestSerializer(SearchRequestSerializer):
    )


class LearningResourcesVectorSearchRequestSerializer(SearchRequestSerializer):
This shouldn't inherit from SearchRequestSerializer. The schema at http://api.open.odl.local:8063/api/v0/schema/redoc/#tag/learning_resources_vector_search/operation/learning_resources_vector_search_retrieve is still showing a bunch of filters that are not yet implemented, and some that will never be implemented, such as dev_mode.
👍 Fixed
lgtm
* adding initial vector search view
* adding working vector results endpoint
* regenerate openapi spec
* fixing format of returned results
* adding test
* patching qdrant client
* moving to v0 api
* switch to custom serializer for vector search
* fix v0 url
* using minimal serializer
* returning minimal response for vector results
* regenerate spec
* adding some other useful bits to response
* fixing response for empty query and adjusting test
* regenerate spec
* uninheriting from searchrequest serializer
* updating oai spec
* updating oai spec
What are the relevant tickets?
Closes https://github.com/mitodl/hq/issues/6077
Description (What does it do?)
This PR adds a learning resource search endpoint that retrieves results from Qdrant using vector search instead of OpenSearch. It was originally proposed at /api/v1/learning_resources_vector_search/ and was moved to the V0 API during review.
How can this be tested?
python manage.py generate_embeddings --all --skip-contentfiles
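Once embeddings are generated, the endpoint can be queried with the "q", "limit", and "offset" params discussed in the review. The sketch below only builds the request URL. The host is the local one that appears in the review thread, while the V0 path is an assumption (the final route is not shown here), so adjust `ENDPOINT_PATH` to match the generated OpenAPI spec.

```python
from urllib.parse import urlencode

# Local host taken from the schema link in the review thread.
BASE_URL = "http://api.open.odl.local:8063"
# Assumed V0 route; the final path is not shown in this thread.
ENDPOINT_PATH = "/api/v0/learning_resources_vector_search/"


def build_vector_search_url(q: str, limit: int = 10, offset: int = 0) -> str:
    """Return a GET URL carrying only the three supported query params."""
    query = urlencode({"q": q, "limit": limit, "offset": offset})
    return f"{BASE_URL}{ENDPOINT_PATH}?{query}"


if __name__ == "__main__":
    print(build_vector_search_url("machine learning", limit=5))
```

Opening the printed URL in a browser, or passing it to any HTTP client, should return the abbreviated results for the query if the local stack is running.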
Additional Context