Add ability to clear ALL data associated with an index #179
Changes from all commits
b91e62e
ad9c20e
1002b85
c4d3527
835f5f9
cfccb37
d8ab61a
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,7 +23,8 @@ | |
from redis.commands.search.indexDefinition import IndexDefinition | ||
|
||
from redisvl.index.storage import HashStorage, JsonStorage | ||
from redisvl.query.query import BaseQuery, CountQuery, FilterQuery | ||
from redisvl.query import BaseQuery, CountQuery, FilterQuery | ||
from redisvl.query.filter import FilterExpression | ||
from redisvl.redis.connection import ( | ||
RedisConnectionFactory, | ||
convert_index_info_to_schema, | ||
|
@@ -476,6 +477,26 @@ def delete(self, drop: bool = True): | |
except: | ||
logger.exception("Error while deleting index") | ||
|
||
def clear(self) -> int: | ||
"""Clear all keys in Redis associated with the index, leaving the index | ||
available and in-place for future insertions or updates. | ||
|
||
Returns: | ||
int: Count of records deleted from Redis. | ||
""" | ||
# Track deleted records | ||
total_records_deleted: int = 0 | ||
|
||
# Paginate using queries and delete in batches | ||
for batch in self.paginate( | ||
FilterQuery(FilterExpression("*"), return_fields=["id"]), page_size=500 | ||
Unlikely scenario, but what if I want to get the (some) data out before destroying the index, e.g.
I agree with the notion that someone might want to be able to export data from an index. However, I think that should be its own clear feature. For example, an I think |
||
): | ||
batch_keys = [record["id"] for record in batch] | ||
record_deleted = self._redis_client.delete(*batch_keys) # type: ignore | ||
total_records_deleted += record_deleted # type: ignore | ||
|
||
return total_records_deleted | ||
|
||
def load( | ||
self, | ||
data: Iterable[Any], | ||
|
@@ -894,6 +915,26 @@ async def delete(self, drop: bool = True): | |
logger.exception("Error while deleting index") | ||
raise | ||
|
||
async def clear(self) -> int: | ||
"""Clear all keys in Redis associated with the index, leaving the index | ||
available and in-place for future insertions or updates. | ||
|
||
Returns: | ||
int: Count of records deleted from Redis. | ||
""" | ||
# Track deleted records | ||
total_records_deleted: int = 0 | ||
|
||
# Paginate using queries and delete in batches | ||
async for batch in self.paginate( | ||
FilterQuery(FilterExpression("*"), return_fields=["id"]), page_size=500 | ||
): | ||
batch_keys = [record["id"] for record in batch] | ||
records_deleted = await self._redis_client.delete(*batch_keys) # type: ignore | ||
total_records_deleted += records_deleted # type: ignore | ||
|
||
return total_records_deleted | ||
|
||
async def load( | ||
self, | ||
data: Iterable[Any], | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,9 @@ | ||
from redisvl.query.query import CountQuery, FilterQuery, RangeQuery, VectorQuery | ||
from redisvl.query.query import ( | ||
BaseQuery, | ||
CountQuery, | ||
FilterQuery, | ||
RangeQuery, | ||
VectorQuery, | ||
) | ||
|
||
__all__ = ["VectorQuery", "FilterQuery", "RangeQuery", "CountQuery"] | ||
__all__ = ["BaseQuery", "VectorQuery", "FilterQuery", "RangeQuery", "CountQuery"] |
Maybe we should let users control the batch size, with a default of 500?
I think if someone's goal is just to clear all of the data from an index, we should implement the best practice under the hood and not make the user worry about it? But it certainly wouldn't be hard to add an optional arg later if folks need it!
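For reference, the optional argument discussed in this thread could look roughly like the sketch below. It is not the code in this PR: `FakeRedis`, `batched`, and `clear_keys` are hypothetical names used only for illustration, and the in-memory client stands in for a real Redis connection so the example runs on its own. The only behavior assumed from Redis is that `delete(*keys)` returns the number of keys actually removed.

```python
from typing import Dict, Iterator, List


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (illustration only)."""

    def __init__(self, keys: List[str]) -> None:
        self.store: Dict[str, str] = {key: "value" for key in keys}

    def delete(self, *keys: str) -> int:
        # Mirror the DEL contract: return how many keys were actually removed.
        removed = 0
        for key in keys:
            if self.store.pop(key, None) is not None:
                removed += 1
        return removed


def batched(keys: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `keys`, each at most `batch_size` long."""
    for start in range(0, len(keys), batch_size):
        yield keys[start : start + batch_size]


def clear_keys(client: FakeRedis, keys: List[str], batch_size: int = 500) -> int:
    """Delete `keys` in batches and return the total number deleted.

    `batch_size` defaults to 500, matching the page size used in the diff,
    but callers may override it.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")

    total_records_deleted = 0
    for batch in batched(keys, batch_size):
        total_records_deleted += client.delete(*batch)
    return total_records_deleted
```

With this shape, callers who do not care about batching simply call `clear_keys(client, keys)`, while anyone with unusually large or small records can pass `batch_size` explicitly.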