Implement bulk operations for stores with native batch APIs #79
Conversation
Important: Review skipped. Bot user detected. To trigger a single review, invoke the @coderabbitai review command. This status message can be disabled in the CodeRabbit configuration.

📝 Walkthrough

This pull request implements batch operations across seven key-value store implementations: DynamoDB, Elasticsearch, Memcached, MongoDB, Redis, RocksDB, and Valkey. Each store gains three new methods: `_get_managed_entries`, `_put_managed_entries`, and `_delete_managed_entries`.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant Client as Client Code
participant Store as Store Implementation
participant Backend as Store Backend<br/>(DB/Cache/etc)
rect rgb(200, 220, 255)
Note over Client,Backend: Batch Get Operation
Client->>Store: _get_managed_entries([key1, key2, key3])
Store->>Backend: Bulk fetch (mget/find $in/batch_get_item)
Backend-->>Store: [result1, result2, result3]
Store->>Store: Decode/parse results<br/>maintain input order
Store-->>Client: [ManagedEntry|None, ...]
end
rect rgb(220, 255, 220)
Note over Client,Backend: Batch Put Operation
Client->>Store: _put_managed_entries([key1, key2], [entry1, entry2])
Store->>Store: Construct documents<br/>with TTL if present
Store->>Backend: Bulk write (batch_write_item/bulk/pipeline/WriteBatch)
Backend-->>Store: success
Store-->>Client: void
end
rect rgb(255, 220, 220)
Note over Client,Backend: Batch Delete Operation
Client->>Store: _delete_managed_entries([key1, key2, key3])
Store->>Backend: Bulk delete (batch_write_item/bulk/delete_many)
Backend-->>Store: count_deleted
Store-->>Client: 3 (count of deleted entries)
end
```
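The "maintain input order" step in the batch-get lane above is the part that differs most across backends. As a minimal sketch of that pattern (the `fetch_bulk` call, the compound-key format, and the return type are assumptions for illustration, not code from this PR):

```python
# Illustrative order-preserving batch get: one bulk backend call, results
# re-aligned to the caller's key order, None for keys that were not found.
async def get_many(client, collection: str, keys: list[str]) -> list[dict | None]:
    if not keys:
        return []
    combo_keys = [f"{collection}::{key}" for key in keys]  # assumed compound-key scheme
    # Single bulk round-trip; assume the backend returns a mapping of found keys to documents.
    found: dict[str, dict] = await client.fetch_bulk(combo_keys)
    # Re-align to input order; missing keys become None.
    return [found.get(combo_key) for combo_key in combo_keys]
```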
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Multiple store implementations with heterogeneous bulk API patterns (batch_get_item, mget, find $in, WriteBatch, pipelines) require individual validation. Logic density varies by store; TTL handling, order preservation, and per-store batch limits demand careful verification across all seven files. Consistency in method signatures and return types across implementations aids review but does not reduce the need for store-specific reasoning.
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
…ched, Elasticsearch, and RocksDB stores Co-authored-by: strawgate <[email protected]>
…ilable) Co-authored-by: strawgate <[email protected]>
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py (1)
101-116: Fix potential AttributeError when a pre-connected client is provided.

When `client` is passed, `self._raw_client` is never set; `__aenter__` still reads it, which can raise `AttributeError`. Initialize it to `None` and guard with `is not None`.

```diff
@@
     ) -> None:
         """Initialize the DynamoDB store.
@@
-        self._table_name = table_name
+        self._table_name = table_name
+        self._raw_client = None  # always defined; only set when creating our own client
         if client:
             self._client = client
         else:
@@
     async def __aenter__(self) -> Self:
-        if self._raw_client:
+        if self._raw_client is not None:
             self._client = await self._raw_client.__aenter__()
         await super().__aenter__()
         return self
```

Also applies to: 118-123
🧹 Nitpick comments (7)
key-value/key-value-aio/src/key_value/aio/stores/valkey/store.py (1)
130-142: Clarify or implement the pipelining mentioned in the comment.

The comment on line 137 mentions "BaseClient supports pipelining through transaction," but the implementation awaits each `set` individually, resulting in separate round-trips per key. This doesn't leverage any pipelining optimization.

Consider one of the following:

- Remove the misleading comment if pipelining isn't beneficial here due to per-key TTL requirements:

```diff
-        # Valkey's mset doesn't support per-key TTL, so we need to use a different approach
-        # We'll use a pipeline-like approach with individual set commands
-        # Note: BaseClient supports pipelining through transaction
+        # Valkey's mset doesn't support per-key TTL, so we use individual set commands
         for key, managed_entry in zip(keys, managed_entries, strict=True):
```

- Implement actual pipelining if Glide's `BaseClient` supports batching multiple commands (e.g., via transactions or pipelines), which would reduce network round-trips even with individual SET commands:

```python
# Example (syntax depends on Glide client API):
async with self._client.pipeline() as pipe:
    for key, managed_entry in zip(keys, managed_entries, strict=True):
        combo_key: str = compound_key(collection=collection, key=key)
        json_value: str = managed_entry.to_json()
        expiry: ExpirySet | None = ExpirySet(expiry_type=ExpiryType.SEC, value=int(managed_entry.ttl)) if managed_entry.ttl else None
        pipe.set(key=combo_key, value=json_value, expiry=expiry)
    await pipe.execute()
```

key-value/key-value-aio/src/key_value/aio/stores/redis/store.py (1)
134-155: Efficient bulk write implementation with Redis pipeline.

The pipeline-based approach correctly batches SET/SETEX operations, reducing network overhead. The `strict=True` argument in zip (line 144) provides good validation, and TTL handling is consistent with the single-entry method.

Optional: Consider type hint consistency for the keys parameter.

For consistency with the `managed_entries` parameter, which uses `Sequence[ManagedEntry]`, you could change `keys: list[str]` to `keys: Sequence[str]` (applies to all three bulk methods). This would make the API more flexible for callers.

```diff
-    async def _put_managed_entries(self, *, collection: str, keys: list[str], managed_entries: Sequence[ManagedEntry]) -> None:
+    async def _put_managed_entries(self, *, collection: str, keys: Sequence[str], managed_entries: Sequence[ManagedEntry]) -> None:
```

key-value/key-value-aio/src/key_value/aio/stores/rocksdb/store.py (1)
130-146: LGTM! Consider moving WriteBatch import to module level.

The batch write implementation is correct and leverages RocksDB's `WriteBatch` for atomic operations. The use of `zip(..., strict=True)` ensures the keys and managed_entries sequences match in length, which is a good safety measure.

For consistency with the `Rdict` import pattern (lines 13-16), consider moving the `WriteBatch` import to the module-level try-except block:

```diff
 try:
-    from rocksdict import Options, Rdict
+    from rocksdict import Options, Rdict, WriteBatch
 except ImportError as e:
```

Then remove the local imports at lines 138-139 and 170.
key-value/key-value-aio/src/key_value/aio/stores/mongodb/store.py (3)
157-176: Add projection and de-duplicate keys; consider chunking large $in queries.

Reduce payload and query size with projection and by de-duplicating input keys while preserving output order.
Apply this minimal diff:
```diff
@@
-        collection = self._sanitize_collection_name(collection=collection)
+        collection = self._sanitize_collection_name(collection=collection)
@@
-        # Use find with $in operator to get multiple documents at once
-        cursor = self._collections_by_name[collection].find(filter={"key": {"$in": keys}})
+        # Use find with $in; de-dupe keys and project only needed fields
+        query_keys = list(dict.fromkeys(keys))
+        cursor = self._collections_by_name[collection].find(
+            filter={"key": {"$in": query_keys}},
+            projection={"key": True, "value": True, "_id": False},
+        )
```

Optional: if keys can be very large, process in chunks (e.g., 1k–5k) to avoid hitting BSON/command size limits; merge results into docs_by_key.
205-236: Use unordered bulk writes to avoid abort-on-first-error; move UpdateOne import to module scope.

Unordered writes keep going if one op fails; moving the import avoids per-call overhead.
Apply this diff in the current block:
```diff
-        _ = await self._collections_by_name[collection].bulk_write(operations)
+        _ = await self._collections_by_name[collection].bulk_write(operations, ordered=False)
```

And adjust imports (outside this block) so UpdateOne is available at module scope:
```diff
@@
-try:
-    from pymongo import AsyncMongoClient
+try:
+    from pymongo import AsyncMongoClient, UpdateOne
@@
-        # Use bulk_write for efficient batch operations
-        from pymongo import UpdateOne
+        # Use bulk_write for efficient batch operations
```

Optional: use $setOnInsert for created_at to avoid overwriting an existing creation timestamp on upsert.
245-254: Guard against oversized $in by chunking deletes for very large key sets.

Single delete_many with a huge $in can hit command size limits; chunk and sum deleted_count.
Example change:
```diff
-        # Use delete_many with $in operator for efficient batch deletion
-        result: DeleteResult = await self._collections_by_name[collection].delete_many(filter={"key": {"$in": keys}})
-        return result.deleted_count
+        # Chunk to avoid command size limits on very large key lists
+        deleted = 0
+        CHUNK = 1000
+        for i in range(0, len(keys), CHUNK):
+            chunk = keys[i : i + CHUNK]
+            result: DeleteResult = await self._collections_by_name[collection].delete_many(filter={"key": {"$in": chunk}})
+            deleted += int(result.deleted_count)
+        return deleted
```

key-value/key-value-aio/src/key_value/aio/stores/memcached/store.py (1)
117-139: Consider using asyncio.gather for concurrent execution.

The comment on line 123 mentions thread-safety concerns, but this is misleading: async operations don't involve threading. Using `asyncio.gather` would enable concurrent execution of the individual `set` operations, improving performance over the current sequential loop:

```python
@override
async def _put_managed_entries(self, *, collection: str, keys: list[str], managed_entries: Sequence[ManagedEntry]) -> None:
    if not keys:
        return

    async def put_single(key: str, managed_entry: ManagedEntry) -> None:
        combo_key: str = self.sanitize_key(compound_key(collection=collection, key=key))
        exptime: int
        if managed_entry.ttl is None:  # noqa: SIM108
            exptime = 0
        else:
            exptime = max(int(managed_entry.ttl), 1)
        json_value: str = managed_entry.to_json()
        _ = await self._client.set(
            key=combo_key.encode(encoding="utf-8"),
            value=json_value.encode(encoding="utf-8"),
            exptime=exptime,
        )

    await asyncio.gather(*(put_single(key, entry) for key, entry in zip(keys, managed_entries, strict=True)))
```

This would leverage network concurrency while maintaining the same semantics.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py (5 hunks)
- key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (4 hunks)
- key-value/key-value-aio/src/key_value/aio/stores/memcached/store.py (3 hunks)
- key-value/key-value-aio/src/key_value/aio/stores/mongodb/store.py (3 hunks)
- key-value/key-value-aio/src/key_value/aio/stores/redis/store.py (3 hunks)
- key-value/key-value-aio/src/key_value/aio/stores/rocksdb/store.py (3 hunks)
- key-value/key-value-aio/src/key_value/aio/stores/valkey/store.py (3 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run Tests
key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py
[error] 207-207: pyright type error: Argument of type "dict[str, dict[str, list[dict[str, dict[str, str]]]]]" cannot be assigned to parameter "RequestItems" of type "Mapping[str, KeysAndAttributesUnionTypeDef]" in function "batch_get_item".
🪛 GitHub Check: static_analysis (key-value/key-value-aio)
key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py
[failure] 207-207:
Argument of type "dict[str, dict[str, list[dict[str, dict[str, str]]]]]" cannot be assigned to parameter "RequestItems" of type "Mapping[str, KeysAndAttributesUnionTypeDef]" in function "batch_get_item"
"dict[str, dict[str, list[dict[str, dict[str, str]]]]]" is not assignable to "Mapping[str, KeysAndAttributesUnionTypeDef]"
Type parameter "_VT_co@Mapping" is covariant, but "dict[str, list[dict[str, dict[str, str]]]]" is not a subtype of "KeysAndAttributesUnionTypeDef"
Type "dict[str, list[dict[str, dict[str, str]]]]" is not assignable to type "KeysAndAttributesUnionTypeDef"
"dict[str, list[dict[str, dict[str, str]]]]" is not assignable to "KeysAndAttributesTypeDef"
"dict[str, list[dict[str, dict[str, str]]]]" is not assignable to "KeysAndAttributesOutputTypeDef" (reportArgumentType)
[failure] 278-278:
Argument type is partially unknown
Argument corresponds to parameter "RequestItems" in function "batch_write_item"
Argument type is "dict[str, list[Unknown]]" (reportUnknownArgumentType)
[failure] 276-276:
Type of "request_items" is partially unknown
Type of "request_items" is "dict[str, list[Unknown]]" (reportUnknownVariableType)
[failure] 274-274:
Type of "append" is partially unknown
Type of "append" is "(object: Unknown, /) -> None" (reportUnknownMemberType)
[failure] 323-323:
Argument of type "dict[str, list[dict[str, dict[str, dict[str, dict[str, str]]]]]]" cannot be assigned to parameter "RequestItems" of type "Mapping[str, Sequence[WriteRequestUnionTypeDef]]" in function "batch_write_item"
"dict[str, list[dict[str, dict[str, dict[str, dict[str, str]]]]]]" is not assignable to "Mapping[str, Sequence[WriteRequestUnionTypeDef]]"
Type parameter "_VT_co@Mapping" is covariant, but "list[dict[str, dict[str, dict[str, dict[str, str]]]]]" is not a subtype of "Sequence[WriteRequestUnionTypeDef]"
"list[dict[str, dict[str, dict[str, dict[str, str]]]]]" is not assignable to "Sequence[WriteRequestUnionTypeDef]"
Type parameter "_T_co@Sequence" is covariant, but "dict[str, dict[str, dict[str, dict[str, str]]]]" is not a subtype of "WriteRequestUnionTypeDef"
Type "dict[str, dict[str, dict[str, dict[str, str]]]]" is not assignable to type "WriteRequestUnionTypeDef" (reportArgumentType)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
[failure] 299-299:
Argument type is partially unknown
Argument corresponds to parameter "operations" in function "bulk"
Argument type is "list[Unknown]" (reportUnknownArgumentType)
[failure] 297-297:
Type of "append" is partially unknown
Type of "append" is "(object: Unknown, /) -> None" (reportUnknownMemberType)
[failure] 296-296:
Type of "append" is partially unknown
Type of "append" is "(object: Unknown, /) -> None" (reportUnknownMemberType)
[failure] 333-333:
Argument type is partially unknown
Argument corresponds to parameter "operations" in function "bulk"
Argument type is "list[Unknown]" (reportUnknownArgumentType)
[failure] 331-331:
Type of "append" is partially unknown
Type of "append" is "(object: Unknown, /) -> None" (reportUnknownMemberType)
🔇 Additional comments (18)
key-value/key-value-aio/src/key_value/aio/stores/valkey/store.py (3)
1-1: LGTM! The import of `Sequence` is correctly used for the type annotation in `_put_managed_entries`.
95-112: LGTM! The implementation correctly uses Valkey's `mget` for bulk reads, handles empty lists, maintains result order, and properly converts responses to `ManagedEntry` objects or `None`.
149-157: LGTM! The implementation correctly uses Valkey's bulk `delete` operation, handles empty lists, and returns the count of deleted entries.

key-value/key-value-aio/src/key_value/aio/stores/redis/store.py (3)
1-1: LGTM: Import addition supports bulk operation typing.

The `Sequence` import is appropriately used for the `managed_entries` parameter type hint, providing flexibility for callers.
96-112: Excellent bulk read implementation using MGET.

The implementation correctly leverages Redis's native MGET command to fetch multiple keys in a single network round-trip. The empty list guard, compound key construction, and response handling (string conversion to ManagedEntry or None) are all consistent with the single-entry method.
163-172: Efficient bulk delete implementation using Redis DELETE.

The implementation correctly uses Redis's multi-key DELETE command, which atomically removes all specified keys and returns the count of deleted entries. The empty list guard and compound key construction are appropriate.
key-value/key-value-aio/src/key_value/aio/stores/rocksdb/store.py (2)
1-1: LGTM! The `Sequence` import is correctly added to support the type hint for `managed_entries` in the new batch method.
162-186: LGTM! Batch deletes correctly implemented.

The implementation correctly uses `WriteBatch` for atomic batch deletions and is consistent with the single-key `_delete_managed_entry` method. The optimization to skip the write when no deletions occur is good.

Note on performance: The existence checks (lines 175-181) iterate individually through keys, which could be slower for large batches compared to a hypothetical multi_get operation. However, this is unavoidable given that `rocksdict` doesn't provide a multi_get API, as noted in the PR summary.

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (4)
1-1: LGTM: Import addition is correct.

The `Sequence` import is necessary for the type hint in `_put_managed_entries` and is correctly imported from `collections.abc`.
187-236: LGTM: Batch retrieval implementation is correct.

The `_get_managed_entries` method properly:

- Uses Elasticsearch's `mget` API for efficient batch retrieval
- Handles empty input lists
- Preserves input order through dictionary lookup
- Returns `None` for missing, malformed, or invalid documents
- Follows the same data extraction and validation patterns as the single-entry method
269-300: LGTM: Batch indexing implementation is correct.

The `_put_managed_entries` method properly:

- Uses Elasticsearch's `bulk` API for efficient batch indexing
- Handles empty input lists
- Uses `strict=True` in zip to ensure equal-length inputs
- Constructs documents with the same structure and metadata as the single-entry method
- Respects the refresh policy via `_should_refresh_on_put`

Note: The static analysis warnings about "partially unknown" types for the operations list are false positives due to the dynamic nature of Elasticsearch's bulk API.
316-346: LGTM: Batch deletion implementation is correct.

The `_delete_managed_entries` method properly:

- Uses Elasticsearch's `bulk` API for efficient batch deletion
- Handles empty input lists
- Correctly counts successful deletions by checking `result == "deleted"`
- Returns the count of successfully deleted items
- Maintains consistency with the single-entry method by not applying refresh policy
Note: The static analysis warnings are false positives similar to those in
_put_managed_entries.key-value/key-value-aio/src/key_value/aio/stores/mongodb/store.py (2)
1-1: LGTM: correct import choice.Using collections.abc.Sequence is the right runtime import in modern Python.
205-236: The project explicitly requires Python ≥3.10 — zip(strict=True) is valid and already used throughout the codebase.All
pyproject.tomlfiles specifyrequires-python = ">=3.10", which fully supportszip(strict=True)(available since Python 3.10). This pattern is already consistently used across 12+ files in the repository, including base stores, wrappers, and test utilities. No fallback is required.Likely an incorrect or invalid review comment.
key-value/key-value-aio/src/key_value/aio/stores/memcached/store.py (3)
2-2: LGTM!The
Sequenceimport is appropriate for the type hint in_put_managed_entriesand follows Python best practices.
70-89: LGTM! Efficient batch read implementation.The method properly leverages
multi_getfor efficient batch retrieval, handles empty input gracefully, and maintains consistency with the single-key_get_managed_entrymethod.
141-145: No action required—memcached store correctly omits _delete_managed_entries.aiomcache does not provide a built-in "delete multiple keys" API. The PR objectives specify implementing
_delete_managed_entriesfor stores with native batch APIs. Since aiomcache only supports individualdelete(key)operations, the memcached store's omission of_delete_managed_entriesis correct and aligns with the stated PR objectives.key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py (1)
25-27: Batch limits look correct. 100 for BatchGetItem and 25 for BatchWriteItem match DynamoDB limits. No issues.
@copilot it feels like the base store should perhaps have some bulk batch helpers, with each store just defining what batch size it can accept and the base store handling the batching?
Co-authored-by: strawgate <[email protected]>
Good suggestion! I've added a batching helper to the base store. The DynamoDB store has been refactored to use this helper (commit 808a899), replacing its manual batching logic. Other stores can now easily use this helper if they need to batch their bulk operations.
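For reference, a base-store batching helper along the lines discussed here might look like the sketch below. The class and member names (`BatchingBaseSketch`, `max_write_batch_size`, `_put_batch`) are illustrative assumptions, not the helper that was actually added (and later removed):

```python
from collections.abc import Sequence
from typing import TypeVar

T = TypeVar("T")


def chunk(items: Sequence[T], size: int) -> list[Sequence[T]]:
    """Split a sequence into chunks of at most `size` items."""
    return [items[i : i + size] for i in range(0, len(items), size)]


class BatchingBaseSketch:
    """Hypothetical base-store mixin: the store declares a batch size, the base handles chunking."""

    # None means "no limit"; a DynamoDB subclass might set 25 (the BatchWriteItem limit).
    max_write_batch_size: int | None = None

    async def put_many(self, collection: str, keys: Sequence[str], values: Sequence[str]) -> None:
        size = self.max_write_batch_size or len(keys) or 1
        for key_chunk, value_chunk in zip(chunk(keys, size), chunk(values, size), strict=True):
            await self._put_batch(collection, key_chunk, value_chunk)

    async def _put_batch(self, collection: str, keys: Sequence[str], values: Sequence[str]) -> None:
        """Subclasses override this with their native bulk write."""
        raise NotImplementedError
```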
@copilot let's remove any batching implementations for now so no bulk calls for dynamo but let's proceed with the rest
Co-authored-by: strawgate <[email protected]>
Done! Removed the DynamoDB bulk operations and the base-store batching helper. All other stores (Redis, Valkey, MongoDB, Memcached, Elasticsearch, RocksDB) still have their bulk operations intact and tests are passing.
Claude finished @strawgate's task — View job

Completed Code Review Improvements

I've successfully implemented the improvements suggested by the CodeRabbit review, skipping MongoDB and Elasticsearch as you requested.

Summary of Changes

Valkey Store (
Redis Store (
RocksDB Store (
Memcached Store (
Quality Checks
What Was Changed
These improvements enhance code quality, consistency, and performance without changing the external API or breaking existing tests.
- Remove misleading pipelining comment in Valkey store
- Move WriteBatch import to module level in RocksDB store for consistency
- Use asyncio.gather for concurrent operations in Memcached store
- Change keys parameter type from list[str] to Sequence[str] in all stores for consistency with base class

These changes address CodeRabbit review feedback while maintaining functionality.

Co-authored-by: William Easton <[email protected]>
@claude static tests are failing can you fix please
@claude add ttl to _put_managed_entries so that we don't have to inspect a managed entry to get the ttl info
Claude finished @strawgate's task — View job

Completed: Refactored `_put_managed_entries` to take explicit TTL and timestamp parameters (see the commit below).
Instead of extracting TTL and timestamps from managed entries, these values are now passed as explicit parameters to _put_managed_entries. This avoids the need to inspect managed entries to extract metadata.

Changes:
- Base store: Added ttl, created_at, expires_at parameters to the signature
- Base store put_many: Pre-calculates timestamps once for all entries
- Redis store: Uses the ttl parameter instead of extracting it from the first entry
- MongoDB store: Uses the timestamp parameters instead of extracting them from the first entry
- Elasticsearch store: Uses the timestamp parameters instead of extracting them from the first entry
- RocksDB store: Updated signature (stores full JSON, doesn't use the parameters)

Co-authored-by: William Easton <[email protected]>
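Based on that description, the refactored base-store hook presumably looks roughly like the sketch below; the exact parameter types and defaults are assumptions, only the `ttl`, `created_at`, and `expires_at` names come from the commit message:

```python
from collections.abc import Sequence
from datetime import datetime


class BaseStoreSketch:
    # Hedged sketch of the refactored signature: TTL and timestamps arrive as explicit
    # arguments, so implementations no longer inspect the managed entries for metadata.
    async def _put_managed_entries(
        self,
        *,
        collection: str,
        keys: Sequence[str],
        managed_entries: Sequence["ManagedEntry"],  # project type, referenced by name only
        ttl: float | None = None,
        created_at: datetime | None = None,
        expires_at: datetime | None = None,
    ) -> None:
        raise NotImplementedError
```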
…-methods-on-stores
…-methods-on-stores


Overview
This PR implements efficient bulk operations (`_get_managed_entries`, `_put_managed_entries`, `_delete_managed_entries`) for stores that provide native batch APIs. Previously, these stores fell back to the base implementation, which simply loops over individual operations, resulting in significant performance overhead for bulk operations.

Changes
Stores with Bulk API Support
The following stores now leverage their native bulk APIs (a sketch of the Redis write path follows this list):

Redis Store
- `MGET` command for efficient batch retrieval
- Pipelined `SET`/`SETEX` operations for batch writes
- Multi-key `DELETE` command for batch deletes

Valkey Store
- `MGET` command (Redis-compatible)
- Individual `SET` operations (no TTL-aware bulk API available)
- Multi-key `DELETE` command

MongoDB Store
- `find` with `$in` operator for efficient bulk retrieval
- `bulk_write` with `UpdateOne` operations for atomic batch updates
- `delete_many` with `$in` operator

Memcached Store
- `multi_get` command for batch retrieval
- Individual `SET` operations (aiomcache doesn't provide multi_set)

Elasticsearch Store
- `mget` API for efficient multi-document retrieval
- `bulk` API for batch indexing operations
- `bulk` API for batch deletion with proper result counting

RocksDB Store
- `WriteBatch` for atomic batch write operations
- `WriteBatch` for atomic batch delete operations

DynamoDB Store
- No bulk implementation in this PR (removed per review feedback; it continues to use the base per-key operations)
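As a concrete illustration of the write path mentioned for the Redis store, the pipelined bulk put roughly follows the pattern below. This is a sketch only: the key format, TTL handling, and function shape are assumptions rather than the exact PR code.

```python
from collections.abc import Sequence

from redis.asyncio import Redis


async def redis_put_many(client: Redis, collection: str, keys: Sequence[str], json_values: Sequence[str], ttl: float | None) -> None:
    if not keys:
        return
    pipe = client.pipeline()
    for key, json_value in zip(keys, json_values, strict=True):
        combo_key = f"{collection}::{key}"  # assumed compound-key format
        if ttl is None:
            pipe.set(combo_key, json_value)  # plain SET
        else:
            pipe.set(combo_key, json_value, ex=int(ttl))  # SET with expiry (SETEX semantics)
    await pipe.execute()  # all queued commands flushed in a single round-trip
```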
Performance Impact
These changes significantly improve performance for bulk operations by replacing per-key round-trips with a single batched call (or a small number of them) wherever the backend supports it.
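A rough way to observe the difference locally might look like the harness below. The `put`/`put_many` signatures shown are assumptions about the public API; adapt them to the actual store interface, and note that no timing numbers are implied here.

```python
import time


async def compare(store, collection: str, n: int = 1_000) -> None:
    keys = [f"key-{i}" for i in range(n)]
    values = [{"value": i} for i in range(n)]

    start = time.perf_counter()
    for key, value in zip(keys, values, strict=True):
        await store.put(collection=collection, key=key, value=value)  # per-key round-trips
    per_key_seconds = time.perf_counter() - start

    start = time.perf_counter()
    await store.put_many(collection=collection, keys=keys, values=values)  # bulk path
    bulk_seconds = time.perf_counter() - start

    print(f"per-key: {per_key_seconds:.3f}s  bulk: {bulk_seconds:.3f}s")
```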
Testing
All existing tests pass (1,459 passed, 143 skipped). The existing test suite already covers bulk operations through `test_put_many_get_many`, `test_put_many_delete_many`, and related performance tests, so no new tests were required.

Implementation Notes
Fixes #78
Original prompt
Fixes #78
Summary by CodeRabbit
Release Notes