Adding bloom command meta data, bloom group and bloom data type documentation #233
as well Signed-off-by: zackcam <[email protected]>
zuiderkwast
left a comment
Very interesting!
I skimmed through it very quickly. The documentation itself looks great AFAICT. I can do a more detailed review later.
The commands look very much like built-in commands. It's not mentioned anywhere that it's a separate module that users need to install. I think we should mention it on the bloom filters topic page with a link to the GitHub repo. The BF command pages should link to that topic page, so the pages are all linked together.
To build man pages, the scripts in this repo need to be able to take multiple command JSON files. This needs to be added to the Makefile, the README and maybe the python scripts too. Please try to build the man pages as described in the README of this repo.
|
Many of the spellcheck errors can be fixed simply by writing the command names in backticks. Stuff in backticks is excluded from spellcheck IIRC.
Yes, something like that would be good. In your screenshot it looks like the "Extensions" sub-heading is part of "Module Data Types" though, because of the levels of the headings used. If we do this, then "Module Data Types" should be a level-2 heading and "Bloom Filter" a level-3 heading under it.
How about just mentioning the module within the description? Something like this?
## Bloom Filter
[Bloom filters](bloomfilters.md) provides a space efficient probabilistic data structure that allows checking if an element is a member of a set. False positives are possible, but it guarantees no false negatives.
+Bloom filters are provided by the module `valkey-bloom`.
For more information, see:
* [Overview of Bloom Filters](bloomfilters.md)
* [Bloom filter command reference](../commands/#bloom)
+* [The valkey-bloom module on GitHub](https://github.com/valkey-io/valkey-bloom/) |
|
@zuiderkwast I also wanted to get your input about how we should structure the modules to make it clear they aren't part of the core. In the current structure they are intermingled. I don't really have an opinion yet, but one alternative would be to at least separate them in a separate folder structure and clarify which module they are a part of.
Are you talking about the URLs of the commands? I like that it's a flat structure, just like the commands are in a global flat namespace. The But we should definitely show it in some way. A line somewhere on each command page would be good. I hope we can generate it in some way from an optional key in the command JSON file or something like that.
I don't have a strong preference one way or the other about flat/nested, so sticking with flat is OK for me.
Yeah, I guess immediately let's make sure there is something in the JSON file. Maybe |
madolson
left a comment
Not a super deep review. I think we should indicate more clearly that the commands are from a module and not part of the core. That can maybe come from the JSON docs only, though.
commands/bf.add.md
Outdated
| * key (required) - A Valkey key of Bloom data type | ||
| * item (required) - Item to add |
| * key (required) - A Valkey key of Bloom data type | |
| * item (required) - Item to add |
We typically omit this, since the usage would be included at the top which will indicate if something is required.
Yeah, makes sense. I removed all of these from the bloom commands, and where I think the arguments needed explaining, I updated the heading name.
commands/bf.add.md
Outdated
| @@ -0,0 +1,12 @@ | |||
| Adds an item to a bloom filter, if the specified filter does not exist creates a default bloom filter with that name. | |||
| Adds an item to a bloom filter, if the specified filter does not exist creates a default bloom filter with that name. | |
| Adds an item to a bloom filter, if the specified bloom filter does not exist creates a bloom filter with default configurations with that name. | |
| If you want to create a bloom filter with non-standard options, use the `BF.INSERT` command. |
Updated and made it less wordy as well by removing 'specified' from the description
commands/bf.exists.md
Outdated
| @@ -0,0 +1,16 @@ | |||
| Determines if a specified item has been added to the specified bloom filter. | |||
| Syntax | |||
| Syntax |
commands/bf.info.md
Outdated
| @@ -0,0 +1,35 @@ | |||
| Returns information about a bloomfilter | |||
|
|
|||
| ## Arguments | |||
These need to be kept because they include the info data, but I would change this to be about info fields or something.
commands/bf.info.md
Outdated
| ## Arguments | ||
| * key (required) - A valkey key of bloom data type | ||
| * CAPACITY (optional) - Returns the number of unique items that would need to be added before scaling would happen | ||
| * SIZE (optional) - Returns the memory size which is the number of bytes allocated |
| * SIZE (optional) - Returns the memory size which is the number of bytes allocated | |
| * SIZE (optional) - Returns the number of bytes allocated |
Why waste time say lot word when few word do trick?
topics/data-types.md
Outdated
|
|
||
| ## Bloom Filter | ||
|
|
||
| [Bloom filters](bloomfilters.md) provides a space efficient probabilistic data structure that allows checking if an element is a member of a set. False positives are possible, but it guarantees no false negatives. |
I would translate this to english with an example.
I tried to make this more understandable but I think potentially having what I use in the exists and mexists commands could also work if the new version still isn't great
topics/bloomfilters.md
Outdated
|
|
||
| Bloom filters are a space efficient probabilistic data structure that allows checking whether an element is member of a set. False positives are possible, but it guarantees no false negatives. | ||
|
|
||
| ## Bloom commands |
Our other examples include the "basic commands" up front, and then the more sophisticated commands later. I think we should do the same.
topics/bloomfilters.md
Outdated
|
|
||
| **Financial fraud detection** | ||
|
|
||
| Bloom filters can help answer the question "Has the user paid from this location before?", which can then give insights if there has been suspicious activity in shopping habits. |
Is this a real use case? The false positive here is not ideal, since it might make it seem like a transaction is legitimate when it is not.
Updated this use case to be more about card fraud instead of location based checking
topics/bloomfilters.md
Outdated
|
|
||
| Bloom filters can help answer the question "Has the user paid from this location before?", which can then give insights if there has been suspicious activity in shopping habits. | ||
|
|
||
| For the above each user would have a Bloom filter which is then checked for every transaction. |
Might just merge this into the previous paragraph.
topics/bloomfilters.md
Outdated
|
|
||
| **Check if URL's are malicious** | ||
|
|
||
| Bloom filters can answer the question is a URL malicious. Any URL inputted would be checked against a malicious URL bloom filter. |
| Bloom filters can answer the question is a URL malicious. Any URL inputted would be checked against a malicious URL bloom filter. | |
| Bloom filters can answer the question "is a URL malicious?". Any URL inputted would be checked against a malicious URL bloom filter. |
Not a complete review.
We need to think about what we want regarding
- How to show which module a command belongs to and how to store this in the JSON file(s).
- What to show in the Since fields. If we'll release some valkey-with-modules bundle, then the version number should probably follow valkey's versioning(?).
I think for now we should show the independent modules version number, since we got alignment on that. Internally at AWS we are planning on reviving valkey-io/valkey#408 and posting some suggestions. Once that has alignment, we can maybe add more information about where it's available (i.e. Valkey core since 10.0, valkey-bloom since 1.0) |
…to generate bloom man pages Signed-off-by: zackcam <[email protected]>
commands/bf.add.md
Outdated
| @@ -0,0 +1,12 @@ | |||
| Adds an item to a bloom filter, if the specified bloom filter does not exist creates a bloom filter with default configurations with that name. | |||
| Adds an item to a bloom filter, if the specified bloom filter does not exist creates a bloom filter with default configurations with that name. | |
| Adds a single item to a bloom filter. If the specified bloom filter does not exist, a bloom filter is created with the provided name with default properties. |
commands/bf.add.md
Outdated
| @@ -0,0 +1,12 @@ | |||
| Adds an item to a bloom filter, if the specified bloom filter does not exist creates a bloom filter with default configurations with that name. | |||
|
|
|||
| If you want to create a bloom filter with non-standard options, use the `BF.INSERT` or `BF.RESERVE` command. | |||
By non-standard options, you mean the non default properties. Right?
| If you want to create a bloom filter with non-standard options, use the `BF.INSERT` or `BF.RESERVE` command. | |
| To add multiple items to a bloom filter, you can use the BF.MADD or BF.INSERT commands. | |
| If you want to create a bloom filter with non-default properties, use the `BF.INSERT` or `BF.RESERVE` command. |
Yeah non standard meant non default, but agree makes more sense to say non default and that keeps it consistent
commands/bf.card.md
Outdated
| @@ -0,0 +1,12 @@ | |||
| Gets the cardinality of a Bloom filter - number of items that have been successfully added to a Bloom filter. | |||
| Gets the cardinality of a Bloom filter - number of items that have been successfully added to a Bloom filter. | |
| Returns the cardinality of a Bloom filter which is the number of items that have been successfully added to it. |
commands/bf.card.md
Outdated
| 1 | ||
| 127.0.0.1:6379> BF.CARD key | ||
| 1 | ||
| 127.0.0.1:6379> BF.CARD missing |
| 127.0.0.1:6379> BF.CARD missing | |
| 127.0.0.1:6379> BF.CARD nonexistentkey |
commands/bf.exists.md
Outdated
| @@ -0,0 +1,19 @@ | |||
| Determines if an item has been added to the bloom filter. | |||
| Determines if an item has been added to the bloom filter. | |
| Determines if an item has been added to the bloom filter previously. |
|
I only reviewed the Command Documentation. I will need to review the remaining sections next |
Signed-off-by: zackcam <[email protected]>
section to bloomfilter topic, cleaned up other bloomfilter topic sections
Making changes based on review comments Co-authored-by: KarthikSubbarao <[email protected]> Signed-off-by: zackcam <[email protected]>
| @@ -0,0 +1,14 @@ | |||
| Adds a single item to a bloom filter. If the specified bloom filter does not exist, a bloom filter is created with the provided name with default properties. | |||
|
|
|||
| To add multiple items to a bloom filter, you can use the `BF.MADD` or `BF.INSERT` commands. | |||
This comment was marked as resolved.
I think it is more grammatically correct to have the s, as it is showing there are multiple commands that can do this, not just one
topics/bloomfilters.md
Outdated
|
|
||
| We have implemented `VALIDATESCALETO` as an optional arg of `BF.INSERT` to help determine whether the bloom filter can scale out to the reach the specified capacity without hitting either limits mentioned above. It will reject the command otherwise. | ||
|
|
||
| As seen below, when trying to create a bloom filter with a capacity that cannot be achieved through scale outs (given the memory limits), the command is rejected. However, if the capacity can be achieved through scale out (even with the limits) then the creation of the bloom filter will succeed. |
| As seen below, when trying to create a bloom filter with a capacity that cannot be achieved through scale outs (given the memory limits), the command is rejected. However, if the capacity can be achieved through scale out (even with the limits) then the creation of the bloom filter will succeed. | |
| As seen below, when trying to create a bloom filter with a capacity that cannot be achieved through scale outs (given the memory limits), the command is rejected. However, if the capacity can be achieved through scale out (even with the limits), the creation of the bloom filter will succeed. |
topics/bloomfilters.md
Outdated
|
|
||
| ## Performance | ||
|
|
||
| The bloom commands which involve adding items or checking the existence of items have a time complexity of O(n * k) where n is the number of hash functions used by the bloom filter and k is the number of elements being inserted. This means that both BF.ADD and BF.EXISTS are both O(n) as they only operate on one item. |
| The bloom commands which involve adding items or checking the existence of items have a time complexity of O(n * k) where n is the number of hash functions used by the bloom filter and k is the number of elements being inserted. This means that both BF.ADD and BF.EXISTS are both O(n) as they only operate on one item. | |
| The bloom commands which involve adding items or checking the existence of items have a time complexity of O(N * K) where N is the number of hash functions used by the bloom filter and K is the number of elements being inserted. This means that both BF.ADD and BF.EXISTS are both O(N) as they only operate on one item. |
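For readers following the complexity wording, here is a toy sketch of why the cost is O(N * K). It is not the valkey-bloom implementation: the class name `SketchBloomFilter`, the SHA-256 based hashing, and the `probes` counter are all assumptions made for illustration. Each item is hashed N times, so adding K items costs N * K hash evaluations, and a single `add` or `exists` costs at most N.

```python
import hashlib


class SketchBloomFilter:
    """Toy fixed-size bloom filter (illustrative only, not valkey-bloom)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes      # N in the O(N * K) wording
        self.bits = bytearray(num_bits)   # one byte per bit, for readability
        self.probes = 0                   # counts hash evaluations

    def _positions(self, item: str):
        # Derive N bit positions by salting one hash with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            self.probes += 1
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def exists(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


bf = SketchBloomFilter(num_bits=1024, num_hashes=4)
for item in ["a", "b", "c"]:   # K = 3 items
    bf.add(item)
print(bf.probes)               # 12, i.e. N * K = 4 * 3
print(bf.exists("a"))          # True: added items are never reported missing
```

The counter is only there to make the N * K relationship visible; a real filter would not track it.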
commands/bf.insert.md
Outdated
| * EXPANSION *expansion* - This option will specify the bloom filter as scaling and controls the size of the sub filter that will be created upon scale out / expansion of the bloom filter. | ||
| * NOCREATE - Will not create the bloom filter and add items if the filter does not exist already. | ||
| * TIGHTENING *tightening_ratio* - The tightening ratio for the bloom filter. | ||
| * SEED *seed* - The seed the hash functions will use. |
| * SEED *seed* - The seed the hash functions will use. | |
| * SEED *seed* - The 32 byte seed the bloom filter's hash functions will use. |
topics/bloomfilters.md
Outdated
| [] | ||
| ``` | ||
|
|
||
| We can use the `BF.INFO` command's `MAXSCALEDCAPACITY` field to find out the maximum capacity that the scalable bloom filter can expand to hold. |
| We can use the `BF.INFO` command's `MAXSCALEDCAPACITY` field to find out the maximum capacity that the scalable bloom filter can expand to hold. | |
| The `BF.INFO` command's `MAXSCALEDCAPACITY` field can be used to find out the maximum capacity that the scalable bloom filter can expand to hold. |
topics/bloomfilters.md
Outdated
|
|
||
| Bloom filters can be used to answer the question, "Has this card been flagged as stolen?". To do this, use a bloom filter that contains cards reported as stolen. When a card is used, check whether it is present in the bloom filter. If the card is not found, it means it is not marked as stolen. If the card is present in the filter, a check can be made against the main database, or the purchase can be denied. | ||
|
|
||
| ### Ad placement / Deduplication |
Renaming usecase
| ### Ad placement / Deduplication | |
| ### Advertisement / Campaign placement and deduplication |
Also, let's make this the first section in the list of use cases. Fraud detection can be second.
topics/bloomfilters.md
Outdated
| Bloom filters can help advertisers answer the following questions: | ||
| * Has the user already seen this ad? | ||
| * Has the user already purchased this product? | ||
|
|
||
| For each user, use a Bloom filter to store all the products they have purchased. The recommendation engine can then suggest a new product and check if it is present in the user's Bloom filter. | ||
|
|
||
| * If the product is not in the filter, the ad is shown to the user, and the product is added to the filter. | ||
| * If the product is already in the filter, it means the ad has already been shown to the user and the recommendation engine finds a different ad to show. |
| Bloom filters can help advertisers answer the following questions: | |
| * Has the user already seen this ad? | |
| * Has the user already purchased this product? | |
| For each user, use a Bloom filter to store all the products they have purchased. The recommendation engine can then suggest a new product and check if it is present in the user's Bloom filter. | |
| * If the product is not in the filter, the ad is shown to the user, and the product is added to the filter. | |
| * If the product is already in the filter, it means the ad has already been shown to the user and the recommendation engine finds a different ad to show. | |
| Bloom filters can help e-commerce sites, streaming services, advertising networks, or marketing platforms answer the following questions: | |
| * Has an advertisement already been shown to a user? | |
| * Has a promotional email or notification already been sent to a user? | |
| * Has a product already been purchased by a user? | |
| Example: For each user, use a Bloom filter to store all the products they have purchased. The recommendation engine can then suggest a new product and check if it is present in the user's Bloom filter. | |
| * If the product is not in the filter, the ad is shown to the user, and the product is added to the filter. | |
| * If the product is already in the filter, it means the ad has already been shown to the user and the recommendation engine finds a different ad to show. |
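The flow in this example could be sketched roughly as follows. Everything here is hypothetical: `ExactFilter` is a set-backed stand-in that only mimics an add/exists interface (a real bloom filter would occasionally report a false positive, which in this flow just means an ad is skipped), and `next_ad` is an illustrative helper, not part of any Valkey client.

```python
from typing import Iterable, Optional


class ExactFilter:
    """Set-backed stand-in with a bloom-filter-like add/exists interface."""

    def __init__(self) -> None:
        self._items = set()

    def add(self, item: str) -> None:
        self._items.add(item)

    def exists(self, item: str) -> bool:
        return item in self._items


def next_ad(user_filter: ExactFilter, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate not yet in the user's filter and record it."""
    for product in candidates:
        if not user_filter.exists(product):
            user_filter.add(product)   # remember that this ad was shown
            return product
    return None                        # every candidate was already shown


seen = ExactFilter()
print(next_ad(seen, ["shoes", "hat"]))  # shoes
print(next_ad(seen, ["shoes", "hat"]))  # hat
print(next_ad(seen, ["shoes", "hat"]))  # None
```

With the module loaded, the two filter calls would presumably map to `BF.EXISTS` and `BF.ADD` on a per-user key.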
topics/bloomfilters.md
Outdated
| * If the product is not in the filter, the ad is shown to the user, and the product is added to the filter. | ||
| * If the product is already in the filter, it means the ad has already been shown to the user and the recommendation engine finds a different ad to show. | ||
|
|
||
| ### Check if URL's are malicious |
Above this, we could add a use case for "Filtering out Spam/Harmful Content". Or you can combine both by making this title generic and listing both usecases below
topics/bloomfilters.md
Outdated
|
|
||
| The bloom commands which involve adding items or checking the existence of items have a time complexity of O(n * k) where n is the number of hash functions used by the bloom filter and k is the number of elements being inserted. This means that both BF.ADD and BF.EXISTS are both O(n) as they only operate on one item. | ||
|
|
||
| Since performance relies on the number of hash functions, choosing the correct capacity and expansion rate can be important. In case of scalable bloom filters, with every scale out, we increase the number of checks (using hash functions of each sub filter) performed during any add / exists operation. For this reason, it is recommended that users choose a capacity after evaluating the use case / workload to avoid several scale outs and reduce the number of checks. |
| Since performance relies on the number of hash functions, choosing the correct capacity and expansion rate can be important. In case of scalable bloom filters, with every scale out, we increase the number of checks (using hash functions of each sub filter) performed during any add / exists operation. For this reason, it is recommended that users choose a capacity after evaluating the use case / workload to avoid several scale outs and reduce the number of checks. | |
| In case of scalable bloom filters, with every scale out, we increase the number of checks (using hash functions of each sub filter) performed during any add / exists operation. For this reason, it is recommended that users choose a capacity and expansion rate after evaluating the use case / workload to avoid several scale outs and reduce the number of checks. |
432ed3d to 4be2099
KarthikSubbarao
left a comment
Thank you for the changes and details @zackcam .
Approved
Signed-off-by: zackcam <[email protected]>
hpatro
left a comment
nit picks.
Co-authored-by: Harkrishn Patro <[email protected]> Signed-off-by: zackcam <[email protected]>
|
I don't have write permissions to this repository, @zuiderkwast or @madolson could one of you help review and close this out? Once this is in, we could get the website PR closed and verify the changes and do the same activity for the JSON changes. Thanks. |
commands/bf.card.md
Outdated
|
|
||
| ``` | ||
| 127.0.0.1:6379> BF.ADD key val | ||
| 1 |
Is this a string or an integer response? This implies simple string, which seems odd.
It's an integer. Copying over the response I got now.
commands/bf.card.md
Outdated
| 1 | ||
| 127.0.0.1:6379> BF.CARD nonexistentkey | ||
| 0 | ||
| ``` No newline at end of file |
Add a trailing new line
| ``` | ||
| ``` | ||
| 127.0.0.1:6379> BF.INSERT key NOCREATE ITEMS item1 item2 | ||
| (error) ERR not found |
This is not a very good error, is it too late to make it better?
This error message is from existing API and error messages from existing client libraries that support bloom filters
We followed the existing error messages to be API compatible with the bloom filter commands of existing client libraries
topics/bloomfilters.md
Outdated
|
|
||
| These are the default bloom properties along with the commands and configs which allow customizing. | ||
|
|
||
| <table width="100%" border="1" style="border-collapse: collapse; border: 1px solid black" cellpadding="8"> |
I think this will be wonky whenever we end up putting it in a man page. I think we should keep it as vanilla markdown.
topics/bloomfilters.md
Outdated
|
|
||
| * `bf_bloom_defrag_misses`: Total number of defrag misses that have occurred on bloom filters. | ||
|
|
||
| ## Limits |
These are more of configs as opposed to limits.
In this document, since we already do list the configs in a section above, how about naming this section as "Large Bloom Filters"? Or "Handling Large Bloom Filters"?
topics/bloomfilters.md
Outdated
| When a bloom filter scales out, a new sub filter is added. The limit on the number of sub filters depends on the false positive rate and tightening ratio. Each sub filter has a stricter false positive, and this is controlled by the tightening ratio. If a command attempting a scale out results in the sub filter reaching a false positive of 0, the command is rejected. | ||
|
|
||
|
|
||
| We have implemented `VALIDATESCALETO` as an optional arg of `BF.INSERT` to help determine whether the bloom filter can scale out to the reach the specified capacity without hitting either limits mentioned above. It will reject the command otherwise. |
| We have implemented `VALIDATESCALETO` as an optional arg of `BF.INSERT` to help determine whether the bloom filter can scale out to the reach the specified capacity without hitting either limits mentioned above. It will reject the command otherwise. | |
| You can use `VALIDATESCALETO` as an optional arg of `BF.INSERT` to help determine whether the bloom filter can scale out to the reach the specified capacity without hitting either limits mentioned above. It will reject the command otherwise. |
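One way to picture the sub filter limit described in this passage is the sketch below. It assumes geometric tightening, meaning each new sub filter's false positive rate is the previous one multiplied by the tightening ratio; the thread does not spell out the exact formula the module uses, so treat that as an assumption. The function name `sub_filter_fp_rates` and the values 0.01 and 0.5 are illustrative only.

```python
from typing import Iterator


def sub_filter_fp_rates(
    fp_rate: float, tightening_ratio: float, max_filters: int = 10_000
) -> Iterator[float]:
    """Yield each sub filter's false positive rate until it reaches 0.0.

    Assumes geometric tightening: rate_i = fp_rate * tightening_ratio ** i.
    """
    fp = fp_rate
    for _ in range(max_filters):
        if fp == 0.0:          # a scale out that would reach 0 is the rejected case
            return
        yield fp
        fp *= tightening_ratio


rates = list(sub_filter_fp_rates(0.01, 0.5))
print(rates[:3])    # the first few rates, each stricter than the last
print(len(rates))   # how many sub filters fit before the rate hits 0
```

Under this assumption, a smaller tightening ratio reaches 0 in fewer scale outs, which is the kind of limit `VALIDATESCALETO` checks up front.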
topics/bloomfilters.md
Outdated
|
|
||
| ## Handling Large Bloom Filters | ||
|
|
||
| There are two limits a bloom filter faces. |
| There are two limits a bloom filter faces. | |
| There are two notable validations bloom filters face. |
topics/bloomfilters.md
Outdated
|
|
||
| 1. Memory Usage Limit: | ||
|
|
||
| The memory usage limit per bloom filter by default is defined by the `BF.BLOOM-MEMORY-USAGE-LIMIT` module configuration which has a default value of 128 MB. If a command results in a creation / scale out causing the overall memory usage to exceed this limit, the command is rejected. |
| The memory usage limit per bloom filter by default is defined by the `BF.BLOOM-MEMORY-USAGE-LIMIT` module configuration which has a default value of 128 MB. If a command results in a creation / scale out causing the overall memory usage to exceed this limit, the command is rejected. | |
| The memory usage limit per bloom filter by default is defined by the `BF.BLOOM-MEMORY-USAGE-LIMIT` module configuration which has a default value of 128 MB. If a command results in a creation / scale out causing the overall memory usage to exceed this limit, the command is rejected. This config is modifiable and can be increased as needed. |
topics/bloomfilters.md
Outdated
|
|
||
| There are two limits a bloom filter faces. | ||
|
|
||
| 1. Memory Usage Limit: |
| 1. Memory Usage Limit: | |
| 1. Memory Usage: |
Signed-off-by: zackcam <[email protected]>
…yed on the Valkey website (#212)

Related PR's
Bloom repo json command files: valkey-io/valkey-bloom#47
Valkey-doc repo: valkey-io/valkey-doc#233

### Description
This PR sets up the framework so that modules can have their commands displayed on the Valkey website (by adding the bloom module commands in a way that can be easily expanded on). I have tried to make this future proof by using a for loop on the `commands.html` page which can be expanded by just adding any new folders we want to pull commands from. For the `command-page.html` I have used an array to hold the data from the multiple folders with commands and then get the first occurrence that isn't empty (i.e. the command belongs to that folder). This keeps the same fallback if the command doesn't exist.

Updated the `init-commands.sh` to create a link for the bloom commands as well and take in the bloom repository. I have updated the README as well to include the new repo that will be needed for the commands and the information change associated with now expecting commands from the bloom repo. Lastly, updated the GitHub workflow to also build and take in the bloom repo.

**For screenshots of the new documentation the two pr's above (valkey-io/valkey-doc#233 and valkey-io/valkey-bloom#47) have screenshots of all sections being added**

### Check List
- [x] Commits are signed per the DCO using `--signoff`

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Signed-off-by: zackcam <[email protected]>
Signed-off-by: zackcam <[email protected]>
| * TIGHTENING *tightening_ratio* - The tightening ratio for the bloom filter. | ||
| * SEED *seed* - The 32 byte seed the bloom filter's hash functions will use. | ||
| * NONSCALING - This option will configure the bloom filter as non scaling; it cannot expand / scale beyond its specified capacity. | ||
| * VALIDATESCALETO *validatescaleto* - Validates if the filter can scale out and reach to this capacity based on limits and if not, return an error without creating the bloom filter. |
This is why you had to add validatescaleto to the spellcheck wordlist.
The idea is that keywords like this should be put within backticks. Then it doesn't need to be in the wordlist.
It's possible to combine italics + backticks if needed, for example:
`VALIDATESCALETO` *`validatescaleto`*
Rendered as
VALIDATESCALETO validatescaleto




This is one of three PR's that will be done for adding information about the bloom module to the Valkey website:
Bloom repo json command files: valkey-io/valkey-bloom#47
valkey-io.github.io: valkey-io/valkey-io.github.io#212
This PR has three main changes:
1. Adding the bloom command group
2. Adding bloom command metadata files (Example for bf.add below)
3. Adding bloom data type documents