[SPARK-43364][SS][DOCS] Add docs for RocksDB state store memory management
### What changes were proposed in this pull request?
Add docs for RocksDB state store memory management
### Why are the changes needed?
Docs only change
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
N/A
Closes #41042 from anishshri-db/task/SPARK-43364.
Authored-by: Anish Shrigondekar <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
  <tr>
    <td>spark.sql.streaming.stateStore.rocksdb.writeBufferCacheRatio</td>
    <td>Total memory to be occupied by write buffers as a fraction of memory allocated across all RocksDB instances on a single node using maxMemoryUsageMB.</td>
    <td>0.5</td>
  </tr>
  <tr>
    <td>spark.sql.streaming.stateStore.rocksdb.highPriorityPoolRatio</td>
    <td>Total memory to be occupied by blocks in high priority pool as a fraction of memory allocated across all RocksDB instances on a single node using maxMemoryUsageMB.</td>
    <td>0.1</td>
  </tr>
</table>

##### RocksDB State Store Memory Management
RocksDB allocates memory for different objects such as memtables, block cache and filter/index blocks. If left unbounded, RocksDB memory usage across multiple instances could grow indefinitely and potentially cause OOM (out-of-memory) issues.
RocksDB provides a way to limit the memory usage for all DB instances running on a single node by using the write buffer manager functionality.
If you want to cap RocksDB memory usage in your Spark Structured Streaming deployment, this feature can be enabled by setting the `spark.sql.streaming.stateStore.rocksdb.boundedMemoryUsage` config to `true`.
You can also set the maximum allowed memory for RocksDB instances by setting the `spark.sql.streaming.stateStore.rocksdb.maxMemoryUsageMB` value to a static number or to a fraction of the physical memory available on the node.
Limits for individual RocksDB instances can also be configured by setting `spark.sql.streaming.stateStore.rocksdb.writeBufferSizeMB` and `spark.sql.streaming.stateStore.rocksdb.maxWriteBufferNumber` to the required values. By default, RocksDB internal defaults are used for these settings.
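
As a minimal sketch (the values below are illustrative assumptions, not recommendations), the bounded memory settings described above could be applied when building the Spark session for a streaming job that uses the RocksDB state store provider:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("RocksDBBoundedMemoryExample")
  // Use RocksDB as the state store provider for stateful streaming queries.
  .config("spark.sql.streaming.stateStore.providerClass",
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
  // Enable the write buffer manager based memory cap across all RocksDB instances on this node.
  .config("spark.sql.streaming.stateStore.rocksdb.boundedMemoryUsage", "true")
  // Cap total RocksDB memory usage on the node (500 MB is an illustrative value).
  .config("spark.sql.streaming.stateStore.rocksdb.maxMemoryUsageMB", "500")
  // Optional per-instance limits; if unset, RocksDB internal defaults apply.
  .config("spark.sql.streaming.stateStore.rocksdb.writeBufferSizeMB", "64")
  .config("spark.sql.streaming.stateStore.rocksdb.maxWriteBufferNumber", "4")
  .getOrCreate()
```

With `maxMemoryUsageMB` set to 500 and the defaults of 0.5 for `writeBufferCacheRatio` and 0.1 for `highPriorityPoolRatio`, roughly 250 MB would be budgeted for write buffers and 50 MB for the high priority block pool.
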
##### Performance-aspect considerations
1. You may want to disable tracking of the total number of rows to achieve better performance on the RocksDB state store.
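
A minimal sketch of how tracking might be disabled, assuming the `spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows` configuration is available in your Spark version:

```scala
// Trades off the total-state-rows metric reported in query progress
// for lower per-operation overhead in the RocksDB state store.
spark.conf.set("spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows", "false")
```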