[SPARK-43364][SS][DOCS] Add docs for RocksDB state store memory management #41042

Closed
wants to merge 2 commits into from

Conversation

anishshri-db
Contributor

What changes were proposed in this pull request?

Add docs for RocksDB state store memory management

Why are the changes needed?

Docs only change

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

N/A

@github-actions github-actions bot added the DOCS label May 4, 2023
@anishshri-db
Contributor Author

@HeartSaVioR @siying - please take a look. Thanks

@anishshri-db anishshri-db changed the title from "[SPARK-43364] Add docs for RocksDB state store memory management" to "[SPARK-43364][SS] Add docs for RocksDB state store memory management" on May 4, 2023
<tr>
<td>spark.sql.streaming.stateStore.rocksdb.highPriorityPoolRatio</td>
<td>Total memory to be occupied by filter and index blocks as a fraction of memory allocated across all RocksDB instances on a single node.</td>
<td>0.1</td>
Contributor

I didn't realize the default is 0.1. If we don't have a special study, 0.5 is a good value to start with.

Contributor Author

@anishshri-db anishshri-db May 4, 2023

Yes, this is for the high-priority pool blocks. The default write buffer ratio is 0.5. Don't we need to keep some fraction for elements in the regular block cache too?
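
To make the ratio question concrete, here is a minimal sketch of the arithmetic. It assumes both ratios are fractions of one shared memory budget and that whatever is left over serves the regular block cache; that reading is an assumption, not something this thread confirms. The 500 MB budget and the helper name `split_budget` are hypothetical, chosen only for illustration.

```python
def split_budget(total_mb, write_buffer_ratio=0.5, high_priority_pool_ratio=0.1):
    """Split a shared RocksDB memory budget into three parts.

    Assumption: both ratios are fractions of the same total, and the
    remainder is what stays available for regular block cache entries.
    """
    if total_mb < 0:
        raise ValueError("total_mb must be non-negative")
    for ratio in (write_buffer_ratio, high_priority_pool_ratio):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("ratios must be within [0, 1]")
    if write_buffer_ratio + high_priority_pool_ratio > 1.0:
        raise ValueError("ratios must not sum to more than 1")

    write_buffers_mb = total_mb * write_buffer_ratio
    high_priority_pool_mb = total_mb * high_priority_pool_ratio
    # Whatever is not claimed by the two ratios is left for regular blocks.
    regular_mb = total_mb - write_buffers_mb - high_priority_pool_mb
    return {
        "write_buffers_mb": write_buffers_mb,
        "high_priority_pool_mb": high_priority_pool_mb,
        "regular_block_cache_mb": regular_mb,
    }


if __name__ == "__main__":
    # Defaults discussed in this thread: 0.5 write buffers, 0.1 high-priority pool.
    print(split_budget(500))
    # Suggested alternative: 0.5 for the high-priority pool as well.
    print(split_budget(500, high_priority_pool_ratio=0.5))
```

Under this reading, raising the high-priority pool ratio to 0.5 while the write buffer ratio stays at 0.5 leaves no remainder for regular block cache entries, which is the concern raised above.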

@@ -2360,8 +2360,35 @@ Here are the configs regarding to RocksDB instance of the state store provider:
<td>The maximum number of MemTables in RocksDB, both active and immutable. Value of -1 means that RocksDB internal default values will be used</td>
<td>-1</td>
</tr>
<tr>
<td>spark.sql.streaming.stateStore.rocksdb.boundedMemoryUsage</td>
Member

Wonder if we can add them into SQLConf or StaticSQLConf so the docs can be generated, and to make it clear which value is the default.

For instance, we automatically generate the docs from SQL conf.

Member

Can be done separately in another PR.

Contributor

It may be better to defer this until we make the RocksDB state store provider the default implementation.
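
Until these keys are generated from SQLConf, they have to be set by name. The following is a minimal sketch of assembling them as `spark-submit` flags. The `boundedMemoryUsage` and `highPriorityPoolRatio` keys come from the diff above; the `providerClass` key and class name are taken from the Structured Streaming guide rather than from this PR. The helper names `rocksdb_memory_confs` and `to_submit_flags` are hypothetical.

```python
ROCKSDB_PREFIX = "spark.sql.streaming.stateStore.rocksdb."


def rocksdb_memory_confs(bounded=True, high_priority_pool_ratio=0.1):
    """Return Spark conf key/value strings for bounded RocksDB memory usage."""
    return {
        # Selects the RocksDB state store provider (not part of this PR's diff).
        "spark.sql.streaming.stateStore.providerClass":
            "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider",
        # Keys documented by this PR.
        ROCKSDB_PREFIX + "boundedMemoryUsage": str(bounded).lower(),
        ROCKSDB_PREFIX + "highPriorityPoolRatio": str(high_priority_pool_ratio),
    }


def to_submit_flags(confs):
    """Render a conf dict as a flat list of spark-submit arguments."""
    flags = []
    for key, value in sorted(confs.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags


if __name__ == "__main__":
    print(" ".join(to_submit_flags(rocksdb_memory_confs())))
```

The same dictionary could also be applied entry by entry through `SparkSession.builder.config(key, value)` when the session is created in code.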

@HyukjinKwon
Member

HyukjinKwon commented May 8, 2023

cc @HeartSaVioR and @xuanyuanking FYI

Contributor

@HeartSaVioR HeartSaVioR left a comment

+1

@HeartSaVioR
Contributor

Thanks! Merging to master.

@HeartSaVioR HeartSaVioR changed the title from "[SPARK-43364][SS] Add docs for RocksDB state store memory management" to "[SPARK-43364][SS][DOCS] Add docs for RocksDB state store memory management" on May 8, 2023
LuciferYang pushed a commit to LuciferYang/spark that referenced this pull request May 10, 2023
…ement

### What changes were proposed in this pull request?
Add docs for RocksDB state store memory management

### Why are the changes needed?
Docs only change

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
N/A

Closes apache#41042 from anishshri-db/task/SPARK-43364.

Authored-by: Anish Shrigondekar <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>