[FLINK-37781][docs] Add elasticsearch lookup join document #125
Open: MOBIN-F wants to merge 1 commit into apache:main from MOBIN-F:lookupJoin-Doc
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -244,6 +244,58 @@ Connector Options | |
By default uses built-in <code>'json'</code> format. Please refer to <a href="{{< ref "docs/connectors/table/formats/overview" >}}">JSON Format</a> page for more details. | ||
</td> | ||
</tr> | ||
<tr> | ||
<td><h5>lookup.cache</h5></td> | ||
<td>optional</td> | ||
<td>yes</td> | ||
<td style="word-wrap: break-word;">NONE</td> | ||
<td><p>Enum</p>Possible values: NONE, PARTIAL</td> | ||
<td>The cache strategy for the lookup table. Currently supports NONE (no caching) and PARTIAL (caching entries as they are looked up in the external database).</td> | ||
</tr> | ||
<tr> | ||
<td><h5>lookup.partial-cache.max-rows</h5></td> | ||
<td>optional</td> | ||
<td>yes</td> | ||
<td style="word-wrap: break-word;">(none)</td> | ||
<td>Long</td> | ||
<td>The maximum number of rows in the lookup cache. When this limit is exceeded, the oldest rows expire. | ||
"lookup.cache" must be set to "PARTIAL" to use this option.</td> | ||
</tr> | ||
<tr> | ||
<td><h5>lookup.partial-cache.expire-after-write</h5></td> | ||
<td>optional</td> | ||
<td>yes</td> | ||
<td style="word-wrap: break-word;">(none)</td> | ||
<td>Duration</td> | ||
<td>The maximum time to live for each row in the lookup cache after it is written to the cache. | ||
"lookup.cache" must be set to "PARTIAL" to use this option.</td> | ||
</tr> | ||
<tr> | ||
<td><h5>lookup.partial-cache.expire-after-access</h5></td> | ||
<td>optional</td> | ||
<td>yes</td> | ||
<td style="word-wrap: break-word;">(none)</td> | ||
<td>Duration</td> | ||
<td>The maximum time to live for each row in the lookup cache after the entry is last accessed. | ||
"lookup.cache" must be set to "PARTIAL" to use this option.</td> | ||
</tr> | ||
<tr> | ||
<td><h5>lookup.partial-cache.caching-missing-key</h5></td> | ||
Review comment: should be 'lookup.partial-cache.cache-missing-key' |
||
<td>optional</td> | ||
<td>yes</td> | ||
<td style="word-wrap: break-word;">true</td> | ||
<td>Boolean</td> | ||
<td>Whether to store an empty value into the cache if the lookup key doesn't match any rows in the table. | ||
"lookup.cache" must be set to "PARTIAL" to use this option.</td> | ||
</tr> | ||
<tr> | ||
<td><h5>lookup.max-retries</h5></td> | ||
<td>optional</td> | ||
<td>yes</td> | ||
<td style="word-wrap: break-word;">3</td> | ||
<td>Integer</td> | ||
<td>The maximum number of retries if a lookup against the database fails.</td> | ||
</tr> | ||
</tbody> | ||
</table> | ||
|
||
|
@@ -280,6 +332,19 @@ When formatting the system time as a string, the time zone configured in the ses | |
**NOTE:** When using the dynamic index generated by the current system time, for changelog stream, there is no guarantee that the records with the same primary key can generate the same index name. | ||
Therefore, the dynamic index based on the system time can only support append only stream. | ||
|
||
### Lookup Cache | ||
|
||
The Elasticsearch connector can be used in a temporal join as a lookup source (also known as a dimension table). Currently, only synchronous lookup mode is supported. | ||
|
||
By default, lookup cache is not enabled. You can enable it by setting `lookup.cache` to `PARTIAL`. | ||
|
||
The lookup cache improves the performance of temporal joins against the Elasticsearch connector. Without it, every request is sent to the external database. | ||
When the lookup cache is enabled, each process (i.e. TaskManager) holds its own cache. Flink looks up the cache first, sends a request to the external database only on a cache miss, and updates the cache with the rows returned. | ||
The oldest rows in the cache expire when the cache reaches the maximum number of rows set by `lookup.partial-cache.max-rows`, or when a row exceeds the maximum time to live set by `lookup.partial-cache.expire-after-write` or `lookup.partial-cache.expire-after-access`. | ||
Cached rows might not be the latest. You can set the expiration options to smaller values to get fresher data, but this may increase the number of requests sent to the database, so it is a trade-off between throughput and correctness. | ||
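
The behaviour described above can be illustrated with a minimal sketch. This is not the connector's implementation: the class `PartialCacheSketch`, its constructor parameters, and the `externalLookup` function are hypothetical names used only to show how a row limit and an expire-after-write setting interact. The clock is passed in explicitly so that the example is deterministic.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.function.Function;

/** Illustrative only: a tiny cache with a row limit and expire-after-write. */
public class PartialCacheSketch<K, V> {

    /** A cached value together with the time it was written. */
    private static final class Entry<T> {
        final T value;
        final long writtenAtMillis;

        Entry(T value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final int maxRows;                 // stands in for lookup.partial-cache.max-rows
    private final long expireAfterWriteMillis; // stands in for lookup.partial-cache.expire-after-write
    // Insertion order is preserved, so the first key is always the oldest row.
    private final LinkedHashMap<K, Entry<V>> rows = new LinkedHashMap<>();
    private int externalRequests = 0;

    public PartialCacheSketch(int maxRows, long expireAfterWriteMillis) {
        this.maxRows = maxRows;
        this.expireAfterWriteMillis = expireAfterWriteMillis;
    }

    /** Looks up the cache first and calls the external system only on a miss. */
    public V lookup(K key, long nowMillis, Function<K, V> externalLookup) {
        Entry<V> cached = rows.get(key);
        if (cached != null && nowMillis - cached.writtenAtMillis < expireAfterWriteMillis) {
            return cached.value; // cache hit, possibly stale
        }
        // Cache miss or expired row: query the external system and refresh the cache.
        rows.remove(key);
        externalRequests++;
        V value = externalLookup.apply(key);
        rows.put(key, new Entry<>(value, nowMillis));
        if (rows.size() > maxRows) {
            // Over the row limit: drop the oldest row.
            Iterator<K> oldest = rows.keySet().iterator();
            oldest.next();
            oldest.remove();
        }
        return value;
    }

    public int externalRequests() {
        return externalRequests;
    }

    public int size() {
        return rows.size();
    }
}
```

In this sketch, a smaller `expireAfterWriteMillis` makes stale hits less likely but raises `externalRequests()`, which mirrors the throughput versus correctness trade-off.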
|
||
By default, Flink caches the empty query result for a primary key. You can toggle this behaviour by setting `lookup.partial-cache.cache-missing-key` to false. | ||
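
The effect of caching missing keys can be sketched as follows. Again, this is an illustration under stated assumptions rather than the connector's code: `MissingKeyCacheSketch`, its `cacheMissingKey` flag, and the `externalLookup` function are hypothetical, and an empty `Optional` stands in for a lookup key that matches no rows.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Illustrative only: shows what caching an empty lookup result changes. */
public class MissingKeyCacheSketch {

    private final boolean cacheMissingKey; // stands in for lookup.partial-cache.cache-missing-key
    private final Map<String, Optional<String>> cache = new HashMap<>();
    private int externalRequests = 0;

    public MissingKeyCacheSketch(boolean cacheMissingKey) {
        this.cacheMissingKey = cacheMissingKey;
    }

    public Optional<String> lookup(String key, Function<String, Optional<String>> externalLookup) {
        Optional<String> cached = cache.get(key);
        if (cached != null) {
            return cached; // hit; may be a cached "no rows" result
        }
        externalRequests++;
        Optional<String> result = externalLookup.apply(key);
        // Empty results are stored only when missing keys are cached.
        if (result.isPresent() || cacheMissingKey) {
            cache.put(key, result);
        }
        return result;
    }

    public int externalRequests() {
        return externalRequests;
    }
}
```

With the flag set to true, repeated lookups of a key that has no matching row reach the external system once. With false, every such lookup is sent to the external system, which keeps results fresh if the row appears later but costs more requests.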
|
||
Data Type Mapping | ||
---------------- | ||
|
||
|