Skip to content

Conversation

@sebinsunny
Copy link
Contributor

@sebinsunny sebinsunny commented Mar 23, 2023

Implement FilterByFieldValue transformation to filter records based on field value or a given regex pattern

@sebinsunny sebinsunny requested review from a team as code owners March 23, 2023 02:21
@sebinsunny sebinsunny force-pushed the sebinsunny-filter-transformation branch 4 times, most recently from e1f9a53 to 9e6871b Compare March 23, 2023 03:13
@ivanyu ivanyu mentioned this pull request Mar 24, 2023
Copy link
Contributor

@jeqo jeqo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sebinsunny! I think this could be a useful SMT to add.
I left some comments to discuss.

@sebinsunny sebinsunny force-pushed the sebinsunny-filter-transformation branch 5 times, most recently from 0a27203 to 0b19bc9 Compare March 24, 2023 12:05
adding support for regex-based field value matching and configurable filter modes (equality and inequality). This allows filtering records based on whether a specific field value matches or does not match a given regex pattern
@sebinsunny sebinsunny force-pushed the sebinsunny-filter-transformation branch from 0b19bc9 to 5cec48a Compare March 24, 2023 12:09
@sebinsunny sebinsunny requested a review from jeqo March 24, 2023 12:11
@sebinsunny sebinsunny force-pushed the sebinsunny-filter-transformation branch 2 times, most recently from e986948 to f6c0880 Compare March 28, 2023 14:41
@sebinsunny sebinsunny force-pushed the sebinsunny-filter-transformation branch from f6c0880 to 2eb9477 Compare March 28, 2023 14:43
@sebinsunny sebinsunny changed the title transforms: add FilterFields transformation transforms: add FilterByFieldValue transformation Mar 29, 2023
Copy link
Contributor

@C0urante C0urante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to give this a quick pass and leave some thoughts; feel free to take them or leave them. I probably won't have time to come back and give this another round so please don't block on my approval.

public void configure(final Map<String, ?> configs) {
final AbstractConfig config = new AbstractConfig(config(), configs);
this.fieldName = config.getString("field.name");
this.fieldExpectedValue = Optional.ofNullable(config.getString("field.value"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful to allow users to match more than only primitive values. We could use the Values class to parse this property which would allow us to read in more complex types. In addition, this would save unnecessary calls to toString on field values.

This also follows the precedent set in other transforms like InsertHeader and how it parses its "value.literal" property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the getStructFieldValue and getSchemalessFieldValue methods to parse more complex types using the Values class
cc @jeqo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! Didn't know about Values class, looks like a great match here.
I guess the only potential issue I foresee is when there is a mismatch between parsed schema and the desired one, e.g. if "123" is used as expected value, the inferred schema is INT8, when you may want STRING. Though, to start with this looks good to me. If an issue arise around this we could add a config for expected type.

@sebinsunny sebinsunny force-pushed the sebinsunny-filter-transformation branch from 6073212 to ea593c0 Compare March 30, 2023 00:42
@sebinsunny sebinsunny force-pushed the sebinsunny-filter-transformation branch from ea593c0 to 6452aeb Compare March 30, 2023 00:46
@HelenMel HelenMel merged commit c9855d7 into master Mar 30, 2023
@HelenMel HelenMel deleted the sebinsunny-filter-transformation branch March 30, 2023 12:21
@jeqo jeqo mentioned this pull request Mar 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants