@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR proposes:

  • Add an automated way of writing the error_classes.py file: from pyspark.errors.exceptions import _write_self; _write_self()
  • Fix the formatting of the JSON file to be consistent
  • Fix typos within the error messages
  • Fix parameter names to be consistent (it fixes some, not all)
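
As a rough illustration of the first two items, here is a minimal sketch of sorting error classes by name and re-emitting them with uniform JSON formatting. It assumes the error classes are available as a JSON string; the helper name `normalize_error_classes` is hypothetical and is not the `_write_self` implementation from this PR.

```python
import json


def normalize_error_classes(raw_json: str) -> str:
    """Return the error-class JSON sorted by key with uniform indentation.

    Hypothetical helper for illustration only; the PR's `_write_self`
    may work differently.
    """
    error_classes = json.loads(raw_json)
    # sort_keys=True orders keys alphabetically at every nesting level,
    # and indent=2 gives each entry the same layout.
    return json.dumps(error_classes, sort_keys=True, indent=2)


if __name__ == "__main__":
    raw = '{"B_ERROR": {"message": ["b"]}, "A_ERROR": {"message": ["a"]}}'
    print(normalize_error_classes(raw))
```

With a step like this, a contributor could add new entries in any position and let the tool restore the ordering, instead of moving entries around by hand.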

Why are the changes needed?

  • Currently, it is difficult to add a new error class because the alphabetical order of error classes, etc. is enforced. When you add multiple error classes, you have to manually fix and move them around, which is troublesome.
  • In addition, the current JSON format isn't very consistent.
  • For consistency, this PR also renames some of the parameters.

Does this PR introduce any user-facing change?

Yes, it fixes a couple of typos.

How was this patch tested?

Unit tests were fixed together.

Was this patch authored or co-authored using generative AI tooling?

No.

@HyukjinKwon
Member Author

cc @itholic

Contributor

@itholic itholic left a comment

My comment is not actually about the PR itself, so it is LGTM. Let me do some further investigation in a separate ticket. Thanks for the improvement!

"Argument `<arg_name>`(type: <arg_type>) should only contain a type in [<allowed_types>], got <return_type>"
"DISALLOWED_TYPE_FOR_CONTAINER": {
"message": [
"Argument `<arg_name>`(type: <arg_type>) should only contain a type in [<allowed_types>], got <item_type>"
Contributor

Oh... does it mean that the parameter name did not match between the template and the actual usage, but no error has been raised so far? Let me investigate and create a ticket for it.

Member Author

Yeah. I think it wasn't being tested.
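
For reference, a minimal sketch of the kind of check that would catch such a mismatch, assuming placeholders in the template are written as `<name>`. The names `placeholders` and `check_message_parameters` are hypothetical and are not part of the PySpark API.

```python
import re
from typing import Dict, List, Set

# Matches placeholders such as <arg_name> in a message template.
PLACEHOLDER = re.compile(r"<([A-Za-z0-9_]+)>")


def placeholders(message_lines: List[str]) -> Set[str]:
    """Collect every placeholder name used in the template lines."""
    return set(PLACEHOLDER.findall("\n".join(message_lines)))


def check_message_parameters(
    message_lines: List[str], message_parameters: Dict[str, str]
) -> None:
    """Raise ValueError if the supplied parameters do not match the template."""
    expected = placeholders(message_lines)
    given = set(message_parameters)
    if expected != given:
        raise ValueError(
            f"missing parameters: {sorted(expected - given)}, "
            f"unexpected parameters: {sorted(given - expected)}"
        )
```

Comparing the two sets in both directions reports a misnamed parameter (present in the call but not in the template) as well as one that the template needs but the call omits.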

error_class="DISALLOWED_TYPE_FOR_CONTAINER",
message_parameters={
"arg_name": "parameters",
"arg_type": type(parameters).__name__,
Contributor

Hmm... this also seems to be problematic. It seems that an error should have occurred if a parameter defined in the template was actually missing. Let me investigate this too.

@itholic
Contributor

itholic commented Jan 23, 2024

Fix parameter names to be consistent (it fixes some, not all)

Also, we might need further effort to refine the error classes (duplicated names, consistency of messages, etc.). I will add some more items to SPARK-45673 (and maybe also include this ticket in SPARK-45673?).

@HyukjinKwon
Member Author

Sure, let me put this ticket under that.

# limitations under the License.
#
# NOTE: Automatically sort this file via
Contributor

If error_classes.py is not meant to be edited manually, I would add a clear warning here so people don't mistakenly edit the file, similar to #44847.

Member Author

Oh, actually this case is slightly different. The file has to be edited manually first, and then reformatted via that code :-).

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

@HyukjinKwon
Member Author

Merged to master.

zhengruifeng pushed a commit that referenced this pull request Jan 25, 2024
… `test_error_classes_sorted`

### What changes were proposed in this pull request?

This PR is a follow-up of #44848 that adds a short guide on how to sort the error classes.

### Why are the changes needed?

For developers to easily sort the error classes.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44874 from HyukjinKwon/minor-error-sort.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>