Fix test_long_examples_validator #406

Closed
wants to merge 2 commits into from

tuliren
Contributor

@tuliren tuliren commented Apr 19, 2023

Currently test_long_examples_validator fails with the following error message:

>       assert prepared_data_cmd_output.stderr == ""
E       assert ('Traceback (most recent call last):\n'\n '  File "/Users/lirentu/.pyenv/versions/3.9.11/bin/openai", line 8, in '\n '<module>\n'\n '    sys.exit(main())\n'\n '  File "/Users/lirentu/git/openai-python/openai/_openai_scripts.py", line '\n '78, in main\n'\n '    args.func(args)\n'\n '  File "/Users/lirentu/git/openai-python/openai/cli.py", line 592, in '\n 'prepare_data\n'\n '    apply_validators(\n'\n '  File "/Users/lirentu/git/openai-python/openai/validators.py", line 843, in '\n 'apply_validators\n'\n '    df, optional_applied = apply_optional_remediation(\n'\n '  File "/Users/lirentu/git/openai-python/openai/validators.py", line 611, in '\n 'apply_optional_remediation\n'\n '    df = remediation.optional_fn(df)\n'\n '  File "/Users/lirentu/git/openai-python/openai/validators.py", line 425, in '\n 'add_space_start\n'\n '    x["completion"] = x["completion"].apply(\n'\n '  File '\n '"/Users/lirentu/.pyenv/versions/3.9.11/lib/python3.9/site-packages/pandas/core/series.py", '\n 'line 4433, in apply\n'\n '    return SeriesApply(self, func, convert_dtype, args, kwargs).apply()\n'\n '  File '\n '"/Users/lirentu/.pyenv/versions/3.9.11/lib/python3.9/site-packages/pandas/core/apply.py", '\n 'line 1088, in apply\n'\n '    return self.apply_standard()\n'\n '  File '\n '"/Users/lirentu/.pyenv/versions/3.9.11/lib/python3.9/site-packages/pandas/core/apply.py", '\n 'line 1143, in apply_standard\n'\n '    mapped = lib.map_infer(\n'\n '  File "pandas/_libs/lib.pyx", line 2870, in pandas._libs.lib.map_infer\n'\n '  File "/Users/lirentu/git/openai-python/openai/validators.py", line 426, in '\n '<lambda>\n'\n '    lambda x: ("" if x[0] == " " else " ") + x\n'\n 'IndexError: string index out of range\n') == ''
E         + Traceback (most recent call last):
E         +   File "/Users/lirentu/.pyenv/versions/3.9.11/bin/openai", line 8, in <module>
E         +     sys.exit(main())
E         +   File "/Users/lirentu/git/openai-python/openai/_openai_scripts.py", line 78, in main
E         +     args.func(args)
E         +   File "/Users/lirentu/git/openai-python/openai/cli.py", line 592, in prepare_data
E         +     apply_validators(
E         +   File "/Users/lirentu/git/openai-python/openai/validators.py", line 843, in apply_validators
E         +     df, optional_applied = apply_optional_remediation(
E         +   File "/Users/lirentu/git/openai-python/openai/validators.py", line 611, in apply_optional_remediation
E         +     df = remediation.optional_fn(df)
E         +   File "/Users/lirentu/git/openai-python/openai/validators.py", line 425, in add_space_start
E         +     x["completion"] = x["completion"].apply(
E         +   File "/Users/lirentu/.pyenv/versions/3.9.11/lib/python3.9/site-packages/pandas/core/series.py", line 4433, in apply
E         +     return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
E         +   File "/Users/lirentu/.pyenv/versions/3.9.11/lib/python3.9/site-packages/pandas/core/apply.py", line 1088, in apply
E         +     return self.apply_standard()
E         +   File "/Users/lirentu/.pyenv/versions/3.9.11/lib/python3.9/site-packages/pandas/core/apply.py", line 1143, in apply_standard
E         +     mapped = lib.map_infer(
E         +   File "pandas/_libs/lib.pyx", line 2870, in pandas._libs.lib.map_infer
E         +   File "/Users/lirentu/git/openai-python/openai/validators.py", line 426, in <lambda>
E         +     lambda x: ("" if x[0] == " " else " ") + x
E         + IndexError: string index out of range

openai/tests/test_long_examples_validator.py:50: AssertionError

The root cause is that the x in the lambda can be an empty string.

This PR fixes this issue by adding an extra check. It assumes that when the string is empty, no extra space is needed.
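The failure and the fix can be reproduced with plain strings, without pandas. This is a minimal sketch: the helper names `add_space_unsafe` and `add_space_safe` are hypothetical and exist only for illustration; the two conditional expressions are the ones from the traceback and from this PR's diff.

```python
def add_space_unsafe(x: str) -> str:
    # Original expression: indexes x[0] unconditionally, so "" raises IndexError.
    return ("" if x[0] == " " else " ") + x


def add_space_safe(x: str) -> str:
    # Patched expression: `x == ""` is evaluated first and short-circuits,
    # so x[0] is never reached for an empty string.
    return ("" if x == "" or x[0] == " " else " ") + x


try:
    add_space_unsafe("")
except IndexError as exc:
    print(f"unsafe version raised IndexError: {exc}")

for sample in ["", "yes", " yes"]:
    print(repr(sample), "->", repr(add_space_safe(sample)))
```

With the extra check, an empty completion is passed through unchanged, which matches the assumption stated above that no space is needed in that case.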

Issues like this can be prevented by adding CI that runs the unit tests automatically. You may want to merge #203. I have also put up a CI PR, tuliren#1, with all tests passing on my fork, and can re-create it against this repo if that's helpful.

@aelnosu aelnosu left a comment

Both work for non-empty input, but the change is better.
"Both functions aim to add a space at the start of a string if it is not already present. However, there is a subtle difference in their implementation.

The first function checks if the first character of the string is a space using x[0] == " ". If it is not, it adds a space at the beginning using "" if x[0] == " " else " ". Note that this function assumes that the input string is not empty. If the input string is empty, this function will throw an IndexError.

The second function performs the same first-character check, x[0] == " ", but first tests whether the string is empty using x == "". Because or short-circuits, x[0] is never evaluated for an empty string, which is returned as is. The full expression is "" if x == "" or x[0] == " " else " ", so a space is added only when the string is non-empty and does not already start with one.

Therefore, the second function is more robust as it handles the case where the input string is empty."

@@ -423,7 +423,7 @@ def completions_space_start_validator(df):

     def add_space_start(x):
         x["completion"] = x["completion"].apply(
-            lambda x: ("" if x[0] == " " else " ") + x
+            lambda x: ("" if x == "" or x[0] == " " else " ") + x

This comment was marked as resolved.

Contributor Author

Thank you for the suggestion. I have updated the PR. Also I changed the x in the lambda to s so that it does not clash with the input parameter x.
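The renaming can be sketched as follows. This is an illustration under stated assumptions, not the code in the PR: `add_space_start` here takes a plain dict of lists as a hypothetical stand-in for the DataFrame so that it runs without pandas, and `map` stands in for the `apply` call in the real function.

```python
def add_space_start(x: dict) -> dict:
    # `x` is the outer parameter (a DataFrame in the real code; a dict of
    # lists here, as a stand-in). The lambda's argument is named `s`, so it
    # no longer shadows the outer `x`.
    x["completion"] = list(
        map(
            lambda s: ("" if s == "" or s[0] == " " else " ") + s,
            x["completion"],
        )
    )
    return x


data = {"completion": ["yes", " no", ""]}
print(add_space_start(data)["completion"])
```

Using distinct names makes it clear at a glance that the outer `x` is the whole table while `s` is a single completion string.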

Looks great! I've verified that (as before) this makes the test pass when applied on top of b82a3f7.

@rattrayalex
Collaborator

Thanks for this!

We've since rewritten the library entirely, so this change is no longer relevant. I'm sorry we didn't get to it sooner.

@rattrayalex rattrayalex added the fixed in v1 Issues addressed by the v1 beta label Nov 10, 2023
@rattrayalex
Collaborator

Ah, actually this change is still relevant, the code has just moved. I'll be applying this change in a separate PR shortly.

@tuliren tuliren deleted the liren/fix-unit-test branch November 19, 2023 02:58
stainless-app bot pushed a commit that referenced this pull request Mar 27, 2025
4 participants