Warn against unicode characters #1350

Open
shaneing opened this issue Oct 12, 2019 · 9 comments

Comments

@shaneing

Summary of the new feature

When a script uses a Chinese double quote (such as “) as a string delimiter, PSSA should emit an error or a warning.

Proposed technical implementation details (optional)

Three cases as follows:

Invoke-ScriptAnalyzer -ScriptDefinition '$a = "b“; Write-Output $a'
Invoke-ScriptAnalyzer -ScriptDefinition '$a = “b“; Write-Output $a'
Invoke-ScriptAnalyzer -ScriptDefinition '$a = “b"; Write-Output $a'

What is the latest version of PSScriptAnalyzer at the point of writing

v1.18.3

@bergmeister
Collaborator

Why? What is the problem with double quotes in Chinese?

@shaneing
Author

shaneing commented Oct 12, 2019

I ran the script in PowerShell 4.0 (Windows Server 2012 R2), and the output was as follows:

+ $a = "b�Write-Output $a
+      ~~~~~~~~~~~~~~~~~~~
The string is missing terminator:
    + CategoryInfo          : ParserError: (:) [], ParseExcept
    + FullyQualifiedErrorId : TerminatorExpectedAtEndOfString

@bergmeister
Collaborator

Is this a true double quote or a non-standard unicode character that just looks like a double quote?
What is the behaviour for PowerShell 5.1 or 6.2?
Is this known @rjmholt ?

@shaneing
Author

There is no problem with PowerShell 6.2.

@rjmholt
Contributor

rjmholt commented Oct 14, 2019

It's because of the encoding of the script. You want PSUseBOMForUnicodeEncodedFile.

The script is in UTF-8, but Windows PowerShell 5.1 and under default to an extended ASCII encoding. PS 6+ handles it fine because it defaults to UTF-8 when there's no BOM. The character itself is a perfectly legal string terminator in all versions of PowerShell.
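
To illustrate the mismatch described above, here is a minimal sketch that decodes the UTF-8 bytes of the curly quote in two ways. ISO-8859-1 (code page 28591) is only an assumed stand-in for "an extended ASCII encoding"; the code page Windows PowerShell actually uses depends on the machine's locale, so the exact garbage characters will differ.

```powershell
# U+201C LEFT DOUBLE QUOTATION MARK, built from its code point so that this
# snippet itself is pure ASCII and survives any file encoding.
$quote = [string][char]0x201C

# The bytes a UTF-8 editor writes to disk for that single character.
$bytes = [System.Text.Encoding]::UTF8.GetBytes($quote)
$hex = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$hex    # E2 80 9C

# Decoded as UTF-8, the bytes round-trip to the original character.
$asUtf8 = [System.Text.Encoding]::UTF8.GetString($bytes)

# Decoded with a single-byte code page (assumption: ISO-8859-1 as a stand-in),
# the same three bytes become three unrelated characters, none of them a quote.
$legacy = [System.Text.Encoding]::GetEncoding(28591)
$asLegacy = $legacy.GetString($bytes)
$asLegacy.Length    # 3
```

Once the closing quote has been decoded into other characters, the parser no longer sees a string terminator, which is consistent with the TerminatorExpectedAtEndOfString error shown earlier in the thread.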

If you re-save your script as UTF-8 with BOM, WinPS will pick that up and it will work. Please be mindful of source encoding; the PowerShell interpreter only sees bytes, so it's something you must explicitly handle in your editor and the way in which you move scripts around. If you share code, ensure that everyone is saving the file with a portable encoding (the ISE is not good for this).
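
Re-saving with a BOM can be done from most editors; as a rough sketch, the same thing can be done from PowerShell with .NET file APIs. The file name `bom-demo.ps1` and the script text are made up for illustration.

```powershell
# Hypothetical script body containing U+201C, built from its code point so
# that this snippet stays pure ASCII.
$scriptText = '$a = "b' + [char]0x201C + '; Write-Output $a'

# Hypothetical output location, chosen only for this demo.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'bom-demo.ps1'

# UTF8Encoding($true) asks .NET to emit the EF BB BF byte order mark.
$utf8WithBom = New-Object System.Text.UTF8Encoding $true
[System.IO.File]::WriteAllText($path, $scriptText, $utf8WithBom)

# Check that the file really starts with the BOM.
$firstBytes = [System.IO.File]::ReadAllBytes($path)[0..2]
$bomHex = ($firstBytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$bomHex    # EF BB BF
```

With PSScriptAnalyzer installed, running `Invoke-ScriptAnalyzer -Path $path -IncludeRule PSUseBOMForUnicodeEncodedFile` before and after re-saving would be one way to check the file against the rule mentioned above.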

@shaneing
Author

OK, I get it now, thanks. The cause is that the script's encoding was automatically converted to UTF-8 when I carelessly typed the Chinese double quote. I do not recommend using it. So ...

@rjmholt
Contributor

rjmholt commented Oct 15, 2019

> I do not recommend using it. So ...

That's a fair point. A rule that warns against obscure but accepted characters in PowerShell scripts might be a good idea. Particularly long dashes and styled quotes, which tend to come from MS Word.
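
As a rough sketch of what such a check might look for (this is not a PSScriptAnalyzer rule, and `Find-StyledCharacter` is a hypothetical helper name), a plain regex scan can report the line and column of styled quotes and long dashes:

```powershell
function Find-StyledCharacter {
    param(
        [Parameter(Mandatory = $true)]
        [string] $ScriptText
    )

    # Dashes U+2013..U+2015 and styled quotes U+2018..U+201E.
    # The exact set of characters is an assumption for this sketch.
    $pattern = '[\u2013-\u2015\u2018-\u201E]'

    $lineNumber = 0
    foreach ($line in ($ScriptText -split '\r?\n')) {
        $lineNumber++
        foreach ($match in [regex]::Matches($line, $pattern)) {
            # One record per offending character, with 1-based positions.
            [pscustomobject]@{
                Line      = $lineNumber
                Column    = $match.Index + 1
                CodePoint = 'U+{0:X4}' -f [int][char]$match.Value
            }
        }
    }
}

# Usage: the first case from the issue, with the curly quote built from its
# code point so that the snippet itself stays ASCII.
$sample = '$a = "b' + [char]0x201C + '; Write-Output $a'
$found = @(Find-StyledCharacter -ScriptText $sample)
$found
```

A real rule would more likely work on the parser's tokens, so that styled characters inside string literals and comments could be treated differently from those used as delimiters or operators.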

@bergmeister
Collaborator

bergmeister commented Oct 15, 2019

There is already an issue with a viable solution, where someone made a VS Code extension. See #981 (comment):
https://marketplace.visualstudio.com/items?itemName=GlenBuktenica.unicode-substitutions

There is also another VS Code extension that highlights dodgy characters:
https://marketplace.visualstudio.com/items?itemName=nachocab.highlight-dodgy-characters

rjmholt changed the title from "Checking the double quote in Chinese" to "Warn against unicode characters" on Oct 15, 2019
@rjmholt
Contributor

rjmholt commented Oct 15, 2019

Going to close this as a duplicate of #981
