Skip to content

Verbosity replacers #659

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Apr 3, 2019
Merged

Conversation

mihailik
Copy link
Contributor

@mihailik mihailik commented Feb 8, 2019

Using a short array of variative RegExps, to match (the|a) and (object|event|interface) and (is|represents).

  • In a simple case, a whole intro lead is just removed.
  • In a more complex case, a long intro is replaced with a shorter one.

Replacers are tried in array order, until one succeeds. More tailored replacers can go first, and if those fail -- less clever ones can have a go.

In either case, after replace the first letter of description is re-capitalised again.

It's only 6 replacers, hope easy to review/maintain.

@mihailik
Copy link
Contributor Author

mihailik commented Feb 8, 2019

OK, I've scanned through all the changes -- all wording fit and proper. Please review @sandersn @jwbay

@saschanaz
Copy link
Contributor

Looks good but I think the function should be in idlfetcher.ts (where processComments() exists).

@mihailik
Copy link
Contributor Author

That would not be a good idea, @saschanaz

Processing string replacement AFTER the docs imported in JSON leaves a useful trace of text before and after string replacement.

While well-tuned to fit future MDN text, the string replacers are not fool-proof. One day they may produce weird results, and having such trace make diagnostics and tweaking process smoother.


As I've worked on this PR, it took me a few iterations to get string replacers right. Being able to see before/after states is key to productivity.

There's a lot of text to process, hundreds of replacement matches. With this current implementation, in case of doubt you consult original MDN text sitting in JSON. Otherwise you would have to consult MDN website instead, which can't really scale for 100s of replacers.

@saschanaz
Copy link
Contributor

Makes sense. Thanks for clarification.

@mihailik
Copy link
Contributor Author

mihailik commented Feb 12, 2019

Ping @sandersn do you agree?

Might be cool to get it in, so new MDN docs look even better, come next release.

Copy link
Member

@sandersn sandersn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor comments, but otherwise looks great.

@mihailik
Copy link
Contributor Author

Good, applied your changes @sandersn and everything tidy and works. Thanks!

src/index.ts Outdated
[/^(The|A) ${name} (interface|event|object) (is|represents|represent|describes|defines)?/, ''],
[/^An object implementing the ${name} interface (is|represents|represent|describes|defines)/, ''],
[/^The ${name} is an interface representing/, ''],
[/^This type (is|represents|represent|describes|defines)?/, ''],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like represent is not needed anymore for the latest MDN descriptions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihailik We still have one represent, is it still needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep it in this code. This can slip in at any time again in MDN.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can fix typos on MDN in any time. Better to fix on MDN rather than to workaround on TSJS.

@saschanaz
Copy link
Contributor

saschanaz commented Feb 15, 2019

@mihailik Can you try running npm run fetch-mdn and rebuild? Looks like the latest descriptions need some additional rules. Especially Storage and SubtleCrypto, CryptoKey, etc.

@mihailik mihailik force-pushed the Mervyn-Peake-pacify branch from ce8a662 to d12ae59 Compare February 15, 2019 18:22
@mihailik
Copy link
Contributor Author

Updated:

  • removing cautious 'represent' handling of a typo (as fixed in the upstream),
  • re-running fetch-mdn from upstream.

Works good, and spotted this quirk coming from the original upstream text:
inputfiles/mdn/apiDescriptions.json

  "IntersectionObserver": "provides a way to asynchronously observe changes in the intersection of a target element with an ancestor element or with a top-level document's viewport.",

The actual MDN page for IntersectionObserver doesn't have that quirky text, must be an issue with the import?

image

Note: this is not a result of my changes. Just as I referred above, the MDN fetch process and verbosity string replacers are two separate steps. The mangling happened before verbosity, somewhere in scraping the docs from MDN (or in the way MDN exports the docs maybe?)

@mihailik
Copy link
Contributor Author

@saschanaz the examples you've mentioned seem fine. Text is meaningful, and fluent enough.

However IntersectionObserver is mangled by fetch-mdn it seems.

@saschanaz
Copy link
Contributor

the examples you've mentioned seem fine.

CryptoKey currently says:

The CryptoKey dictionary of the Web Crypto API represents a cryptographic key.

I thought you would be prefer A cryptographic key. instead to remove redundant things.

@mihailik
Copy link
Contributor Author

mihailik commented Feb 16, 2019

Wiped one more missed 'represent' (thanks @sandersn), but decided against updating the rules to handle CryptoKey (sorry @saschanaz).

CryptoKey is a good catch, I haven't noticed it earlier. However it should be left as is, here's why. The 'API' replacer looks like this:

[/^(The|A) ${name} interface of (the\s*)*([a-z\s]+ API)(\\\'s)?/, 'An interface of the $3 ']

If CryptoKey were referred to as an interface, the conversion would have been:

The CryptoKey interface of the Web Crypto API represents a cryptographic key. -->
An interface of the Web Crypto API represents a cryptographic key.

As you can see, it's not meant to convert into A cryptographic key. anyway, as I do want to retain the Web Crypto API part too.

And the issue with CryptoKey being a dictionary not an interface is the article. The same replacer can't deal with both 'interface' and 'dictionary' in a kind of polymorphic way, because it need to coordinate articles.

We can introduce another replacer, but it would only target 2 cases: CryptoKey and CryptoKeyPair. This to me looks too much fine-tuning for now. I want to lay some groundwork, do replacements for some broad rules, and enable fine-tuning for further PRs, if needed.

@saschanaz
Copy link
Contributor

So we had some mistakes there:

  1. CryptoKey is an interface, not a dictionary. Looks like someone in MDN made a mistake, but I just fixed it.
  2. CryptoKeyPair is indeed a dictionary but TSJS thinks it's an interface. We should fix it.

Thanks for clarification 😀

@sandersn sandersn merged commit e8ef1be into microsoft:master Apr 3, 2019
@sandersn
Copy link
Member

sandersn commented Apr 3, 2019

Note that I merged from master but did not re-run fetch-mdn. The results looked funny so I left it alone.

@mihailik
Copy link
Contributor Author

mihailik commented Apr 3, 2019

Thank you, thank you @sandersn!

Sorry I left it ferment for long, hope it brewed into a hearty dish :-)

And thanks for making TypeScript such a useful tool. Good job folks!

@sandersn
Copy link
Member

sandersn commented Apr 3, 2019

No problem. I've only been merging PRs here at the beginning of each release, so it's mostly my fault.

@mihailik mihailik deleted the Mervyn-Peake-pacify branch April 11, 2019 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants