End offset for text containing entities is incorrect #3

Closed
4 tasks done
siefkenj opened this issue Dec 18, 2022 · 3 comments
Labels
💪 phase/solved Post is done 🐛 type/bug This is a problem

@siefkenj

Initial checklist

Affected packages and versions

2.0.1

Link to runnable example

No response

Steps to reproduce

Execute fromXml("<foo>text&amp;text</foo>")

Expected behavior

The offset position should start at 5 and end at 18

Actual behavior

The reported offset position starts at 5 but ends at 15
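
To make the expected numbers concrete, here is a minimal sketch that derives them from the raw source string alone. It does not call fromXml; `rawTextOffsets` is a hypothetical helper written only for this illustration, and offsets are assumed to be 0-based with an exclusive end.

```javascript
// Hypothetical helper (not part of any library): find a raw text run in the
// XML source and return its 0-based start offset and exclusive end offset.
function rawTextOffsets(source, rawText) {
  const start = source.indexOf(rawText)
  if (start === -1) throw new Error('raw text not found in source')
  return {start, end: start + rawText.length}
}

const source = '<foo>text&amp;text</foo>'
const raw = 'text&amp;text'

// Offsets measured against the source as written, entity included.
const offsets = rawTextOffsets(source, raw)
console.log(offsets.start, offsets.end) // 5 18

// The decoded value is shorter than the span it occupies in the source,
// so an end offset has to be measured on the raw text, not the decoded text.
const decoded = raw.replace(/&amp;/g, '&')
console.log(raw.length, decoded.length) // 13 9
```

The entity `&amp;` takes five characters in the source but decodes to one, which is why the end offset depends on whether the raw or the decoded text is measured.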

Affected runtime and version

[email protected]

Affected package manager and version

No response

Affected OS and version

No response

Build and bundle tools

No response

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Dec 18, 2022
@wooorm
Member

wooorm commented Dec 19, 2022

Hmm, the positional info of sax isn’t very good.
I’m not a giant fan of sax: it’s very loose and pretty big.

I think I’d like to switch to something else. Perhaps https://github.com/rgrove/parse-xml. Seems fast. Half the size. Well-maintained and modern. Doesn’t expose positional info from what I can see, though.

@siefkenj
Author

siefkenj commented Dec 19, 2022

Another parser is probably a good idea. I encountered this error when working on a converter from lezer-xml to xast.

I have it basically working: https://github.com/siefkenj/typescript-xml/tree/main/packages/xml/src/lezer/lezer-to-xast

The only differences between fromXml and what I just made are correct positions and no errors (lezer is designed to parse no matter what, though it can be instructed to throw on the first error).

@wooorm wooorm closed this as completed in 07a5e57 Feb 5, 2023
@wooorm wooorm added 🐛 type/bug This is a problem 💪 phase/solved Post is done labels Feb 5, 2023
@github-actions github-actions bot removed the 🤞 phase/open Post is being triaged manually label Feb 5, 2023