Description
Attach (recommended) or Link to PDF file
The getStructTree method builds the tree starting from content refs. If a struct node (such as a TD) has no content, no StructElementNode is created for that branch.
I don't know if this is actually a bug or the intended behavior. If you don't consider it a bug, would you accept a PR that modifies this behavior via a property to get the full tree?
Web browser and its version
Operating system and its version
PDF.js version
v5.4.149
Is the bug present in the latest PDF.js version?
Yes
Is a browser extension
No
Steps to reproduce the problem
- Load the attached PDF
- Get page 1
- Get the structure tree for page 1
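
The steps above can be sketched roughly as follows. This is a minimal illustration, not code from the report: `countRole` and `countTdOnFirstPage` are hypothetical helper names, `pdfjsLib` is assumed to be the imported pdfjs-dist module, and `url` is assumed to point at the attached PDF. The tree shape (element nodes with `role` and `children`, content leaves without `children`) is assumed to match what `page.getStructTree()` returns.

```javascript
// Count struct tree nodes with a given role, depth-first.
// Assumption: element nodes carry `role` and `children`; content leaves
// carry `type` and `id` and have no `children` array.
function countRole(node, role) {
  if (!node) return 0; // getStructTree() may yield no tree at all
  let count = node.role === role ? 1 : 0;
  for (const child of node.children ?? []) {
    count += countRole(child, role);
  }
  return count;
}

// Hypothetical helper: load the document, get page 1, fetch its
// structure tree, and count the TD nodes that made it into the tree.
async function countTdOnFirstPage(pdfjsLib, url) {
  const pdf = await pdfjsLib.getDocument(url).promise;
  const page = await pdf.getPage(1);
  const tree = await page.getStructTree();
  return countRole(tree, "TD");
}
```

With the attached PDF, the behavior described in this report corresponds to a TD count of 2, whereas 4 is expected.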
What is the expected behavior?
The table branch should contain all 4 TD nodes.
What went wrong?
The table contains only the first two TD nodes. (In the PDF, the first TD contains the word "word", the second contains a whitespace character, and the third and fourth are empty.)
Link to a viewer
No response
Additional context
No response