Reading a docx with <h?> headings containing anchors: the anchors just get removed. #1792

@bozzit

Describe the Bug

When reading a Word document that contains headings with anchors (hyperlinks) in them, getContent() returns the heading without the anchor or its text.

Steps to Reproduce

Word document containing (sample docx attached):
Biographies <- h1
Regular Anchor Aaaa bozz <- p /* "bozz" is a hyperlink to http://www.xyz.com */
AAAAA Bozz, Anchor in Heading <- h2 /* "Bozz" is a hyperlink to http://www.xyz.com */
On January 31st, 2019, Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus hendrerit pellentesque nisl. Vivamus lobortis enim consequat egestas suscipit. In convallis metus id erat eleifend consectetur. Donec tincidunt, dui quis congue sollicitudin, metus arcu mattis erat, sed rutrum eros odio quis ex. Vestibulum sit amet viverra est. Nullam ultrices commodo metus vel iaculis. Fusce nec blandit leo. Curabitur id lacinia libero. Etiam nunc arcu, pharetra sit amet felis non, congue bibendum magna. Duis semper nec metus ac vehicula.

Please provide a code sample that reproduces the issue.

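<?php
require_once 'bootstrap.php';

// Load the sample document and render it with the HTML writer.
$phpWord = \PhpOffice\PhpWord\IOFactory::load('test.docx');
$htmlWriter = new \PhpOffice\PhpWord\Writer\HTML($phpWord);
$content = $htmlWriter->getContent();

// The h2 in the output is missing the hyperlink and its text.
echo $content;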

Expected Behavior

Biographies

Regular Anchor Aaaa bozz

AAAAA , Anchor **bozz** in Heading

On January 31st, 2019, Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus hendrerit pellentesque nisl. Vivamus lobortis enim consequat egestas suscipit. In convallis metus id erat eleifend consectetur. Donec tincidunt, dui quis congue sollicitudin, metus arcu mattis erat, sed rutrum eros odio quis ex. Vestibulum sit amet viverra est. Nullam ultrices commodo metus vel iaculis. Fusce nec blandit leo. Curabitur id lacinia libero. Etiam nunc arcu, pharetra sit amet felis non, congue bibendum magna. Duis semper nec metus ac vehicula.

Current Behavior

Biographies

Regular Anchor Aaaa bozz

AAAAA , Anchor in Heading

On January 31st, 2019, Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus hendrerit pellentesque nisl. Vivamus lobortis enim consequat egestas suscipit. In convallis metus id erat eleifend consectetur. Donec tincidunt, dui quis congue sollicitudin, metus arcu mattis erat, sed rutrum eros odio quis ex. Vestibulum sit amet viverra est. Nullam ultrices commodo metus vel iaculis. Fusce nec blandit leo. Curabitur id lacinia libero. Etiam nunc arcu, pharetra sit amet felis non, congue bibendum magna. Duis semper nec metus ac vehicula.

Context

Please fill in your environment information:

  • PHP 7.1.33
  • PHPWord version: 0.17.0 (also tried dev-develop)
    test.docx

Thanks for any insight, workaround or nudge in the right direction.
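
In case it helps with triage, here is a minimal sketch for checking that the hyperlink really is stored inside the heading paragraph of the docx, which would suggest the text is lost somewhere between loading and writing rather than in the file. It does not use PHPWord at all. `findHeadingHyperlinks` is a hypothetical helper name, and it assumes heading paragraphs use style ids starting with "Heading" (the usual case for English-language Word, but not guaranteed).

```php
<?php
// Hypothetical helper (not part of PHPWord): returns the text of every
// hyperlink that sits directly inside a heading paragraph of a
// WordprocessingML document part (the contents of word/document.xml).
function findHeadingHyperlinks(string $documentXml): array
{
    $dom = new DOMDocument();
    $dom->loadXML($documentXml);

    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace(
        'w',
        'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
    );

    // Assumption: heading paragraphs carry a style id starting with "Heading".
    $query = '//w:p[w:pPr/w:pStyle[starts-with(@w:val, "Heading")]]/w:hyperlink';

    $found = [];
    foreach ($xpath->query($query) as $link) {
        // textContent concatenates the text of all nodes below the hyperlink.
        $found[] = $link->textContent;
    }

    return $found;
}
```

The XML string can be read from the attached file with ZipArchive::getFromName('word/document.xml'). If the function returns the link text for test.docx while the HTML output still lacks it, the docx itself is fine.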
