Carriage Return characters (\r) are incorrectly replaced with Newline characters (\n) #512

@ericdrobinson

Description

  • Are you running the latest version?
  • Have you included sample input, output, error, and expected output?
  • Have you checked if you are using correct configuration?
  • Did you try online tool?

Carriage Return characters (\r) found within parsed XML are incorrectly converted to newline characters (\n).

I did a quick scan of the source code and found a likely culprit:

const parseXml = function(xmlData) {
  xmlData = xmlData.replace(/\r\n?/g, "\n"); //TODO: remove this line

That replaces every \r character, and every \r\n pair, with a single \n.

The online tool does not exhibit this issue because browser node access APIs encode Carriage Return "strings" (the character \ followed by r) as \\r, so the regular expression no longer matches. See:

> "1) \r 2) \\r 3) \n 4) \\n 5) \r\n 6) \\r\\n".replace(/\r\n?/g, "\n")
  '1) \n 2) \\r 3) \n 4) \\n 5) \n 6) \\r\\n'
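
The difference between a real carriage return and the two-character escape text can be checked directly. The following is a minimal sketch in plain JavaScript; `describeChars` is a hypothetical helper introduced only for illustration and is not part of fast-xml-parser.

```javascript
// Hypothetical helper (not part of any library): list the code points in a string.
function describeChars(text) {
  return Array.from(text, (ch) => ch.codePointAt(0));
}

const realCR = "\r";       // one character: U+000D
const escapedText = "\\r"; // two characters: backslash (U+005C) and "r" (U+0072)

// Same pattern as in the line quoted from parseXml.
const pattern = /\r\n?/g;

console.log(describeChars(realCR));      // [ 13 ]
console.log(describeChars(escapedText)); // [ 92, 114 ]

// Only the real carriage return is rewritten; the escape text is left alone.
console.log(realCR.replace(pattern, "\n") === "\n");       // true
console.log(escapedText.replace(pattern, "\n") === "\\r"); // true
```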

Input

The XML that exhibits this issue is of the form:

<properties object="" engine="">
    <property type="string" name="x" state="changed">
        <![CDATA[This is a carriage return \r...]]>
    </property>
    <property type="string" name="y" state="changed">
        <![CDATA[\r]]>
    </property>
</properties>

Code

The code is pretty straightforward.

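// Assumed import; the original report does not show how fastXMLParser is brought into scope.
import * as fastXMLParser from "fast-xml-parser";

const XML_OPTIONS_NO_TAG_PARSE: fastXMLParser.X2jOptionsOptional = {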
    attributeNamePrefix: "@",
    ignoreAttributes: false,
    parseAttributeValue: false,
    parseTagValue: false,
    textNodeName: "#value",
};
const XML_PARSER_NO_TAG_PARSE = new fastXMLParser.XMLParser(XML_OPTIONS_NO_TAG_PARSE);

// ...

const parsed = XML_PARSER_NO_TAG_PARSE.parse(xmlData);

After that code runs, the parsed text node content has \n instead of the expected \r.
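
The loss can be reproduced without the library by applying the same replacement quoted from parseXml to a small document. This is only a sketch: `normalizeLikeParser` is a hypothetical stand-in for that one line, not the parser itself, and `countCarriageReturns` is an illustrative helper.

```javascript
// Hypothetical stand-in for the single line quoted from parseXml.
function normalizeLikeParser(xmlData) {
  return xmlData.replace(/\r\n?/g, "\n");
}

// Count real carriage return characters (U+000D) in a string.
function countCarriageReturns(text) {
  return (text.match(/\r/g) || []).length;
}

// The CDATA section holds one real carriage return.
const xml = '<property name="y"><![CDATA[\r]]></property>';
const normalized = normalizeLikeParser(xml);

console.log(countCarriageReturns(xml));             // 1
console.log(countCarriageReturns(normalized));      // 0
console.log(normalized.includes("<![CDATA[\n]]>")); // true
```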

Output

Running the above results in the following JSON:

{
    "properties": {
        "property": [
            {
                "#value": "This is a carriage return \n...",
                "@type": "string",
                "@name": "x",
                "@state": "changed"
            },
            {
                "#value": "\n",
                "@type": "string",
                "@name": "y",
                "@state": "changed"
            }
        ],
        "@object": "",
        "@engine": ""
    }
}

Expected Data

I expect the following output:

{
    "properties": {
        "property": [
            {
                "#value": "This is a carriage return \r...",
                "@type": "string",
                "@name": "x",
                "@state": "changed"
            },
            {
                "#value": "\r",
                "@type": "string",
                "@name": "y",
                "@state": "changed"
            }
        ],
        "@object": "",
        "@engine": ""
    }
}
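
Until the replacement is removed, one possible workaround is to protect carriage returns inside CDATA sections before parsing and restore them afterwards. The sketch below is an illustration under stated assumptions, not the library's API: `protectCarriageReturns`, `restoreCarriageReturns`, and `SENTINEL` are hypothetical names, and it assumes the private-use character U+E000 never appears in the real data.

```javascript
// Hypothetical placeholder; assumes this private-use character never occurs in the input.
const SENTINEL = "\uE000";

// Swap each real CR inside a CDATA section for the sentinel before parsing.
// Carriage returns outside CDATA sections are deliberately left untouched.
function protectCarriageReturns(xmlData) {
  return xmlData.replace(/<!\[CDATA\[[\s\S]*?\]\]>/g, (section) =>
    section.replace(/\r/g, SENTINEL)
  );
}

// Walk a parsed result and turn every sentinel in string values back into a CR.
function restoreCarriageReturns(value) {
  if (typeof value === "string") return value.split(SENTINEL).join("\r");
  if (Array.isArray(value)) return value.map(restoreCarriageReturns);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, v]) => [key, restoreCarriageReturns(v)])
    );
  }
  return value;
}
```

With the parser from the code above, usage would look roughly like `restoreCarriageReturns(XML_PARSER_NO_TAG_PARSE.parse(protectCarriageReturns(xmlData)))`. Carriage returns outside CDATA sections would still be converted to \n.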

Would you like to work on this issue?

  • Yes
  • No
