Attributes that have no value get their name as their value #17

Open
@simbabque

Description

While investigating libwww-perl/WWW-Mechanize#125, I noticed that the following HTML is parsed incorrectly.

<input type="hidden" name="foo" value>

According to the HTML spec, an attribute that is not followed by an equals sign (=) has the empty string as its value. For this input element, we should therefore parse the value attribute to an empty string.

Empty attribute syntax
Just the attribute name. The value is implicitly the empty string.

Instead of making it empty, we set it to the attribute's own name, "value".

I've looked into it and got as far as confirming that get_tag returns a data structure that contains the wrong value:

\ [
    [0] "input",
    [1] {
        /       "/",
        name    "foo",
        type    "hidden",
        value   "value"
    },
    [2] [
        [0] "type",
        [1] "name",
        [2] "value",
        [3] "/"
    ],
    [3] "<input type="hidden" name="foo" value />"
]

Unfortunately, I am out of my depth with the actual C code for the parser. But I think we should be returning an empty string for the value attribute, as well as for all other empty attributes.

I wrote the following test to demonstrate the problem.

use strict;
use warnings;

use HTML::TokeParser ();
use Test::More;
use Data::Dumper;

ok(
    !get_tag(q{})->{value},
    'No value when there was no value'
);    # key does not exist

{
    # this fails because value is 'value'
    my $t = get_tag(q{value});
    ok(
        !$t->{value},
        'No value when value attr has no value'
    ) or diag Dumper $t;    
}

ok(
    !get_tag(q{value=""})->{value},
    'No value when value attr is an empty string'
);    # value is an empty string

is(
    get_tag(q{value="bar"})->{value}, 
    'bar', 
    'Value is bar'
);    # this obviously works

sub get_tag {
    my $attr = shift;
    return HTML::TokeParser->new(\qq{<input type="hidden" name="foo" $attr />})->get_tag->[1];
}

done_testing;
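
A possible stopgap, sketched here under assumptions rather than proposed as the fix: HTML::Parser documents a boolean_attribute_value method that sets the value reported for attributes written without a value, and HTML::TokeParser should inherit it. The helper name first_tag_attrs below is hypothetical and only exists for this illustration. I have not checked how this interacts with callers that rely on the default behaviour.

```perl
use strict;
use warnings;

use HTML::TokeParser ();

# Hypothetical helper (not part of HTML::TokeParser): parse the first start
# tag in $html and return its attribute hash reference.
sub first_tag_attrs {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html);

    # Inherited from HTML::Parser. By default the attribute name is reused
    # as the value; this asks the parser to report an empty string instead.
    $p->boolean_attribute_value(q{});

    my $tag = $p->get_tag or return;
    return $tag->[1];
}

my $attrs = first_tag_attrs(q{<input type="hidden" name="foo" value>});
printf "value is '%s'\n", $attrs->{value};
```

This only changes what is reported per parser instance, so the default discussed above would still need to change in the parser itself.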
