net/http: RawPath shows inconsistent, case-sensitive behaviour with percentage encoded Unicode URI strings #33596

Open
@knadh

What version of Go are you using (go version)?

$ go version
go version go1.12.7 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

Any environment

go env Output
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/user/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/user/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build606111517=/tmp/go-build -gno-record-gcc-switches"

What did you do?

package main

import (
	"fmt"
	"html/template"
	"net/http"
	"net/url"
)

type content struct {
	TplEncoded      string
	ManuallyEncoded template.URL

	ShowPaths bool
	RawPath   string
	Path      string
}

func main() {
	// template.Must panics on a parse error instead of silently discarding it.
	tpl := template.Must(template.New("test").Parse(`<!doctype html>
	<html>
	<head>
		<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
		<meta charset="utf-8" />
	</head>
	<body>
		{{ if .ShowPaths }}
			<p>RawPath = {{ .RawPath }}</p>
			<p>Path = {{ .Path }}</p>
		{{ else }}
			<a href="/link/{{ .TplEncoded }}">Template encoded link</a><br />
			<a href="/link/{{ .ManuallyEncoded }}">Manually encoded link</a>
			<br />
			<p>View this page's source to see the (lower/upper)case difference
			in the links</p>
		{{ end }}
	</body>
	</html>`))

	// Renders the root with good and bad links.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		s := "😋" // Unicode emoji.
		tpl.Execute(w, content{
			// html/template percent-encodes with lowercase hex digits.
			TplEncoded: s,

			// url.PathEscape percent-encodes with uppercase hex digits.
			ManuallyEncoded: template.URL(url.PathEscape(s)),
		})
	})

	// This handler produces inconsistent RawPath based on (upper/lower)case encoding in the URI.
	http.HandleFunc("/link/", func(w http.ResponseWriter, r *http.Request) {
		tpl.Execute(w, content{
			ShowPaths: true,
			RawPath:   r.URL.RawPath,
			Path:      r.URL.Path,
		})
	})

	fmt.Println("Go to http://127.0.0.1:8080")
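	// Report a failed listen (for example, port already in use) instead of exiting silently.
	if err := http.ListenAndServe(":8080", nil); err != nil {
		fmt.Println("server error:", err)
	}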
}

What did you expect to see?

url.PathEscape("😋") => %F0%9F%98%8B

/link/%F0%9F%98%8B (A, uppercase hex digits) and /link/%f0%9f%98%8b (B, lowercase hex digits) are equivalent as per RFC 3986, which treats the hex digits of a percent-encoded octet as case-insensitive. An http.HandlerFunc handling either URL is expected to show consistent behaviour.
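
To make the equivalence concrete, here is a minimal sketch that decodes both spellings with url.PathUnescape and compares the results. The helper name sameDecoded is only for illustration and is not part of the original report.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sameDecoded (illustrative helper) reports whether two percent-encoded
// path segments decode to the same string.
func sameDecoded(a, b string) (bool, error) {
	da, err := url.PathUnescape(a)
	if err != nil {
		return false, err
	}
	db, err := url.PathUnescape(b)
	if err != nil {
		return false, err
	}
	return da == db, nil
}

func main() {
	upper := url.PathEscape("😋")    // uppercase hex digits
	lower := strings.ToLower(upper) // same octets, lowercase hex digits

	same, err := sameDecoded(upper, lower)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(upper, lower, same) // prints: %F0%9F%98%8B %f0%9f%98%8b true
}
```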

What did you see instead?

An http handler that processes the equivalent URIs A and B behaves differently. A, which has uppercase hex digits, produces an empty http.Request.URL.RawPath, whereas B, which has lowercase hex digits, produces a non-empty http.Request.URL.RawPath that still holds the percent-encoded path as it was sent. http.Request.URL.Path is the same decoded string in both cases. This breaks Unicode URL handling in popular HTTP routers like chi and httprouter.
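
The difference can be reproduced without running a server. The sketch below assumes that url.ParseRequestURI parses a request target the same way the net/http server does; the helper name pathAndRawPath is hypothetical. It feeds in the uppercase and lowercase spellings of the same path and prints the resulting Path and RawPath.

```go
package main

import (
	"fmt"
	"net/url"
)

// pathAndRawPath (hypothetical helper) parses a request target and returns
// the decoded Path together with whatever RawPath the parser kept.
func pathAndRawPath(requestURI string) (path, rawPath string, err error) {
	u, err := url.ParseRequestURI(requestURI)
	if err != nil {
		return "", "", err
	}
	return u.Path, u.RawPath, nil
}

func main() {
	uris := []string{
		"/link/%F0%9F%98%8B", // uppercase hex digits, as produced by url.PathEscape
		"/link/%f0%9f%98%8b", // lowercase hex digits, as produced by html/template
	}
	for _, uri := range uris {
		p, rp, err := pathAndRawPath(uri)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s Path=%q RawPath=%q\n", uri, p, rp)
	}
}
```

RawPath is only populated when the incoming encoding differs from the encoding net/url would generate for Path on its own. Since that default encoding uses uppercase hex digits, the uppercase form yields an empty RawPath and the lowercase form does not.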

I discovered this inconsistency while using html/template, which percent-encodes Unicode strings in <a> href attributes with lowercase hex digits, whereas url.PathEscape produces uppercase hex digits.
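
One possible workaround, sketched here under assumptions rather than proposed as the fix, is to uppercase the hex digits of every percent-encoded triplet before comparing or routing on an escaped path. The helpers isHex, toUpperHex and upperHexEscapes are hypothetical names introduced for this example. The main function renders a link through html/template and checks whether it matches the url.PathEscape form before and after that normalization.

```go
package main

import (
	"fmt"
	"html/template"
	"net/url"
	"strings"
)

// isHex reports whether c is an ASCII hexadecimal digit.
func isHex(c byte) bool {
	return '0' <= c && c <= '9' || 'a' <= c && c <= 'f' || 'A' <= c && c <= 'F'
}

// toUpperHex uppercases a lowercase hex letter and returns other bytes unchanged.
func toUpperHex(c byte) byte {
	if 'a' <= c && c <= 'f' {
		return c - ('a' - 'A')
	}
	return c
}

// upperHexEscapes (hypothetical helper) uppercases the two hex digits of
// every well-formed %XX triplet and leaves all other bytes untouched.
func upperHexEscapes(s string) string {
	b := []byte(s)
	for i := 0; i+2 < len(b); i++ {
		if b[i] == '%' && isHex(b[i+1]) && isHex(b[i+2]) {
			b[i+1] = toUpperHex(b[i+1])
			b[i+2] = toUpperHex(b[i+2])
			i += 2 // skip past the digits just handled
		}
	}
	return string(b)
}

func main() {
	s := "😋"
	tpl := template.Must(template.New("link").Parse(`<a href="/link/{{ . }}">x</a>`))

	var sb strings.Builder
	if err := tpl.Execute(&sb, s); err != nil {
		fmt.Println("render error:", err)
		return
	}
	rendered := sb.String()
	want := "/link/" + url.PathEscape(s)

	fmt.Println("rendered:", rendered)
	fmt.Println("matches url.PathEscape as rendered:   ", strings.Contains(rendered, want))
	fmt.Println("matches url.PathEscape after uppercase:", strings.Contains(upperHexEscapes(rendered), want))
}
```

Only the case of the hex digits changes, so distinctions that RawPath exists to preserve, such as %2F versus a literal slash, are kept.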

Labels: NeedsInvestigation