Skip to content

Conversation

@filips123
Copy link
Contributor

@filips123 filips123 commented Oct 1, 2019

This adds support for DNS over HTTPS. It adds additional dns.query.doh() method which uses urllib to get DNS response from DoH resolver. It is used in dns.resolver.Resolver when a DNS nameserver starts with https:// (so it is DoH resolver).

@rthalley Can you check this? Also, when do you plan to release stable 2.0?

@jacobgb24
Copy link

It would probably be useful to add support for both GET and POST methods.

@filips123
Copy link
Contributor Author

@jacobgb24 Ok, I will see what can I do.

But do you have any reasons why both GET and POST should be supported? Do some resolvers only support some of them or there are some differences? Also, how do you think this should be implemented? MAybe first trying POST and falling back to GET?

@jacobgb24
Copy link

jacobgb24 commented Oct 2, 2019

Resolvers must support both per the RFC. I've also never seen an instance of a a resolver supporting one and not the other.

The main reason I'd want support for both would be for testing purposes. In that case, it'd be useful to be able to specify the method as a parameter. Not sure how well this fits into the rest of dnspython though, so it may not make sense.

One last thing is that the RFC lists a benefit of GET over POST:

Using the GET method is friendlier to many HTTP cache
implementations.

@filips123
Copy link
Contributor Author

The main reason I'd want support for both would be for testing purposes. In that case, it'd be useful to be able to specify the method as a parameter. Not sure how well this fits into the rest of dnspython though, so it may not make sense.

So, it could be just added as an additional argument to doh. But I'm worrying here that this could almost double the current size of method.

One last thing is that the RFC lists a benefit of GET over POST:

Yes, but it also says:

POSTed requests are generally smaller than their GET equivalents.

Also, does Python's urllib actually contains that HTTP cache implementations? It would be better to just implement cache directly into DNSPython.

@jacobgb24
Copy link

There are definitely trade-offs to both POST and GET, but this possibly gives more reason to support both.

In terms of caching, it's not limited to the client. There could be caches at any point between the stub and resolver. So regardless of urllib or dnspython support, using GET would be friendlier to those intermediate caches. (See section 5.1)

Overall, I'd say POST has more benefits and is probably sufficient. It would be nice to support both though.

@filips123
Copy link
Contributor Author

filips123 commented Oct 4, 2019

@jacobgb24 I tried to also implement GET support. This is new doh function:

def doh(query, nameserver, post=True):
    """Return the response obtained after sending a query via DoH.

    *query*, a ``dns.message.Message`` containing the query to send

    *nameserver*, a ``str`` containing the nameserver URL.

    *post*, a ``bool`` indicating whether POST method should be used.

    Returns a ``dns.message.Message``.
    """

    wirequery = query.to_wire()
    headers = {
        'Accept': 'application/dns-message',
        'Content-Type': 'application/dns-message',
    }

    if post:
        request = urllib.request.Request(nameserver, data=wirequery, headers=headers)
    else:
        request = urllib.request.Request(nameserver + '?dns=' + wirequery.hex(), headers=headers)

    response = urllib.request.urlopen(request).read()
    return dns.message.from_wire(response)

However, it does not work. It appears that WireQuery is invalid so DNS resolver rejects it. The same query works over POST. This is example URL with invalid WireQuery:

https://doh-de.blahdns.com/dns-query?dns=8d1401000001000000000000027079097a65726f6672616d65036f73730000100001

Which returns:

{
  "Status": 2,
  "Comment": "DNS packet parse failure (dns: buffer size too small)"
}

Because of that, query will not finish. Also, note that I changed post argument to False while testing.

@jacobgb24
Copy link

For GET requests you have to use urlsafe_b64encode() on the wire message.

Here's an example I've used before to give you an idea.

import dns.message
import base64
import requests


def create_query(url, record_type="A"):
    """
    Creates a DNS query in wire format encoded in base64 for use in GET method
    :param url: the url to create a query for e.g. example.com
    :param record_type: the desired record type in string format e.g. AAAA
    :return: the dns message as a b64 string
    """
    message = dns.message.make_query(url, dns.rdatatype.from_text(record_type)).to_wire()
    return base64.urlsafe_b64encode(message).decode('utf-8').strip("=")


def decode_b64_answer(data):
    """
    Decodes a base64 response into wire format
    :param data: the base64 response
    :return: a dns wire message
    """
    message = dns.message.from_wire(data)
    return message


def get_wire(resolver_url, query_name):
    """
    Official RFC method. Send a get request to resolver/dns-query with param dns={base64 encoded dns wire query}
    :param resolver_url: The resolver to query e.g. 1.1.1.1
    :param query_name: The query url e.g. example.com
    :return: a dns.message object received from the resolver
    """
    headers = {"accept": "application/dns-message"}
    payload = {"dns": create_query(query_name)}
    url = "https://{}/dns-query".format(resolver_url)
    try:
        res = requests.get(url, params=payload, headers=headers, stream=True, timeout=10)
        return [a.to_text() for a in decode_b64_answer(res.content).answer]
    except Exception as e:
        return None


print(get_wire("doh-de.blahdns.com", "example.com"))
print(get_wire("cloudflare-dns.com", "github.com"))
print(get_wire("8.8.8.8", "google.com"))

Output is:

['example.com. 50676 IN A 93.184.216.34']
['github.com. 45 IN A 140.82.113.4']
['google.com. 251 IN A 172.217.11.174']

@filips123
Copy link
Contributor Author

@jacobgb24 I added support for GET. It can be enabled with the argument post of doh method.

@filips123
Copy link
Contributor Author

@rthalley Can you merge this?

@rthalley
Copy link
Owner

I will try to look at this one this weekend.

@filips123
Copy link
Contributor Author

filips123 commented Oct 26, 2019

@rthalley Can you now check this?

To help you, here is a description how it works:

  • When making a query, the program first checks if the nameserver is using https scheme that indicated DoH.

  • If so, it uses doh method to get DNS response from DoH resolver using urllib.request. It uses standard wirequery protocol and POST method.

  • The results are then returned and processed just like normal DNS responses.

@rthalley
Copy link
Owner

I have reviewed this code, and have some feedback:

  1. The timeout, one_rr_per_rrset, and ignore_trailing parameters of the udp(), tcp(), and tls() methods are not provided but should be. Likewise the source_port and af parameters of the tcp() and tls() methods are not provided but should be. The 'nameserver' parameter should probably be called 'url', as that's what it is.

  2. The method should be called https() not doh().

  3. urllib is low-level and doesn't do HTTPS 2.0. Should we have better abstraction of the https engine so we could support Requests or (for 2.0) hyper?

  4. should we be rejecting URLs that aren't https urls?

Not sure if we should go for another revision of this patch, or merge this one and then modify it. I also don't know how to test this at the moment. I'll have to see what open source servers do DoH and how to configure them.

@filips123
Copy link
Contributor Author

@rthalley Thanks for your reply! I have some questions about your feedback:

The timeout, one_rr_per_rrset, and ignore_trailing parameters of the udp(), tcp(), and tls() methods are not provided but should be. Likewise the source_port and af parameters of the tcp() and tls() methods are not provided but should be.

How should those parameters be implemented in my method?

The method should be called https() not doh().

Ok, I will rename it.

urllib is low-level and doesn't do HTTPS 2.0. Should we have better abstraction of the https engine so we could support Requests or (for 2.0) hyper?

I think that this should be another PR as this PR already works with HTTPS 1.

should we be rejecting URLs that aren't https urls?

Ok, I can do this.

@rthalley
Copy link
Owner

For the parameters:

For one_rr_per_rrset and ignore_trailing, look at how dns.query.receive_udp() calls dns.message.from_wire(). Note you should also be passing keyring=query.keyring, and request_mac=query.request_mac (dns.query.receive_udp() gets the parameters from its caller, but its caller, dns.query.udp(), but you can see the caller getting them out of the query.

For timeout, you'd want to tell urllib to timeout if requested: timeout, a float or None, the number of seconds to wait before the query times out. If None, the default, wait forever. I haven't looked to see if urllib can do this our how to request it, but I'd be surprised if it couldn't. If it can't, it maybe argues for using something that can.

The source_port and af parameters would be as in the tcp() and tls() commands, and would need to be translated into something relevant for urllib (if it can do it). The source_port parameter is specifying the source port to use for the TCP connection, and the af parameter would be used to say if you wanted only AF_INET or AF_INET6, or if you didn't care. I'm willing to defer these last two to a future time if they're awkward for urllib.

I'm OK with handling HTTP 2.0 in another PR.

@rthalley
Copy link
Owner

Just a note on 'af'. We have it for udp() and tcp() for historical reasons, as dnspython didn't used to be clever enough to figure it out. Now it does. The main useful thing I want is the ability to do a v4 connection or a v6 connection, as this is useful when testing nameservers. I suppose you could also argue that you should just have a name with only v4 addresses, and another name with only v6 addresses.

But I wouldn't work too hard on this one, as it's not that crucial.

@kimbo
Copy link
Contributor

kimbo commented Oct 29, 2019

One thought about this is that some public DNS resolvers that support DoH (e.g. Google, Cloudflare) use the same anycast IP for queries over udp, tcp, tls, and https. So instead of worrying about the https://myserver.com/dns-query, you could (by default) just pass dns.query.https() an IP address (much like udp(), tcp(), and tls()) and construct a URL from there.

For example, if you pass in 1.1.1.1 as the IP, it would send the query to https://1.1.1.1/dns-query.

By default it would use the path /dns-query, but you could add a parameter (named path or something) for different URI templates.

The issue I see with this is you may not know the IP off the top of your head - you may just know the hostname. So maybe it's not the best approach, but it could be nice for some uses.

@filips123
Copy link
Contributor Author

@rthalley I added most of the missing parameters. I didn't add source_port and af parameters as this would be a lot harder to do with urllib but can be added later if needed. Support for HTTPS 2.0 can also be added in another PR.

@kimbo Although most DoH resolvers use this URL format and same IP, not all of them do this. It would be nice to do this, but it can also be in another PR. It would also be good to support DNS Stamps at some later point (PyPI package for them).

@bwelling
Copy link
Collaborator

@kimbo - presumably the https() method could also support a server_hostname parameter, like the tls() method, and use that when constructing the url.

@filips123
Copy link
Contributor Author

@bwelling But how would you handle this from dns.resolver.Resolver?

@bwelling
Copy link
Collaborator

@filips123 - I think that adding support to the resolver is a separate issue from adding the underlying DoH support, and should be treated as such.

@filips123
Copy link
Contributor Author

@bwelling Yes but my changes already add support to the resolver. Also, I think that managing different protocols (TCP/UDP/TLS/DoH) could be better acived with something like DNS Stamps which should also be considered in the future.

@filips123
Copy link
Contributor Author

@bwelling @rthalley Any update?

I think that support for better managing different protocols, like DNS Stamps, should be achived later in the future and shouldn't be a blocker for this PR.

@rthalley rthalley merged commit a3193c8 into rthalley:master Dec 13, 2019
@kimbo kimbo mentioned this pull request Dec 23, 2019
nrhall pushed a commit to nrhall/dnspython that referenced this pull request Jun 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants