DNS Search Path Optimization

DNS Search Path

Using fully qualified names has some downsides: for example, it can become harder to move deployments from one landscape to the next. It is far easier and simpler to rely on short/local names, whose meaning may differ depending on the context in which they are used.

The DNS search path allows the usage of short/local names. It is an ordered list of DNS suffixes to append to short/local names to create a fully qualified name.

When a short/local name is to be resolved, each entry of the DNS search path is appended to it in turn to check whether the resulting name can be resolved. The process stops when either a name could be resolved or the DNS search path ends. As the last step after trying the search path, resolution of the short/local name is attempted on its own.
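The order of attempts described above can be sketched in a few lines of Python. This is only an illustration of the ordering, not a resolver: the function name candidate_names and the example suffixes are hypothetical and not taken from any real configuration.

```python
def candidate_names(name, search_path):
    """Return the names tried for a short/local name, in order."""
    # Each search path entry is appended to the short/local name in turn ...
    candidates = [f"{name}.{suffix}" for suffix in search_path]
    # ... and the name on its own is the last attempt.
    candidates.append(name)
    return candidates


# Hypothetical search path, for illustration only.
search_path = ["team-a.example.internal", "example.internal"]
for candidate in candidate_names("my-service", search_path):
    print(candidate)
```

A real resolver stops at the first candidate that resolves, so the full list is only traversed when all earlier candidates fail.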

DNS Option ndots

As explained in the section above, the DNS search path is used for short/local names to create fully qualified names. The DNS option ndots specifies how many dots (.) a name needs to have to be considered fully qualified. For names with fewer than ndots dots (.), the DNS search path will be applied.
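The ndots check can be sketched as a simple dot count. The helper name uses_search_path is hypothetical, and the model is simplified: it follows the description above and additionally treats a name with a trailing dot as absolute. Real resolvers may still fall back to the search path when a name with enough dots fails to resolve as-is.

```python
def uses_search_path(name, ndots):
    """Return True if the search path is applied before the name is tried as-is."""
    # A trailing dot marks the name as absolute; the search path is skipped.
    if name.endswith("."):
        return False
    # Names with fewer than ndots dots are not considered fully qualified.
    return name.count(".") < ndots


for name, ndots in [("my-service", 1), ("www.example.com", 1),
                    ("www.example.com", 5), ("www.example.com.", 5)]:
    print(name, ndots, uses_search_path(name, ndots))
```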

DNS Search Path, ndots and Kubernetes

Kubernetes tries to make name resolution easy and convenient for developers. It provides several ways to address a service, most notably by its name directly, with the namespace as suffix, with <namespace>.svc as suffix, or by the fully qualified name <service>.<namespace>.svc.cluster.local (assuming cluster.local to be the cluster domain).

This is why the DNS search path is fairly long in Kubernetes, usually consisting of <namespace>.svc.cluster.local, svc.cluster.local, cluster.local and potentially some additional entries coming from the local network of the cluster. For various reasons, the default ndots value in Kubernetes, 5, is also fairly large. See this comment for a more detailed description.

DNS Search Path/ndots Problem in Kubernetes

As the DNS search path is long and ndots is large, a lot of DNS queries might traverse the DNS search path. This results in an explosion of DNS requests.

For example, consider the name resolution of the default Kubernetes service kubernetes.default.svc.cluster.local. As this name has only four dots, it is not considered a fully qualified name according to the default ndots=5 setting. Therefore, the DNS search path is applied, resulting in the following queries being created:

  • kubernetes.default.svc.cluster.local.some-namespace.svc.cluster.local
  • kubernetes.default.svc.cluster.local.svc.cluster.local
  • kubernetes.default.svc.cluster.local.cluster.local
  • kubernetes.default.svc.cluster.local.network-domain

In IPv4/IPv6 dual-stack systems, the number of DNS requests may even double, as each name is resolved for both IPv4 and IPv6.
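To get a feeling for the numbers, the following sketch counts the queries for the example above. It is a simplified worst-case model, assuming that none of the search path candidates resolves, so that every entry is tried before the name itself. The function name dns_queries is hypothetical; the search path entries are the placeholders used in the list above.

```python
def dns_queries(name, search_path, ndots=5, dual_stack=False):
    """Return (name, record type) pairs queried in the worst case."""
    # In dual-stack systems each name is queried for IPv4 (A) and IPv6 (AAAA).
    record_types = ["A", "AAAA"] if dual_stack else ["A"]
    if name.endswith(".") or name.count(".") >= ndots:
        # Considered fully qualified: no search path traversal.
        names = [name.rstrip(".")]
    else:
        # Search path entries first, then the name on its own.
        names = [f"{name}.{suffix}" for suffix in search_path] + [name]
    return [(n, t) for n in names for t in record_types]


search_path = ["some-namespace.svc.cluster.local", "svc.cluster.local",
               "cluster.local", "network-domain"]
name = "kubernetes.default.svc.cluster.local"

print(len(dns_queries(name, search_path)))                         # 5
print(len(dns_queries(name, search_path, dual_stack=True)))        # 10
print(len(dns_queries(name + ".", search_path, dual_stack=True)))  # 2
```

In this model, a single lookup of a seemingly fully qualified name turns into ten queries on a dual-stack system, while the same name with a trailing dot needs only two.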

General Workarounds/Mitigations

Kubernetes provides the capability to set the DNS options for each pod (see Pod DNS config for details). However, this has to be applied to every pod (doing name resolution) to resolve the problem. A mutating webhook may be useful in this regard. Unfortunately, the DNS requirements may differ depending on the workload. Therefore, a general solution may be difficult or even impossible.

Another approach is to always use fully qualified names and append a dot (.) to the name to prevent the name resolution system from using the DNS search path. This might be somewhat counterintuitive as most developers are not used to the trailing dot (.). Furthermore, it makes moving to different landscapes more difficult/error-prone.

Gardener-specific Workarounds/Mitigations

Gardener allows users to customize their DNS configuration. CoreDNS allows several approaches to deal with the requests generated by the DNS search path. Caching is possible as well as query rewriting. There are also several other plugins available, which may mitigate the situation.

Gardener DNS Query Rewriting

As explained above, the application of the DNS search path may lead to the undesired creation of DNS requests. Especially with the default setting of ndots=5, seemingly fully qualified names pointing to services in the cluster may trigger the DNS search path application.

Gardener can automatically rewrite some obviously incorrect DNS names, which stem from the application of the DNS search path, to the most likely desired name. The feature can be enabled by setting the Gardenlet feature gate CoreDNSQueryRewriting to true:

featureGates:
  CoreDNSQueryRewriting: true

If the feature is enabled in the Gardenlet, it can be disabled per shoot cluster by setting the annotation alpha.featuregates.shoot.gardener.cloud/core-dns-rewriting-disabled to any value.

When enabled, the feature automatically rewrites requests like service.namespace.svc.cluster.local.other-namespace.svc.cluster.local to service.namespace.svc.cluster.local. The same holds true for service.namespace.svc.other-namespace.svc.cluster.local, which will also be rewritten to service.namespace.svc.cluster.local.
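The effect of these two rewrites can be modelled with a regular expression. The following Python sketch is not the CoreDNS configuration that Gardener generates; it only mimics the behaviour described above, assuming cluster.local as the cluster domain. The function name rewrite_query is hypothetical.

```python
import re

# Matches <service>.<namespace>.svc[.cluster.local].<other-namespace>.svc.cluster.local
# (assumption: the cluster domain is cluster.local).
_SEARCH_PATH_ARTIFACT = re.compile(
    r"^(?P<service>[^.]+)\.(?P<namespace>[^.]+)\.svc"
    r"(?:\.cluster\.local)?"
    r"\.[^.]+\.svc\.cluster\.local$"
)


def rewrite_query(name):
    """Return the most likely desired name for a search path artifact."""
    match = _SEARCH_PATH_ARTIFACT.match(name)
    if match is None:
        return name  # not an obvious artifact, leave the query untouched
    return f"{match['service']}.{match['namespace']}.svc.cluster.local"


print(rewrite_query("service.namespace.svc.cluster.local.other-namespace.svc.cluster.local"))
print(rewrite_query("service.namespace.svc.other-namespace.svc.cluster.local"))
print(rewrite_query("service.namespace.svc.cluster.local"))
```

All three calls print service.namespace.svc.cluster.local; the last name is already correct and passes through unchanged.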

If applications also resolve names of services that are outside of the cluster and have fewer than ndots dots, it might be helpful to prevent search path application for them as well. One way to achieve this is to add them to the commonSuffixes:

...
spec:
  ...
  systemComponents:
    coreDNS:
      rewriting:
        commonSuffixes:
        - gardener.cloud
        - github.com
...

DNS requests containing a common suffix and ending in <namespace>.svc.cluster.local are assumed to be an incorrect application of the DNS search path. Therefore, they are rewritten to the part of the name up to and including the common suffix. For example, www.gardener.cloud.namespace.svc.cluster.local would be rewritten to www.gardener.cloud.

Please note that the common suffixes should be long enough and include enough dots (.) to prevent random overlap with other DNS queries. For example, it would be a bad idea to simply put com on the list of common suffixes, as there may be services/namespaces that have com as part of their name. The effect would be seemingly random DNS requests. Gardener enforces at least two dots (.) in the common suffixes.
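Both the common suffix rewriting and the overlap problem can be illustrated with a small Python model. Again, this is only a sketch of the behaviour described above, not the rewrite rules Gardener actually generates. It assumes cluster.local as the cluster domain; the function name rewrite_common_suffix and the service com in namespace shop are hypothetical.

```python
SEARCH_TAIL = ["svc", "cluster", "local"]  # assumption: cluster domain is cluster.local


def rewrite_common_suffix(name, common_suffixes):
    """Strip <namespace>.svc.cluster.local if the rest ends in a common suffix."""
    labels = name.split(".")
    # Only names ending in <namespace>.svc.cluster.local are candidates.
    if len(labels) < 5 or labels[-3:] != SEARCH_TAIL:
        return name
    head = labels[:-4]  # everything before <namespace>.svc.cluster.local
    for suffix in common_suffixes:
        suffix_labels = suffix.split(".")
        if head[-len(suffix_labels):] == suffix_labels:
            return ".".join(head)
    return name


suffixes = ["gardener.cloud", "github.com"]
print(rewrite_common_suffix("www.gardener.cloud.namespace.svc.cluster.local", suffixes))

# A suffix that is too short overlaps with legitimate cluster names:
# the (hypothetical) service "com" in namespace "shop" is rewritten to "com".
print(rewrite_common_suffix("com.shop.svc.cluster.local", ["com"]))
```

The first call prints www.gardener.cloud, as intended. The second call prints com, which shows how a short suffix turns a valid in-cluster name into an unrelated query.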