Conversation

@janpieper
Contributor

@janpieper janpieper commented Oct 15, 2025

This PR adds a new option, address_type, which can be :ip (default) or :hostname and controls whether to connect to other nodes via their IP address or their hostname.

The resolver lookup function changed from 2-arity to 3-arity, because we now also need to pass the new address_type to it.

To avoid breaking backwards compatibility for users with custom resolver implementations (e.g. custom DNS servers, DB lookup), the implementation calls lookup/3 if it is available or if address_type == :hostname, and otherwise falls back to lookup/2.
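The fallback described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: the `LookupDispatch`, `LegacyResolver` and `NewResolver` modules are hypothetical, and the sketch assumes the resolver is passed around as a plain module.

```elixir
defmodule LookupDispatch do
  # Hypothetical helper (not the PR's code): chooses between the old and
  # new resolver callback based on what the resolver module exports.
  def lookup(resolver, query, type, address_type) do
    # function_exported?/3 does not load modules, so load it explicitly first.
    Code.ensure_loaded(resolver)

    cond do
      function_exported?(resolver, :lookup, 3) ->
        resolver.lookup(query, type, address_type)

      address_type == :hostname ->
        # Per the description, lookup/3 is called for :hostname even if it is
        # missing; a legacy resolver would raise UndefinedFunctionError here.
        resolver.lookup(query, type, address_type)

      true ->
        # Legacy resolvers keep receiving the 2-arity call.
        resolver.lookup(query, type)
    end
  end
end

defmodule LegacyResolver do
  # Stand-in for a custom resolver that only implements lookup/2.
  def lookup(_query, _type), do: [{192, 168, 100, 1}]
end

defmodule NewResolver do
  # Stand-in for a resolver that already implements lookup/3.
  def lookup(_query, _type, :ip), do: [{192, 168, 100, 101}]
  def lookup(_query, _type, :hostname), do: [~c"app-1.dev.acme.com"]
end
```

With these stand-ins, `LookupDispatch.lookup(LegacyResolver, "app.dev.acme.com", :a, :ip)` goes through lookup/2, while any call against `NewResolver` goes through lookup/3.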

Setup

$ dig A app.dev.acme.com +short
192.168.100.1

$ dig AAAA app.dev.acme.com +short
::ffff:192.168.100.1

$ dig SRV app.dev.acme.com +short
1 10 4000 app-1.dev.acme.com
1 10 4000 app-2.dev.acme.com
1 10 4000 app-3.dev.acme.com

$ dig A app-1.dev.acme.com +short
192.168.100.101

$ dig A app-2.dev.acme.com +short
192.168.100.102

$ dig A app-3.dev.acme.com +short
192.168.100.103

Resolving

iex> DNSCluster.Resolver.lookup("app.dev.acme.com", :a)
[{192, 168, 100, 1}]

iex> DNSCluster.Resolver.lookup("app.dev.acme.com", :aaaa)
[{0, 0, 0, 0, 0, 65535, 49320, 25601}]

iex> DNSCluster.Resolver.lookup("app.dev.acme.com", :srv)
[{192, 168, 100, 101}, {192, 168, 100, 102}, {192, 168, 100, 103}]

iex> DNSCluster.Resolver.lookup("app.dev.acme.com", {:srv, :ips})
[{192, 168, 100, 101}, {192, 168, 100, 102}, {192, 168, 100, 103}]

iex> DNSCluster.Resolver.lookup("app.dev.acme.com", {:srv, :hostnames})
[~c"app-1.dev.acme.com", ~c"app-2.dev.acme.com", ~c"app-3.dev.acme.com"]

TODOs

  • Add tests
  • Update documentation

Fixes #14

We'll support connecting to hostnames soon too
@josevalim
Member

Thank you for the PR! Given that address_type :hostname doesn't make sense for A/AAAA, perhaps we deprecate :srv and have :srv_ips and :srv_hostnames instead? This way we keep compatibility with the resolver API.

@janpieper janpieper force-pushed the allow-connecting-to-hostnames-from-srv branch from 05b80c5 to d7ed718 Compare October 16, 2025 07:12
@janpieper
Contributor Author

@josevalim I've updated the implementation and added :srv_ips and :srv_hostnames. The :srv resource type is still available, but marked as deprecated in the "Options" section. We could also log a warning or similar to tell users to use :srv_ips instead.

One could also think of keeping :srv as is and only add :srv_hostnames. This way we would not need to deprecate anything.

@josevalim
Member

Hrm... sorry for the back and forth. What if we do {:srv, :ips} and {:srv, :hostnames} and we keep :srv as {:srv, :ips} then, with no deprecations?

@janpieper
Contributor Author

> sorry for the back and forth.

All good! 😉 It wasn't my plan to come up with THE implementation. That's why the PR is still marked as a draft, so we can figure out the best way to go.

> What if we do {:srv, :ips} and {:srv, :hostnames} and we keep :srv as {:srv, :ips} then, with no deprecations?

Isn't this the same as what is pushed currently, just with {:srv, :ips} and {:srv, :hostnames} instead of :srv_ips and :srv_hostnames? 🤔 Currently :srv is internally mapped to :srv_ips. I only "marked" the :srv type as deprecated in the resource_type description. There's no warning or similar.

I am fine with using tuples instead of atoms. In the end, both ways are the same.

@josevalim
Member

Yes, they are the same, but I think it is more elegant to say :srv is a shortcut for {:srv, :ips} than a completely different atom. It also makes it clear they are both using the same mechanism (srv) but in different ways.

@janpieper janpieper force-pushed the allow-connecting-to-hostnames-from-srv branch from d7ed718 to 31af08a Compare October 16, 2025 08:47
@janpieper
Contributor Author

Pushed an update for using tuples instead of the atoms.

def lookup(query, {:srv, :hostnames}), do: lookup_by_name(query, :srv)
def lookup(query, {:srv, :ips}), do: lookup(query, :srv)

def lookup(query, :srv) do
Member

Let's normalize :srv into {:srv, :ips} during initialization, so dns_cluster/resolver.ex doesn't have to worry about all formats. I guess that's a backwards incompatible change, but they need to update anyway to support hostnames (and we are before v1.0)? Then we can ship it!

Contributor Author

@janpieper janpieper Oct 16, 2025

This way you would break custom resolver implementations as they would then also receive {:srv, :ips} instead of the expected :srv 🤔

Okay, I only read half of your answer before adding this comment 😅
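For illustration, normalizing :srv into {:srv, :ips} once during initialization might look something like the sketch below. The `ResourceType` module and its function names are hypothetical and do not come from this PR; the list of accepted types is assumed from the examples in the description.

```elixir
defmodule ResourceType do
  # Hypothetical helper: canonicalizes resource types once, so the
  # resolver only ever sees the tuple forms for SRV lookups.
  @canonical [:a, :aaaa, {:srv, :ips}, {:srv, :hostnames}]

  # :srv is kept as a shortcut for {:srv, :ips}.
  def normalize(:srv), do: {:srv, :ips}

  # Already-canonical types pass through unchanged.
  def normalize(type) when type in @canonical, do: type

  def normalize(other) do
    raise ArgumentError, "unsupported resource type: #{inspect(other)}"
  end

  # Convenience for a list of types, dropping duplicates after normalization.
  def normalize_all(types) when is_list(types) do
    types |> Enum.map(&normalize/1) |> Enum.uniq()
  end
end
```

As noted in the thread, a custom resolver that pattern matches on :srv would then receive {:srv, :ips} instead, which is the backwards-incompatible part.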

@janpieper janpieper force-pushed the allow-connecting-to-hostnames-from-srv branch from 31af08a to 7bcfbd7 Compare October 16, 2025 10:39
@josevalim
Member

Looks good to me!

@janpieper
Contributor Author

@josevalim Would it be okay to add a mocking library (mock, mox, mimic, ...) to mock the calls to :inet_res? This would make things easier, especially because the resolver is currently untested, as it is fully replaced in the tests 🤔 If so, is there a library you prefer?

@josevalim
Member

mimic is fine for unit testing!

@josevalim
Member

Another option is to trim down the resolver interface to resolve_dns and resolve_srv, so we can push as much of the logic as possible outside of the mocked environment. Then the resolver is a thin wrapper around inet_res (which is the thing you would mock anyway).

What do you think? In fact, I believe I would prefer this approach.
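A rough sketch of what such a trimmed-down interface could look like is shown below. Only the callback names resolve_dns and resolve_srv come from the comment above; their arities and return shapes, the module names (`ThinResolver`, `InetResResolver`, `FakeResolver`, `Discovery`), and the choice to resolve SRV targets via A records only are all assumptions made for illustration.

```elixir
defmodule ThinResolver do
  # Hypothetical behaviour: arities and return types are assumptions.
  @callback resolve_dns(name :: charlist(), type :: :a | :aaaa) :: [:inet.ip_address()]
  @callback resolve_srv(name :: charlist()) :: [charlist()]
end

defmodule InetResResolver do
  # Thin wrapper around :inet_res with no decision logic of its own.
  @behaviour ThinResolver

  @impl true
  def resolve_dns(name, type) when type in [:a, :aaaa] do
    :inet_res.lookup(name, :in, type)
  end

  @impl true
  def resolve_srv(name) do
    # SRV answers are {priority, weight, port, target} tuples.
    for {_prio, _weight, _port, target} <- :inet_res.lookup(name, :in, :srv), do: target
  end
end

defmodule FakeResolver do
  # Test double using the records from the Setup section; no mocking library.
  @behaviour ThinResolver

  @impl true
  def resolve_dns(~c"app-1.dev.acme.com", :a), do: [{192, 168, 100, 101}]
  def resolve_dns(~c"app-2.dev.acme.com", :a), do: [{192, 168, 100, 102}]
  def resolve_dns(_name, _type), do: []

  @impl true
  def resolve_srv(~c"app.dev.acme.com"), do: [~c"app-1.dev.acme.com", ~c"app-2.dev.acme.com"]
  def resolve_srv(_name), do: []
end

defmodule Discovery do
  # The lookup logic lives outside the resolver, so it can be tested
  # against any module that implements the two callbacks.
  def lookup(resolver, name, {:srv, :hostnames}), do: resolver.resolve_srv(name)

  def lookup(resolver, name, {:srv, :ips}) do
    # Assumption: SRV targets are resolved through A records only.
    name |> resolver.resolve_srv() |> Enum.flat_map(&resolver.resolve_dns(&1, :a))
  end

  def lookup(resolver, name, type) when type in [:a, :aaaa] do
    resolver.resolve_dns(name, type)
  end
end
```

With this split, `Discovery.lookup(FakeResolver, ~c"app.dev.acme.com", {:srv, :ips})` exercises the SRV-to-IP logic without touching the network, and only `InetResResolver` would need mocking or integration testing.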

@janpieper
Contributor Author

@josevalim I somehow have the feeling that this is becoming more and more of a refactoring 😅

Maybe we first make it work (with mocking) and rework it afterwards. WDYT?

@josevalim
Member

@janpieper or maybe we refactor first and then add the feature, given the feature itself is straightforward?

Otherwise it feels backwards to add the mocking library simply to remove it after? 🤔

Development

Successfully merging this pull request may close these issues.

DNS discovery fails when RELEASE_NODE is FQDN

2 participants