AddressTranslater does not use listen_address when rpc_address is 0.0.0.0

Description

Hello, I'm having trouble connecting to nodes in an EC2 environment using the Ec2MultiRegionSnitch, where you must set broadcast_address=public_ip and listen_address=private_ip (rpc_address=0.0.0.0). The problem is that the driver machine tries to connect to the Cassandra nodes via the public IP, but it only has access to these nodes via the private IP (due to Security Group restrictions). In fact, Amazon charges extra if you use the public IP within the same region.

So, similar to the Ec2MultiRegionSnitch behavior, where the local IP is preferred, the driver should also use the private_ip within the local datacenter when it is available.

The AddressTranslater documentation says that:

Note that if the rpc_address of a node has been configured to 0.0.0.0 server side, then the provided address will be the node listen_address, not 0.0.0.0.

However, this behavior is not what's implemented as can be seen in the code snippet below (from core.ControlConnection.java):

ControlConnection.java

So, when the rpc_address is 0.0.0.0, ControlConnection must use the "preferred_ip" (listen_address) field from the peers table instead of the "peer" (broadcast_address) field.
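To make the proposed rule concrete, here is a minimal sketch of the selection logic described above. It is not the driver's code: the class `PeerAddressSelector` and its methods are hypothetical names introduced for illustration, and the three arguments are assumed to come from the `peer`, `rpc_address` and `preferred_ip` columns of a peers-table row.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper (not part of the driver): illustrates the rule proposed
// in this ticket for choosing which address to connect to for a peer row.
final class PeerAddressSelector {

    private PeerAddressSelector() {}

    /** Parses an IP literal; no DNS lookup is performed for literals. */
    static InetAddress ip(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not an IP literal: " + literal, e);
        }
    }

    /**
     * Returns rpcAddress unless it is null or the wildcard address (0.0.0.0).
     * In that case, prefers preferredIp (listen_address) and falls back to
     * peer (broadcast_address) only when preferredIp is null.
     */
    static InetAddress select(InetAddress peer, InetAddress rpcAddress, InetAddress preferredIp) {
        if (rpcAddress != null && !rpcAddress.isAnyLocalAddress()) {
            return rpcAddress;
        }
        return preferredIp != null ? preferredIp : peer;
    }

    public static void main(String[] args) {
        InetAddress peer = ip("54.144.248.34");  // public IP (example value)
        InetAddress rpc = ip("0.0.0.0");         // rpc_address bound to all interfaces
        InetAddress preferred = ip("10.0.0.5");  // private IP (made-up example value)
        System.out.println(select(peer, rpc, preferred).getHostAddress());
    }
}
```

The fallback to `peer` when `preferred_ip` is missing is an assumption of this sketch; a real implementation might skip the row instead.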

Environment

None

Pull Requests

None

Activity

Olivier Michallat
November 20, 2014, 2:26 PM

Note that if the rpc_address of a node has been configured to 0.0.0.0 server side, then the provided address will be the node listen_address, not 0.0.0.0.

This refers to the server-side configuration (i.e. cassandra.yaml). The driver reads from system.peers, where that resolution is already done: if rpc_address was set to 0.0.0.0 in cassandra.yaml, system.peers.rpc_address will contain the value of listen_address.

That being said, not using private IPs for intra-region communications is indeed an issue. The Ruby driver handles this with reverse DNS lookups; we need to provide the same thing as a new AddressTranslater implementation.
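As a rough illustration of the reverse DNS idea, the sketch below resolves a public IP back to its hostname and then resolves that hostname forward again. It assumes that, from inside the same EC2 region, the public DNS name of an instance resolves to its private IP. The class `ReverseDnsTranslatorSketch` and its members are hypothetical and do not reflect the driver's actual implementation; the two lookups are injected as functions so the logic can be exercised without network access.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.Function;

// Hypothetical sketch (not the driver's implementation) of address translation
// via a reverse DNS lookup followed by a forward lookup.
final class ReverseDnsTranslatorSketch {

    private final Function<InetAddress, String> reverseLookup;  // IP -> hostname
    private final Function<String, InetAddress> forwardLookup;  // hostname -> IP

    ReverseDnsTranslatorSketch(Function<InetAddress, String> reverseLookup,
                               Function<String, InetAddress> forwardLookup) {
        this.reverseLookup = reverseLookup;
        this.forwardLookup = forwardLookup;
    }

    /** Builds a translator backed by the JDK's default resolver. */
    static ReverseDnsTranslatorSketch usingSystemResolver() {
        return new ReverseDnsTranslatorSketch(
                InetAddress::getCanonicalHostName,
                ReverseDnsTranslatorSketch::ip);
    }

    /** Resolves a hostname or IP literal, wrapping the checked exception. */
    static InetAddress ip(String host) {
        try {
            return InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve: " + host, e);
        }
    }

    /** Returns the translated address, or the input unchanged if any lookup fails. */
    InetAddress translate(InetAddress address) {
        try {
            String hostname = reverseLookup.apply(address);
            if (hostname == null) {
                return address;
            }
            InetAddress resolved = forwardLookup.apply(hostname);
            return resolved != null ? resolved : address;
        } catch (RuntimeException e) {
            // A DNS failure should not make the node unreachable: keep the original.
            return address;
        }
    }

    public static void main(String[] args) {
        // Stub lookups standing in for DNS; 10.0.0.5 is a made-up private IP.
        ReverseDnsTranslatorSketch translator = new ReverseDnsTranslatorSketch(
                addr -> "ec2-54-144-248-34.compute-1.amazonaws.com",
                host -> ip("10.0.0.5"));
        System.out.println(translator.translate(ip("54.144.248.34")).getHostAddress());
    }
}
```

Falling back to the untranslated address on failure means nodes in remote regions, or nodes whose lookups fail, are still contacted via their public IP.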

Olivier Michallat
November 25, 2014, 1:32 PM

There is an issue with the DNS lookup; it is also discussed in CASSANDRA-7431.

Olivier Michallat
December 16, 2014, 2:17 PM

Rescheduled to 2.0.10 due to urgent 2.0.9 bugfix release. The fix is available at https://github.com/datastax/java-driver/commit/832bc0f.

Olivier Michallat
January 12, 2015, 4:29 PM

I've tested the reverse DNS lookup code manually on EC2. I'm not sure if/how we can set up an automated integration test for this.

Andy Tolbert
January 13, 2015, 6:12 AM

Validated in a multi-region EC2 environment that the EC2MultiRegionAddressTranslater properly translates public IP addresses in the local datacenter to private ones, and that nodes in remote DCs continue to be reached via their public IPs.

Using contact point "ec2-54-144-248-34.compute-1.amazonaws.com" running from an instance in us-east with EC2MultiRegionAddressTranslater, the resolved hosts are:

Using contact point "ec2-54-144-248-34.compute-1.amazonaws.com" running from an instance in us-east without a translater, the resolved hosts are:

Ran a remote debug session to ensure the translater was behaving as expected:

Local Node (shows private ip used)

Remote Node (shows public ip used)

Fixed

Assignee

Andy Tolbert

Reporter

Paulo Motta

Labels

None

PM Priority

None

Reproduced in

None

Affects versions

None

Fix versions

Pull Request

None

Doc Impact

None

Size

None

External issue ID

None

Priority

Major