Client doesn't use preferred_ip when connecting to nodes

Description

When using Cassandra 2.1.0 in an EC2 VPC environment where each node has both public and private IPs, the DataStax node.js driver always tries to connect using the public IP.

Example error:
Cassandra error { innerErrors:
{ '54.209.8.xx': 'Host considered as DOWN', '54.86.79.xx': 'Host considered as DOWN', '54.85.254.xx': 'Host considered as DOWN', '54.86.216.xx': 'Host considered as DOWN' },
message: 'All host(s) tried for query failed. First host tried, 54.209.8.xx: Host considered as DOWN. See innerErrors.',
query: 'INSERT INTO ...' }
cassandra error { innerErrors:
{ '54.209.8.xx': 'Host considered as DOWN',
'54.86.79.xx': 'Host considered as DOWN',
'54.85.254.xx': 'Host considered as DOWN',
'54.86.216.xx':
{ message: 'Connection timeout',
info: 'Cassandra Driver Error' } },

It seems that control-connection.js gets the list of Cassandra nodes, but it doesn't try to read the preferred_ip field. The query that control-connection uses is: var selectPeers = "SELECT peer, data_center, rack, tokens, rpc_address FROM system.peers";

For reference, my system.peers table shows the preferred_ip field, so the driver could read and use it.

cqlsh> SELECT peer,data_center,rack,rpc_address,preferred_ip FROM system.peers;

 peer         | data_center | rack | rpc_address   | preferred_ip
--------------+-------------+------+---------------+--------------
 54.86.79.xx  | us-east     | 1e   | 54.86.79.9    | 172.xx.xx.1
 54.84.113.xx | null        | null | null          | 172.xx.xx.2
 54.209.26.xx | us-east     | 1d   | 54.209.26.196 | 172.xx.xx.3
 54.85.254.xx | us-east     | 1b   | 54.85.254.239 | 172.xx.xx.4
 54.86.216.xx | us-east     | 1d   | 54.86.216.194 | 172.xx.xx.5
 54.86.187.xx | null        | null | null          | 172.xx.xx.6
 54.86.191.xx | null        | null | null          | 172.xx.xx.7
 54.86.156.xx | null        | null | null          | 172.xx.xx.8
 54.88.4.xx   | us-east     | 1e   | 54.88.4.194   | 172.16.5.9
 54.86.171.xx | null        | null | null          | 172.xx.xx.10

Environment

Cassandra 2.1.0, Ubuntu 14.04, node.js v0.10.32, Ec2MultiRegionSnitch

Pull Requests

None

Affects versions

Fix versions

None

Activity

Paulo Motta 
October 23, 2014 at 12:31 PM

I'm also facing the same issue on EC2, and I'm not satisfied with the "won't fix".

"If you want private ip within the dc and public outside the dc, it is not possible to achieve it without an address translator." - I don't think the driver user should be responsible for manually writing an address translator when that is the DEFAULT behavior with the EC2MultiRegionSnitch: "For intra-region traffic, Cassandra switches to the private IP after establishing a connection." (http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architectureSnitchEC2MultiRegion_c.html). Even when not using the Ec2MultiRegionSnitch, it's possible to configure the prefer_local flag to use private IP addresses (https://issues.apache.org/jira/browse/CASSANDRA-5630).

So, if everywhere else it's possible to prefer private IPs for intra-DC traffic, why is this not the default behavior in the client driver?

Jorge Bay Gondra 
October 22, 2014 at 8:30 AM

If you want private ip within the dc and public outside the dc, it is not possible to achieve it without an address translator.

I've created a separate ticket for the address translator, which is a common solution in the DataStax drivers.
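
To make the address translator idea concrete, here is a minimal sketch of what such a component could look like on the client side. It is not the driver's API: the translate(address, port, callback) shape, the StaticAddressTranslator name, and the IP addresses in the mapping are all assumptions made up for illustration.

```javascript
'use strict';

// Hypothetical static mapping from public (broadcast) IPs to private IPs.
// These addresses are made-up examples, not taken from a real cluster.
var publicToPrivate = {
  '203.0.113.10': '10.0.0.1',
  '203.0.113.11': '10.0.0.2'
};

// Illustrative translator; the name and method shape are assumptions.
function StaticAddressTranslator(mapping) {
  this.mapping = mapping;
}

// Calls back with "ip:port", using the private IP when a mapping exists
// and the original address otherwise.
StaticAddressTranslator.prototype.translate = function (address, port, callback) {
  var known = Object.prototype.hasOwnProperty.call(this.mapping, address);
  var translated = known ? this.mapping[address] : address;
  callback(translated + ':' + port);
};

var translator = new StaticAddressTranslator(publicToPrivate);
translator.translate('203.0.113.10', 9042, function (endpoint) {
  console.log(endpoint); // prints "10.0.0.1:9042"
});
```

A static map like this has to be maintained by hand; a real translator would more likely look the private address up dynamically.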

Juho Mäkinen 
October 20, 2014 at 5:24 PM
(edited)

So I currently have this (or have tried):
listen_address: 0.0.0.0, empty or the machine local/private ipv4 address
rpc_address: 0.0.0.0
broadcast_rpc_address: local/private ipv4 address (say 10.0.0.1)
broadcast_address: public ipv4 address

None of this helps with this issue.

The problem is that neither the listen_address nor the private IP will be used as the rpc_address. The "peer" field in system.peers is always set to broadcast_address, which is the public IP (just as you pointed out), so the only way the driver can connect to the private IP is if the rpc_address field in system.peers is set to the private IP. I believe that because of how EC2MultiRegionSnitch works, rpc_address is always set to the public IP.

I went through the source code and I think I can show that this is caused by how the EC2MultiRegionSnitch works:

1) When EC2MultiRegionSnitch starts, it gets the machine's public IP via the Amazon meta-data API (see PUBLIC_IP_QUERY_URL). This public IP is saved with DatabaseDescriptor.setBroadcastRpcAddress(localPublicAddress); (see file https://github.com/apache/cassandra/blob/cassandra-2.1.0/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java#L55)

2) setBroadcastRpcAddress stores the address in the local broadcastRpcAddress variable. This variable previously held the broadcast_rpc_address value, so the call overwrites whatever was set for broadcast_rpc_address in cassandra.yaml.

3) The DatabaseDescriptor variable broadcastRpcAddress is later read by StorageService.prepareToJoin(), which puts it into an appStates HashMap that is used in the Gossip protocol.

4) The new state reaches StorageService.onChange(), which updates the "rpc_address" field via SystemKeyspace.updatePeerInfo(). This is then the value that the nodejs driver reads and uses as its address when connecting to the Cassandra nodes. The driver uses the peer address to connect only if rpc_address is 0.0.0.0, but this is irrelevant because the rpc_address won't ever be 0.0.0.0 (at least not in this case).

So another question is how and where preferred_ip comes from. It's set by the OutboundTcpConnectionPool.reset() method, which uses SystemKeyspace.updatePreferredIP(), which then stores the preferred IP in system.peers. The reset() method is called by the ReconnectableSnitchHelper class, whose documentation says the following: "Sidekick helper for snitches that want to reconnect from one IP addr for a node to another. Typically, this is for situations like EC2 where a node will have a public address and a private address, where we connect on the public, discover the private, and reconnect on the private."

So it seems that preferred_ip is made for exactly this kind of situation, and specifically so that EC2MultiRegionSnitch could work with clients. To my ear, it's a pretty safe bet to use it for client connections when it's available.

What do you think?

Jorge Bay Gondra 
October 20, 2014 at 2:00 PM

I don't understand the problem. With the EC2MultiRegionSnitch you must set the listen_address to the private IP address of the node, and the broadcast_address is set to the public IP address of the node.

You can use a load balancing policy (DC-aware load balancing would be good enough) to set the distance of the other region's nodes to remote; this way the driver won't try to connect to those nodes.
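
As a rough illustration of the distance idea only, the sketch below classifies hosts by data center and keeps just the local ones. It does not use the driver's policy classes: the distance labels, the hostDistance and hostsToConnect helpers, the 'eu-west' data center name, and the host addresses are all assumptions made up for this example.

```javascript
'use strict';

// Illustrative distance labels; the names are assumptions for this sketch.
var distance = { local: 'local', remote: 'remote' };

// A host is local when it is in the client's own data center.
function hostDistance(host, localDc) {
  return host.datacenter === localDc ? distance.local : distance.remote;
}

// Keep only the hosts the client would open connections to.
function hostsToConnect(hosts, localDc) {
  return hosts.filter(function (host) {
    return hostDistance(host, localDc) === distance.local;
  });
}

// Made-up host list: one node per data center.
var hosts = [
  { address: '10.0.0.1', datacenter: 'us-east' },
  { address: '10.1.0.1', datacenter: 'eu-west' }
];

console.log(hostsToConnect(hosts, 'us-east').map(function (h) {
  return h.address;
}));
```

Note that this only filters by data center; it does not change which address is used for the hosts that remain.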

Juho Mäkinen 
October 20, 2014 at 1:05 PM

I already have broadcast_rpc_address set to the private IP in cassandra.yaml. It seems that the Ec2MultiRegionSnitch reconfigures the rpc address to use the public IP so that the Cassandra nodes will use the public IPs to communicate across the regions (see https://github.com/apache/cassandra/blob/cassandra-2.1.0/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java#L55). Either that, or something else is broken.

Now I don't know how or where preferred_ip is populated in the Cassandra code, but it would seem trivial to also query it from system.peers and use it instead of the rpc_address when it's not null.
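
A minimal sketch of that proposal, assuming the peer rows come back as plain objects keyed by column name. The selectPeersWithPreferredIp variable and the resolveContactAddress helper are hypothetical names, not driver code. The fallback to peer when rpc_address is 0.0.0.0 follows the driver behavior described elsewhere in this ticket.

```javascript
'use strict';

// Hypothetical extended query: the columns the driver already reads,
// plus preferred_ip.
var selectPeersWithPreferredIp =
  'SELECT peer, data_center, rack, tokens, rpc_address, preferred_ip FROM system.peers';

// Illustrative helper (not part of the driver): pick the address to connect to.
// Order: preferred_ip if set, else rpc_address unless it is 0.0.0.0, else peer.
function resolveContactAddress(row) {
  if (row.preferred_ip) {
    return row.preferred_ip;
  }
  if (row.rpc_address && row.rpc_address !== '0.0.0.0') {
    return row.rpc_address;
  }
  return row.peer;
}

// Row values taken from the system.peers output in the description.
var row = { peer: '54.88.4.xx', rpc_address: '54.88.4.194', preferred_ip: '172.16.5.9' };
console.log(resolveContactAddress(row)); // prints "172.16.5.9"
```

Whether preferred_ip is always reachable from the client is a separate question; this only shows the selection order.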

Won't Fix

Details

Assignee

Reporter

Original estimate

Time tracking

No time logged, 4h remaining

Priority

Created October 20, 2014 at 12:46 PM
Updated October 23, 2014 at 12:37 PM
Resolved October 22, 2014 at 8:30 AM