Add some flexibility in Node Discovery

Description

Relying only on the peers table for node discovery seems to be problematic, as some Cassandra users are, for instance, using 0.0.0.0 as rpc_address and a private IP as listen_address.

We need to introduce more flexibility for this node discovery, either on Cassandra side or on the driver side.
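For context, a minimal sketch (assuming a 2.0-era Java driver and hypothetical addresses) of what discovery has to work with: the driver's control connection reads the peers table, and in the configuration described above the rpc_address column only exposes the unusable wildcard, leaving the listen/broadcast address as the fallback (see the comments below).

    import java.net.InetAddress;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class PeersInspection {
        public static void main(String[] args) {
            // Hypothetical contact point; replace with one of your nodes.
            Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
            Session session = cluster.connect();

            // The same table the driver's control connection reads for discovery.
            ResultSet rs = session.execute("SELECT peer, rpc_address FROM system.peers");
            for (Row row : rs) {
                InetAddress peer = row.getInet("peer");       // listen/broadcast address of the peer
                InetAddress rpc = row.getInet("rpc_address"); // the wildcard 0.0.0.0 in the setup above
                System.out.printf("peer=%s rpc_address=%s%n", peer, rpc);
            }
            cluster.close();
        }
    }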

Environment

None

Pull Requests

None

Activity


Sylvain Lebresne March 18, 2014 at 4:48 PM

Revisiting this issue, and more generally the question of making node discovery more flexible, I've made 2 changes:

  1. it's already relatively easy to write a load balancing policy that only connects to a fixed set of nodes (and if said fixed set is the contact points, then that basically means no auto-discovery), yet we can probably provide such a policy out of the box, so I've already pushed a new WhiteListPolicy to do so (a usage sketch follows this list).

  2. There are situations a bit more complex where it's not so much that you don't want node discovery, but rather that the IP addresses the driver should use are not directly the ones configured server side. For instance, in EC2, nodes may want to use the private IP to connect to nodes in the same region but the public IP for remote nodes (in that case in particular, the choice of which IP to use is a client-side one; no amount of server-side configuration would help). Or maybe the servers are behind a router and you don't access the C* nodes directly. To handle those cases, I've added a new (entirely optional) AddressTranslater interface that users can implement, which allows any custom scheme for translating a server IP into an IP the driver can connect to (sketched after this list).
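For reference, a minimal usage sketch of the WhiteListPolicy (hypothetical addresses and the default native port; this assumes the constructor takes a child policy plus a collection of socket addresses, as in the 2.0-series driver):

    import java.net.InetSocketAddress;
    import java.util.Arrays;
    import java.util.List;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.RoundRobinPolicy;
    import com.datastax.driver.core.policies.WhiteListPolicy;

    public class WhiteListExample {
        public static void main(String[] args) {
            // Only these two nodes will ever receive connections, whatever
            // else the driver discovers through the cluster metadata.
            List<InetSocketAddress> allowed = Arrays.asList(
                    new InetSocketAddress("10.0.0.1", 9042),
                    new InetSocketAddress("10.0.0.2", 9042));

            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1")
                    .withLoadBalancingPolicy(new WhiteListPolicy(new RoundRobinPolicy(), allowed))
                    .build();
        }
    }

And a sketch of the AddressTranslater side. The mapping here is hypothetical and hard-coded; a real implementation might consult DNS, the EC2 metadata service, or a config file. The single translate(InetSocketAddress) method and the Cluster.Builder#withAddressTranslater hook are assumed and may differ slightly between driver versions:

    import java.net.InetSocketAddress;
    import java.util.HashMap;
    import java.util.Map;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.AddressTranslater;

    // Maps the address each node advertises (e.g. its private EC2 IP) to the
    // address this client must actually dial (e.g. the public IP).
    public class PublicIpTranslater implements AddressTranslater {

        private final Map<InetSocketAddress, InetSocketAddress> mapping =
                new HashMap<InetSocketAddress, InetSocketAddress>();

        public PublicIpTranslater() {
            // Hypothetical static mapping, private IP -> public IP.
            mapping.put(new InetSocketAddress("10.0.0.1", 9042),
                        new InetSocketAddress("203.0.113.1", 9042));
            mapping.put(new InetSocketAddress("10.0.0.2", 9042),
                        new InetSocketAddress("203.0.113.2", 9042));
        }

        @Override
        public InetSocketAddress translate(InetSocketAddress address) {
            InetSocketAddress publicAddress = mapping.get(address);
            // Fall back to the advertised address if we have no mapping for it.
            return publicAddress != null ? publicAddress : address;
        }

        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("203.0.113.1") // hypothetical public contact point
                    .withAddressTranslater(new PublicIpTranslater())
                    .build();
            cluster.close();
        }
    }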

I think those pretty much solve the concerns expressed on this ticket, so closing. If you still think there are situations that the driver doesn't handle with those but should, feel free to open a separate ticket with details.

Sylvain Lebresne September 2, 2013 at 1:57 PM

I think this issue is getting a bit off track. Let me try to see if I can clarify a bunch of things:

  1. the driver absolutely does allow you to limit the C* nodes to which it connects. This is basically controlled by the LoadBalancingPolicy.distance method and it's quite flexible. It would be trivial to write an implementation that ignores nodes (as in, does not create connections to them) that are not in some fixed set of addresses (see the sketch after this list). And you can do even more interesting things (imo), like making sure you only connect to nodes in a "local" datacenter (or AWS availability zone or whatnot), which DCAwareRoundRobinPolicy does out of the box btw. Now, it's true that there is no simple shortcut method that limits the nodes used to the fixed list of contact points, but that's because, imho, that's almost never optimal and not something worth promoting as "good practice". It's definitely possible and not very hard, however.

  2. the driver fetches/receives a number of pieces of information about C* nodes from C*. That information includes the IP of new nodes joining the cluster, but also the DC and tokens of each node and more. This is used for a variety of things. It's more general than "node discovery" in practice: it's the ability of the driver to use node information from C*, so let's call it "node-infos discovery". It is that "node-infos discovery" mechanism as a whole that can be confused by the use of 0.0.0.0 as rpc_address. But as said above, while this is how the driver learns about new nodes, connecting to a known node is a separate concern.

  3. the driver is not always confused if 0.0.0.0 is used. If the C* nodes have only one physical network interface (and 0.0.0.0 is just used out of convenience), then the driver will use the value of listen_address (which can't be 0.0.0.0), and in that case this will be the right thing. So the problem only arises when you use 0.0.0.0, have at least two network interfaces, and do not intend client traffic to go to listen_address. Without saying this is not worth fixing if we can, I do question whether that configuration is wise in the first place: if you don't want client traffic to go to listen_address, why use 0.0.0.0? It's like asking clients to violate the rule you're trying to enforce.

  4. In those cases where 0.0.0.0 does confuse the driver, the presence of 1 (the ability to connect only to a fixed set of contact points) does not "solve" 2 (the "node-infos discovery"). Even if you only ever connect to the contact points, the driver still won't be able to properly fetch token or datacenter information, for instance, which will cripple some features. And so, when the driver suspects we might be in that case, it issues a warning. What I am against is adding some option to disable that "node-infos discovery" mechanism entirely: I just don't see what disabling it would buy anyone, and I don't want to pretend that it's meant to be optional.

  5. Now, should the driver be somewhat resilient to some of its parts not working correctly? Sure! And in theory, the driver should still work "relatively" correctly (as in, you'd be able to do queries) with the contact points even if it can't auto-discover more. If someone observes otherwise, please do open a ticket; I'm always happy to make the driver as resilient to failure cases as possible. But I still consider it a failure case.
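To make point 1 above concrete, here is a hedged sketch of such a fixed-set policy (the class is hypothetical and assumes RoundRobinPolicy can be subclassed; unlike the out-of-the-box WhiteListPolicy, it only filters distance, not the query plans themselves):

    import java.net.InetAddress;
    import java.util.Collection;
    import java.util.HashSet;
    import java.util.Set;

    import com.datastax.driver.core.Host;
    import com.datastax.driver.core.HostDistance;
    import com.datastax.driver.core.policies.RoundRobinPolicy;

    /**
     * Hypothetical policy: round-robins like its parent, but reports any host
     * outside a fixed set of addresses as IGNORED, so the driver never opens
     * connection pools to those hosts even though it still knows about them.
     */
    public class FixedHostsPolicy extends RoundRobinPolicy {

        private final Set<InetAddress> allowed;

        public FixedHostsPolicy(Collection<InetAddress> allowed) {
            this.allowed = new HashSet<InetAddress>(allowed);
        }

        @Override
        public HostDistance distance(Host host) {
            // IGNORED hosts get no connections; everything else is delegated.
            return allowed.contains(host.getAddress())
                    ? super.distance(host)
                    : HostDistance.IGNORED;
        }
    }

DCAwareRoundRobinPolicy applies the same idea per datacenter, treating hosts outside the local DC as REMOTE (or ignoring them entirely).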

Alex Popescu August 30, 2013 at 4:56 PM
Edited

I feel like we are discussing multiple topics under the umbrella of this ticket.

1. Allowing C* nodes to accept client traffic on multiple interfaces (basically rpc_address 0.0.0.0). To deal with this scenario, Sylvain has already created https://issues.apache.org/jira/browse/CASSANDRA-5899.

From a driver perspective, the only implication I can see is that in auto-discovery mode it will have to try multiple addresses before completing the "setup". Dealing with this (and the case where a client should never send traffic to specific nodes) could be done using a custom LoadBalancingPolicy implementation.

2. A non-auto-discovery mode for the drivers. This is the part where I'm not sure I understand when it would be used.

Russell Bradberry August 30, 2013 at 4:34 PM

One use case is Amazon. Currently AWS doesn't charge for data transfer across private IP addresses, so it makes sense for all C* servers to talk across their private IP addresses to ensure the user isn't charged for data transfer between nodes. Such deployments may also call for accessing Cassandra from a different availability zone (or a different cloud provider), which would require using the public IP. Since the private IP is what gets broadcast to auto-discovery in this case, there will be issues connecting.

The other scenario is in development environments. Sometimes, as a developer, I need to connect to a deployed cluster to test assumptions; this again causes the same issue as above.

While technically you can provide all the external hostnames as part of the connection object and it will use those, when I tried it (with the Python variant) I had to wait for every attempted connection to the private IP addresses to time out before it would continue along the code path. I'm not sure whether the Java version of the driver exhibits this behavior or not.

One would expect that if auto-discovery is turned off, so are all the features that come along with it, like notifications, etc. This would mean the driver would act just as all current Thrift-based CQL drivers do today.

Alex Popescu August 30, 2013 at 4:16 PM

While I can see the argument for an "expert mode", what I'd like to understand is: 1) what scenarios would this setup serve; and 2) aren't there better ways to address those?

(Sylvain has already explained that the non-auto-discovery mode would lead to issues with cluster notifications and that by itself seems problematic already.)

Fixed

Details

Created July 31, 2013 at 7:52 PM
Updated May 6, 2014 at 1:00 AM
Resolved March 18, 2014 at 4:48 PM