Hosts in 'UP' state missing from Query Plan for default LoadBalancingPolicy

Description

While running a test scenario that constantly causes connections to be suspected by setting very low connection and read timeouts, I found that the query plan for the Cluster lost all but 1 host over a period of hours, even though all hosts were up.

For example, in the case below only 1 host was in the query plan. This continues indefinitely.

While this looks similar to , it's slightly different in that the hosts are not in a 'SUSPECTED' state. Host#isUp() will return true even if the host is suspected, so I did a heap dump and was able to see that host.state was 'UP' for all three hosts.
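
The check described above (hosts whose state is 'UP' but which never show up in the query plan) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class QueryPlanCheck, the method upButMissing, and the use of plain strings for host names and states are hypothetical stand-ins, not driver types or driver API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: none of these names come from the driver.
final class QueryPlanCheck {

    /** Returns the hosts whose state is "UP" but that are absent from the query plan. */
    static List<String> upButMissing(Map<String, String> hostStates, List<String> queryPlan) {
        Set<String> planned = new LinkedHashSet<>(queryPlan);
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> entry : hostStates.entrySet()) {
            boolean isUp = "UP".equals(entry.getValue());
            if (isUp && !planned.contains(entry.getKey())) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed snapshot: three hosts, all 'UP', as seen in the heap dump.
        Map<String, String> states = new LinkedHashMap<>();
        states.put("host1", "UP");
        states.put("host2", "UP");
        states.put("host3", "UP");

        // Assumed query plan containing only one host, as in the report.
        List<String> plan = Arrays.asList("host1");

        System.out.println(upButMissing(states, plan)); // prints [host2, host3]
    }
}
```

A non-empty result is the symptom reported here: the hosts are considered up, yet the load balancing policy never offers them.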

This is much more difficult to reproduce than .

Restarting the Cassandra node associated with a host tends to get it back into a good state: the socket connections are lost, which triggers an onDown event, which in turn causes a reconnect that puts the host back into the LB policy.
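
The recovery path described above can be modeled with a toy policy. This is a simplified sketch, not the driver's implementation: LiveHostsModel and restartNode are hypothetical names, hosts are plain strings, and the model assumes the policy keeps a list of live hosts that onDown removes from and onUp adds to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Toy model of a policy's live-host bookkeeping; an assumption, not driver code.
final class LiveHostsModel {
    private final CopyOnWriteArrayList<String> liveHosts = new CopyOnWriteArrayList<>();

    /** Host became available: add it to the live set if it is not already there. */
    void onUp(String host) {
        liveHosts.addIfAbsent(host);
    }

    /** Host went down: remove it from the live set (no-op if it is already absent). */
    void onDown(String host) {
        liveHosts.remove(host);
    }

    /** The query plan is built only from hosts currently in the live set. */
    List<String> newQueryPlan() {
        return new ArrayList<>(liveHosts);
    }

    /** Models a node restart: lost connections fire onDown, a successful reconnect fires onUp. */
    void restartNode(String host) {
        onDown(host);
        onUp(host);
    }

    public static void main(String[] args) {
        LiveHostsModel policy = new LiveHostsModel();
        policy.onUp("host1");                      // host2 is 'UP' but was dropped from the live set
        System.out.println(policy.newQueryPlan()); // prints [host1]

        policy.restartNode("host2");               // the down/up cycle re-registers the host
        System.out.println(policy.newQueryPlan()); // prints [host1, host2]
    }
}
```

In this model the host only returns to the query plan when an onUp event is delivered, which is consistent with a restart acting as a workaround.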

Environment

None

Pull Requests

None

Activity

Andy Tolbert
December 20, 2014, 6:46 PM

After 17+ hours of running a scenario that typically manifests the issue within a few hours, I am not able to reproduce this with the fixes in the 2.0 branch.

Andy Tolbert
December 22, 2014, 8:17 PM

Completed validation against the 2.0 and 2.1 branches.

Fixed

Assignee

Andy Tolbert

Reporter

Andy Tolbert

Labels

None

PM Priority

None

Reproduced in

None

Affects versions

Fix versions

Pull Request

None

Doc Impact

None

Size

None

External issue ID

None

Components

Priority

Major