If a host's last remaining connection fails during initialization, it is possible that the host will not be marked down.
This is very unlikely to happen in practice, and its impact would only be felt in very specific cases. If it were to happen, the host would still be considered up, so it would still be included in query plans. When the host is chosen from a query plan and borrowConnection() is called, the driver would detect that the pool is below its core connection count and create new connections. However, a user hoping to be notified as soon as a host goes down would not know until a newly spawned connection fails, and potentially never, if that host is never chosen in a query plan.
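To picture that recovery path, here is a minimal, hypothetical sketch of the pool refill idea; the names HostConnectionPool, Connection, coreConnections and spawnConnection are illustrative, not the driver's actual internals:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the driver's real code: a pool that has fallen
// below its core size is only refilled lazily, when the host is borrowed.
class HostConnectionPool {
    static class Connection {}             // placeholder for a live connection

    private final List<Connection> openConnections = new ArrayList<>();
    private final int coreConnections = 2; // assumed core pool size

    // Called when a query plan picks this host. The host was never marked
    // down, so the missing connections are only discovered here.
    Connection borrowConnection() {
        while (openConnections.size() < coreConnections) {
            openConnections.add(spawnConnection()); // lazily replace lost ones
        }
        return openConnections.get(0);
    }

    private Connection spawnConnection() {
        return new Connection(); // in the real driver this opens a channel
    }
}
```

If a host is never borrowed, this refill never runs, which is why the down notification could be delayed indefinitely.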
Here is an example of how it can manifest:
1. hostA, the host the control connection is currently established to, is marked down.
2. The control connection immediately tries reconnecting and chooses hostB.
3. The control connection is opened to hostB and begins initializing (sending STARTUP and a request to validate the clusterName).
4. hostB goes down and its pooled connections are closed. This leaves one remaining initializing connection: the control connection to hostB.
5. The control connection to hostB fails to initialize because its connection is reset while writing the clusterName validation request. ChannelCloseListener#operationComplete is called when we force the closeFuture. Since isInitialized is false, the connection doesn't get defuncted, but it does get closed; because of this, signalConnectionClosed is called in closeAsync instead of signalConnectionFailure (see the sketch after this list).
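The close path in step 5 can be pictured with the following simplified reconstruction. The method names match the ones mentioned above, but the bodies are an assumption about the shape of the logic, not the driver's actual code:

```java
// Simplified reconstruction of the close path from step 5 (assumed
// structure). The race: writeHandler may defunct the connection when the
// write fails, but there is no ordering guarantee relative to the
// closeFuture listener below.
class Connection {
    volatile boolean isInitialized = false; // still false for the control
                                            // connection in step 5

    // ChannelCloseListener#operationComplete: runs when closeFuture fires.
    void onChannelClosed() {
        if (isInitialized) {
            defunct();    // failure path: can mark the host down
        } else {
            closeAsync(); // clean-close path: host state untouched
        }
    }

    // writeHandler: runs when the clusterName validation write fails.
    // If onChannelClosed() has already taken the clean-close path, this
    // failure signal may arrive too late or not at all.
    void onWriteFailed() {
        defunct();
    }

    void defunct() {
        signalConnectionFailure(); // last connection failing -> host down
    }

    void closeAsync() {
        signalConnectionClosed();  // no failure recorded for the host
    }

    void signalConnectionFailure() { /* notify the host's state machine */ }
    void signalConnectionClosed()  { /* just release resources */ }
}
```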
This may not always happen, since the connection is still defuncted in writeHandler, but whether that occurs before or after the closeFuture is notified is not guaranteed. I can reproduce it, rarely, with the following steps (a harness sketch follows the list):
1. Create a Cluster with a single contact point (hostA) against a 2-node Cassandra cluster.
2. Stop hostA and wait for its down event.
3. Stop hostB.
4. Wait for the down event on hostB. If the issue occurs, this event never arrives.
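Below is a minimal harness for these steps, assuming the driver 3.x public API (Cluster, Host.StateListener); the contact point name and the timeout are placeholders, and stopping the nodes themselves (steps 2 and 3) is done externally, e.g. with ccm:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class DownEventRepro {
    public static void main(String[] args) throws InterruptedException {
        // Expect two down events: one for hostA, one for hostB.
        CountDownLatch downEvents = new CountDownLatch(2);

        Cluster cluster = Cluster.builder()
                .addContactPoint("hostA") // single contact point (step 1)
                .build();

        cluster.register(new Host.StateListener() {
            @Override public void onDown(Host host) {
                System.out.println("DOWN: " + host);
                downEvents.countDown();
            }
            @Override public void onAdd(Host host) {}
            @Override public void onUp(Host host) {}
            @Override public void onRemove(Host host) {}
            @Override public void onRegister(Cluster c) {}
            @Override public void onUnregister(Cluster c) {}
        });
        cluster.init();

        // Steps 2-4 happen here: stop hostA externally, wait for its down
        // event, then stop hostB. If the issue occurs, the latch never
        // reaches zero because hostB's down event is never delivered.
        boolean bothDown = downEvents.await(5, TimeUnit.MINUTES);
        System.out.println(bothDown
                ? "both hosts marked down"
                : "hostB was never marked down (issue reproduced)");
        cluster.close();
    }
}
```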
We've decided to revert this ticket. The change introduced another race condition that could lead to a host being marked down if protocol negotiation failed at startup. We're reverting to the previous behavior, which is less severe (as explained above, the only impact is that a host could be marked down later than expected). The connection shutdown sequence will be reexamined to provide a better fix.