Driver is unable to correctly re-establish a connection with a previously decommissioned node

Description

Hello!

Recently we ran into some very strange driver behaviour.

After the decommissioned node returns to the cluster, the driver refreshes the node status as expected, but then throws an exception:

2019-07-16 16:22:24,432 2222118 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.ControlConnection] Refreshing node list
2019-07-16 16:22:24,432 2222118 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.ControlConnection] Received Node status change event: host 10.217.11.94:9042 is UP
2019-07-16 16:22:24,432 2222118 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.ControlConnection] Received status change event for host 10.217.11.94:9042 but it was not found
2019-07-16 16:22:24,432 2222118 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.ControlConnection] Node list retrieved successfully
2019-07-16 16:22:24,432 2222118 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.ControlConnection] Retrieving keyspaces metadata
2019-07-16 16:22:24,448 2222133 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.ControlConnection] Updating keyspaces metadata
2019-07-16 16:22:24,448 2222133 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.MetadataHelpers.ReplicationStrategyFactory] Replication Strategy class name not recognized: LocalStrategy
2019-07-16 16:22:24,448 2222133 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.MetadataHelpers.ReplicationStrategyFactory] Replication Strategy class name not recognized: LocalStrategy
2019-07-16 16:22:24,448 2222133 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.ControlConnection] Rebuilding token map
2019-07-16 16:22:24,448 2222133 ERROR [T-1cd5933a(-)] [Cassandra] [Cassandra.Session] Exception while trying borrow a connection from a pool
System.Net.Sockets.SocketException (0x80004005): A request to send or receive data was disallowed because the socket is not connected and (when sending on a datagram socket using a sendto call) no address was supplied
   at Cassandra.HostConnectionPool.<EnsureCreate>d__58.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Cassandra.HostConnectionPool.<BorrowConnection>d__36.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Cassandra.Requests.RequestHandler.<GetConnectionFromHost>d__37.MoveNext()
2019-07-16 16:22:24,463 2222149 INFO [T-1cd5933a(-)] [Cassandra] [Cassandra.ControlConnection] Finished building TokenMap for 47 keyspaces and 3 hosts. It took 15 milliseconds.

The exception is repeated every time the driver tries to execute a request on this node, flooding the logs with errors.

Restarting the application resolves the error.

Despite the errors, the driver is still able to execute queries.
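For anyone trying to confirm the same pattern in their own logs, here is a minimal sketch of a log scanner. It is not part of the driver. It only matches the two message texts quoted in the log excerpt above: the "but it was not found" status event and the connection borrow error. The function name `summarize` and the log file path passed on the command line are illustrative assumptions.

```python
import re
import sys
from collections import Counter

# Message texts copied from the log excerpt in this report.
NOT_FOUND = re.compile(
    r"Received status change event for host (\S+) but it was not found"
)
BORROW_ERROR = "Exception while trying borrow a connection from a pool"


def summarize(lines):
    """Count hosts that were reported but unknown to the driver, and borrow errors."""
    unknown_hosts = Counter()
    borrow_errors = 0
    for line in lines:
        match = NOT_FOUND.search(line)
        if match:
            unknown_hosts[match.group(1)] += 1
        if BORROW_ERROR in line:
            borrow_errors += 1
    return unknown_hosts, borrow_errors


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python scan_log.py <path to driver log>
    with open(sys.argv[1], encoding="utf-8") as log_file:
        hosts, errors = summarize(log_file)
    for host, count in hosts.items():
        print(f"status event for unknown host {host}: {count}")
    print(f"connection borrow errors: {errors}")
```

If a host shows up in the first count and the borrow errors keep growing after the node has rejoined, the log likely matches the behaviour described here.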

Steps to reproduce:

  1. Set up a cluster of 3 nodes: 1 DC, 3 racks (1 node in each rack)

  2. (Not sure if related, but in my case all keyspaces use replication factor 3)

  3. Make sure that the driver has established at least one connection with every node by writing and reading data. (Also not sure if related, but operations are executed with LocalQuorum)

  4. Decommission a node while writing/reading data

  5. Make sure the driver removed the decommissioned node. The log should contain:

    [Cassandra.Connections.HostConnectionPool] Host decommissioned. Closing pool #32185163 to <host_ip>:9042

  6. Return the decommissioned node to the Cassandra ring (remove all of its data before it joins)

  7. Wait for the node to finish joining

  8. The driver will start to throw exceptions

 

UPD: grammar

Environment

Cassandra Driver is used under .NET Framework 4.6.1 on Windows Server.

Assignee

Unassigned

Reporter

Лев Димов

Affects versions

3.8.0
3.10.1

Priority

Major