This is similar to CSHARP-296.
The driver hangs when it encounters a dead node entry in system.peers.
peer | rpc_address | schema_version | tokens | workload
------------------------------------------------------------------------------------------------------------------------------
54.217.2.68 | 10.241.5.19 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'372748112'} | Cassandra
54.170.8.129 | 10.37.13.231 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'113427455640312821154458202477628818595'} | Cassandra
46.51.18.106 | 10.106.15.242 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'56713727820156410577229101239000783353'} | Cassandra
54.146.21.231 | 10.61.24.2 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'113427455640312821154458202479064646083'} | Cassandra
54.92.16.11 | 10.13.3.19 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'141784319550391026443072753098378663704'} | Cassandra
54.184.18.2 | null | null | null | null
54.145.4.93 | 10.84.3.163 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'56713727820156410577229101240436610841'} | Cassandra
54.188.79.99 | 10.232.61.244 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'28356863910078205288614550621281390513'} | Cassandra
54.218.35.251 | 10.251.1.143 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'1967372893'} | Cassandra
54.155.58.36 | 10.72.10.89 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'85070591730234615865843651858314800974'} | Cassandra
54.82.9.3 | 10.157.49.3 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'85070591730234615865843651859750628462'} | Cassandra
54.196.20.39 | 10.47.18.17 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'28356863910078205288614550621122593220'} | Cassandra
54.220.5.128 | 10.106.13.185 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'141784319550391026443072753096942836216'} | Cassandra
54.214.134.121 | 10.237.30.159 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'113427455640312821154458202479223443376'} | Cassandra
54.203.95.16 | 10.217.218.42 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'56713727820156410577229101240595408134'} | Cassandra
54.160.18.190 | null | null | null | Cassandra
54.203.6.65 | 10.221.11.222 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'141784319550391026443072753098537460997'} | Cassandra
54.212.16.80 | 10.226.114.72 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'85070591730234615865843651859909425755'} | Cassandra
54.217.111.109 | 10.37.141.61 | 6bf198c0-228c-35b4-9e5a-c49d10a44963 | {'28356863910078205288614550619686765732'} | Cassandra
I also attached a patch that works for us (Netflix). However, note that we use a single token per node, so there is no performance cost in returning tokens.
I think the change coming in driver 2.1.10 and 3.0.1 should cause the driver to ignore these peers if expected columns are null or missing.
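
To illustrate the idea of skipping peers with null or missing columns, here is a minimal sketch in plain Java. It is not the driver's actual code: the class `PeerRowFilter`, its methods, and the choice of `rpc_address`, `schema_version` and `tokens` as the required columns are all assumptions made for this example, and each row is modeled as a simple map rather than a driver `Row`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the driver: filters system.peers-like rows.
final class PeerRowFilter {

    // Assumption for this sketch: a peer is usable only if these columns are non-null.
    static final String[] REQUIRED_COLUMNS = {"rpc_address", "schema_version", "tokens"};

    // Returns true when every required column is present and non-null.
    public static boolean isValidPeer(Map<String, Object> row) {
        for (String column : REQUIRED_COLUMNS) {
            if (row.get(column) == null) {
                return false;
            }
        }
        return true;
    }

    // Keeps only the rows that pass isValidPeer, preserving their order.
    public static List<Map<String, Object>> usablePeers(List<Map<String, Object>> rows) {
        List<Map<String, Object>> usable = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (isValidPeer(row)) {
                usable.add(row);
            }
        }
        return usable;
    }

    // Builds a map-based row with the columns shown in the listing above.
    public static Map<String, Object> row(String peer, String rpcAddress,
                                          String schemaVersion, String tokens,
                                          String workload) {
        Map<String, Object> r = new HashMap<>();
        r.put("peer", peer);
        r.put("rpc_address", rpcAddress);
        r.put("schema_version", schemaVersion);
        r.put("tokens", tokens);
        r.put("workload", workload);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("54.217.2.68", "10.241.5.19",
                "6bf198c0-228c-35b4-9e5a-c49d10a44963", "372748112", "Cassandra"));
        rows.add(row("54.184.18.2", null, null, null, null));
        rows.add(row("54.160.18.190", null, null, null, "Cassandra"));

        // Print the address of each peer that would be kept.
        for (Map<String, Object> peer : usablePeers(rows)) {
            System.out.println(peer.get("peer"));
        }
    }
}
```

With the three sample rows taken from the listing, only the first one passes the check; the two entries with null columns are dropped rather than being treated as connectable hosts.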
That said, I am curious what kind of behavior you are seeing. Does the driver completely stall, or does it time out the node? Any thread dumps or stack traces would help us understand the problem better. 2.2.0-rc3 is an old release candidate, so it's possible that whatever issue you are encountering has already been fixed in 3.0.0.
The dead nodes you are referring to are these two, right?:
Hi Andy,
I think the patch in would cover this case. Sorry that I missed this and yeah, we are on an older 2.2 version.
Fixed by