Details
Assignee
UnassignedUnassignedReporter
Jaroslaw GrabowskiJaroslaw GrabowskiAffects versions
Priority
Major
Details
Details
Assignee
Unassigned
UnassignedReporter
Jaroslaw Grabowski
Jaroslaw GrabowskiAffects versions
Priority
Created April 22, 2022 at 2:30 PM
Updated April 22, 2022 at 2:32 PM
Currently the number of connections to “LOCAL” nodes defaults to the
number of cores - 1
. This is suboptimal for Astra where Spark executors are not colocated with DB nodes. In that case the default LoadBalancingPolicy is used and as a result all the “local DC” nodes are treated as LOCAL, for beefier machines this results in hundredths of connections opened from the Spark cluster to the Astra cluster. Every Spark executor opensnumber of cores - 1
connections to every Astra node.Furthermore, since all Astra connections go through a load balancer , all the connections pile up at a single host.
SCC should use a custom LoadBalancingPolicy that limits the number of connections to Astra. Only a single pool of
number of cores - 1
should be opened. Partitioning and data locality should be revisited.The default number of connection may be overridden with
spark.cassandra.connection.localConnectionsPerExecutor