repartitionByCassandraReplica relocates data to the local node only

Description

`repartitionByCassandraReplica` relocates data only to the node on the local host, whereas it should distribute data across all the nodes of the local datacenter (DC).

The behavior depends on whether a Cassandra node runs on the machine where `repartitionByCassandraReplica` is called:

  • If a local node exists, all the data is relocated to this single node.

  • If no local node exists (e.g. the nodes are on remote machines, or a containerized node has an isolated IP on the local machine), it always returns an empty RDD without throwing any exception.
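
A minimal reproduction sketch of the two cases above, not taken from the original report. The keyspace `ks`, the table `tbl`, its partition key column `id`, the `Key` case class, and the contact point `127.0.0.1` are all hypothetical placeholders; the sketch also assumes the Spark Cassandra Connector is on the classpath and that the job is launched with `spark-submit` against a reachable cluster.

```scala
import com.datastax.spark.connector._
import org.apache.spark.sql.SparkSession

// Hypothetical row type: its field name must match the partition key column of the table.
case class Key(id: Int)

object RepartitionRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("repartitionByCassandraReplica-repro")
      // Assumed contact point; replace with a node of your own cluster.
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .getOrCreate()

    // Some partition keys to relocate.
    val keys = spark.sparkContext.parallelize(1 to 1000).map(Key(_))

    // Hypothetical keyspace "ks" and table "tbl" with partition key column "id".
    val repartitioned =
      keys.repartitionByCassandraReplica("ks", "tbl", partitionsPerHost = 10)

    // Per the report, this count drops to 0 when no node runs on the local machine.
    println(s"rows after repartition: ${repartitioned.count()}")

    // Hosts the resulting partitions prefer. Per the report, only the local
    // node is targeted when one exists, instead of all nodes of the local DC.
    val hosts = repartitioned.partitions
      .flatMap(p => repartitioned.preferredLocations(p))
      .distinct
    println(s"preferred hosts: ${hosts.mkString(", ")}")

    spark.stop()
  }
}
```

Comparing the printed row count with the input size (1000 here) and the printed hosts with the nodes of the local DC should show whether a given connector version is affected.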

Environment

None

Pull Requests

None

Activity

Jaroslaw Grabowski
April 7, 2021, 12:51 PM

Thank you for finding and fixing this!

Fixed

Assignee

Yohann Rubinsztejn

Reporter

Yohann Rubinsztejn

Fix versions

Labels

None

Reviewer

None

Reviewer 2

None

Pull Request

None

Components

Affects versions

Priority

Major