Spark SQL Cassandra integration: cross-cluster table join and write to another cluster
Description
Pull Requests
None
Activity

Brian Cantoni May 5, 2015 at 9:20 PM
Reopen to re-resolve as Fixed for consistency.

Alex Liu April 7, 2015 at 11:46 PM
Example
// conf is the base SparkConf used to create the SparkContext
val sc = new SparkContext(conf)
val cc = new CassandraSQLContext(sc)

// Connection settings for each cluster
val conf1 = new SparkConf(true)
  .set("spark.cassandra.connection.host", getHost(0).getHostAddress)
  .set("spark.cassandra.connection.native.port", getNativePort(0).toString)
  .set("spark.cassandra.connection.rpc.port", getRpcPort(0).toString)
val conf2 = new SparkConf(true)
  .set("spark.cassandra.connection.host", getHost(1).getHostAddress)
  .set("spark.cassandra.connection.native.port", getNativePort(1).toString)
  .set("spark.cassandra.connection.rpc.port", getRpcPort(1).toString)

// Register the per-cluster connection, read, and write configurations
cc.addClusterLevelCassandraConnConf("cluster1", conf1)
  .addClusterLevelCassandraConnConf("cluster2", conf2)
  .addClusterLevelReadConf("cluster1", conf)
  .addClusterLevelWriteConf("cluster1", conf)
  .addClusterLevelReadConf("cluster2", conf)
  .addClusterLevelWriteConf("cluster2", conf)

// Tables are addressed as <cluster>.<keyspace>.<table>
val result = cc.sql("SELECT * FROM cluster1.sql_test1.test1 AS test1 JOIN cluster2.sql_test2.test2 AS test2 WHERE test1.a = test2.a").collect()
val insert = cc.sql("INSERT INTO cluster2.sql_test2.test3 SELECT * FROM cluster1.sql_test1.test1 AS t1").collect()
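For comparison, the same cross-cluster read/write can be sketched with the core RDD API, using the connector's documented pattern of scoping an implicit CassandraConnector per code block. Host addresses, keyspace, and table names below are placeholders, and this assumes a SparkContext `sc` built against the spark-cassandra-connector:

```scala
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector

// Build one connector per cluster (placeholder hosts)
val connectorCluster1 = CassandraConnector(
  sc.getConf.set("spark.cassandra.connection.host", "10.0.0.1"))
val connectorCluster2 = CassandraConnector(
  sc.getConf.set("spark.cassandra.connection.host", "10.0.0.2"))

// Read from cluster1: the implicit connector in this block
// is picked up by cassandraTable
val rowsFromCluster1 = {
  implicit val c = connectorCluster1
  sc.cassandraTable("sql_test1", "test1")
}

// Write to cluster2: a different implicit connector scopes the save
{
  implicit val c = connectorCluster2
  rowsFromCluster1.saveToCassandra("sql_test2", "test3")
}
```

This achieves the copy without the cluster-level SQL configuration above, at the cost of dropping down from SQL to the RDD API.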

Brian Cantoni April 7, 2015 at 11:12 PM
@Alex Liu can you add some documentation/example here for the final implementation of this?

Jacek March 15, 2015 at 5:26 AM
@Alex Liu, I reviewed this PR; please take a look at my comments.

Alex Liu March 10, 2015 at 4:10 PM (edited)
This ticket is to make Spark SQL cross-cluster join and write to another cluster work without core API changes. SPARKC-82 was created to evaluate core API changes later.
Fixed
Created March 2, 2015 at 4:25 PM
Updated May 5, 2015 at 9:20 PM
Resolved May 5, 2015 at 9:20 PM
Investigate whether the current connector can support cross-cluster table joins for the Spark SQL Cassandra integration.