Integrating Spark SQL Data Sources API
Activity
@Catalin Zamfir I'm sorry you are disappointed by our release cycle, but I wanted to remind you this is an open-source project and everyone is encouraged to help. If you find some feature missing, you are free to contribute, either by writing code, documentation or doing reviews.
"Yeah, it may have bugs, code may be ugly..."
We keep pretty high standards and we are not going to release anything that we know is broken, could cause trouble for production users, or would require us to revert in the next version. While shipping code fast with lower quality may sound good as a short-term strategy, it would hurt the project in the long run. If you want to use cutting-edge, unstable code, you can still compile from source or even use things from topic branches.
BTW, the master branch worked with Spark 1.3.0 within a week or two of the Spark 1.3.0 release, so this was not 2 months as you say, but I get the message that we need to make the gap smaller, and we'll work harder to improve.
It's now in master branch. Other related tickets are on their way.
All this while the community waits for a release. We've ended up coding our own, using the java driver and spark's .parallelize to pass the arguments of query to all nodes. Beats waiting on the connector ... Spark 1.3.0 was released March 13 and it's 20/05/2015 without a clear deadline in sight.
Personally, I'm just tired of checking almost daily for updates on these two tickets (just to avoid going non-standard). We took the decision to write our own about a month ago when no reaction came from here ... In fact, we've removed the connector entirely and rely on custom code.
I do hope someone in your project organization takes the time to do the full review and unblock these tickets for the sake of opportunity. It's a beer on me! Yeah, it may have bugs, code may be ugly, but there's a window of opportunity the community expects, especially from momentum technologies like Spark. Two months later is no longer an opportunity, and most have custom-coded their own solutions for Spark 1.3.x/Cassandra 2.x connections.
After an offline discussion, we decided to split the work into the following tickets.
1. This ticket only implements the basic data source API
2. SPARKC-162, Add keyspace/cluster level settings support
3. SPARKC-163, Replace CassandraRelation by CassandraResourceRelation for CassandraCatalog
4. SPARKC-135, Create a metastore to store metadata
5. SPARKC-140, Add custom DDL parser
6. SPARKC-137, Add table Creation command
7. SPARKC-129, Add custom data types
The Spark SQL Data Sources API is described at https://databricks.com/blog/2015/01/09/spark-sql-data-sources-api-unified-data-access-for-the-spark-platform.html
Summary of the changes
This ticket implements a Cassandra data source and updates CassandraCatalog to use it. It also creates a metastore to store metadata for tables from different data sources.
1. Cassandra data source
create a temp table as
sqlContext.sql(
  s"""
    |CREATE TEMPORARY TABLE tmpTable
    |USING org.apache.spark.sql.cassandra
    |OPTIONS (
    |  c_table "table",
    |  keyspace "keyspace",
    |  cluster "cluster",
    |  push_down "true",
    |  spark_cassandra_input_page_row_size "10",
    |  spark_cassandra_output_consistency_level "ONE",
    |  spark_cassandra_connection_timeout_ms "1000"
    |)
  """.stripMargin.replaceAll("\n", " "))
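Since the statement above is plain string assembly, the OPTIONS clause could also be generated from a list of key/value pairs. The following is a minimal sketch that uses only the Scala standard library; `TempTableDdl` and `createTempTable` are hypothetical helper names for illustration and are not part of the connector. It assumes option values contain no double quotes.

```scala
object TempTableDdl {
  // Hypothetical helper (not part of the connector): renders a
  // CREATE TEMPORARY TABLE statement for the data source named by `provider`.
  // Assumption: option values contain no double quotes, so no escaping is done.
  def createTempTable(
      table: String,
      provider: String,
      options: Seq[(String, String)]): String = {
    // Each option becomes: key "value"
    val rendered = options
      .map { case (key, value) => key + " \"" + value + "\"" }
      .mkString(", ")
    s"CREATE TEMPORARY TABLE $table USING $provider OPTIONS ( $rendered )"
  }
}
```

The resulting string can be passed to sqlContext.sql in the same way as the hand-written statement, for example with Seq("c_table" -> "table", "keyspace" -> "keyspace") as the options.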
drop a temp table as
sqlContext.dropTempTable("tmpTable")
save table to another table as
sqlContext.sql("SELECT a, b from ddlTable").save("org.apache.spark.sql.cassandra", ErrorIfExists, Map("c_table" -> "test_insert1", "keyspace" -> "sql_test"))
create datasource relation
override def createRelation(
    sqlContext: SQLContext,
    parameters: Map[String, String]): BaseRelation

override def createRelation(
    sqlContext: SQLContext,
    parameters: Map[String, String],
    schema: StructType): BaseRelation

override def createRelation(
    sqlContext: SQLContext,
    mode: SaveMode,
    parameters: Map[String, String],
    data: DataFrame): BaseRelation

def apply(
    tableRef: TableRef,
    schema: Option[StructType] = None,
    sourceOptions: CassandraSourceOptions = CassandraSourceOptions())(
    implicit sqlContext: SQLContext): CassandraSourceRelation
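Each createRelation overload receives the OPTIONS clause as parameters: Map[String, String]. One way such a map could be turned into a table reference is sketched below. The `TableRefSketch` case class and `fromParameters` function are hypothetical stand-ins for illustration, not the connector's TableRef; treating c_table and keyspace as required and cluster as optional is likewise an assumption.

```scala
object RelationParameters {
  // Hypothetical stand-in for the connector's TableRef.
  final case class TableRefSketch(
      table: String,
      keyspace: String,
      cluster: Option[String])

  // Reads the option keys used in the temp table example (c_table, keyspace,
  // cluster). A missing required key is reported as a Left, not thrown.
  def fromParameters(
      parameters: Map[String, String]): Either[String, TableRefSketch] =
    (parameters.get("c_table"), parameters.get("keyspace")) match {
      case (Some(table), Some(keyspace)) =>
        Right(TableRefSketch(table, keyspace, parameters.get("cluster")))
      case (None, _) => Left("missing option: c_table")
      case (_, None) => Left("missing option: keyspace")
    }
}
```

Returning an Either keeps option validation separate from relation construction, so a caller can decide whether a missing key should fail the query or fall back to a default.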