Speculative query retries

Description

Cassandra implemented speculative retries in 2.0.2 to reduce latency at high percentiles. The driver should implement a similar mechanism to offer smoother latencies even when some nodes experience hiccups.

Resources:
http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2
https://issues.apache.org/jira/browse/CASSANDRA-4705
https://issues.apache.org/jira/browse/CASSANDRA-5932

Environment

None

Pull Requests

None

Activity

Michaël Figuière
April 16, 2015, 3:09 PM

A fair simplification would be to avoid speculative retries with any SimpleStatement, since we already can't do token-aware balancing for these statements. The idea has always been that non-prepared statements get fewer optimizations, besides the lack of pre-compilation.

Olivier Michallat
April 16, 2015, 4:34 PM

I've added support for idempotency (see the pull request).

By default, isIdempotent is inferred for BuiltStatements, and false for all other statements. You can override it at the statement level.

There is also an option to change the default, i.e. make all regular statements idempotent by default. This is useful if you know you'll never use counters or list appends in your app, and want to avoid the hassle of setting the flag on each statement.
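As a rough illustration of the resolution order described above, here is a minimal, self-contained sketch. It does not use the driver's classes: `SketchStatement`, `isIdempotent(statement, defaultIdempotence)` and `maySpeculate` are hypothetical stand-ins, and it assumes that only idempotent statements are eligible for speculative execution.

```java
// Hypothetical model for illustration only; these are not the driver's classes.
final class IdempotenceSketch {

    /** Minimal stand-in for a statement carrying an optional per-statement flag. */
    static final class SketchStatement {
        private final String query;
        private Boolean idempotent; // null means "not set, fall back to the default"

        SketchStatement(String query) {
            this.query = query;
        }

        String getQuery() {
            return query;
        }

        /** Statement-level override of the idempotence flag. */
        SketchStatement setIdempotent(boolean value) {
            this.idempotent = value;
            return this;
        }
    }

    /** A statement-level flag wins; otherwise the configured default applies. */
    static boolean isIdempotent(SketchStatement statement, boolean defaultIdempotence) {
        return statement.idempotent != null ? statement.idempotent : defaultIdempotence;
    }

    /** Assumption: a statement may be executed speculatively only if it is idempotent. */
    static boolean maySpeculate(SketchStatement statement, boolean defaultIdempotence) {
        return isIdempotent(statement, defaultIdempotence);
    }

    public static void main(String[] args) {
        SketchStatement read = new SketchStatement("SELECT v FROM t WHERE k = 1");
        SketchStatement counter = new SketchStatement("UPDATE t SET c = c + 1 WHERE k = 1");

        // With the default left at false, nothing is speculated unless flagged.
        System.out.println(maySpeculate(read, false));
        System.out.println(maySpeculate(read.setIdempotent(true), false));

        // With the default flipped to true, a counter update must opt out explicitly.
        System.out.println(maySpeculate(counter.setIdempotent(false), true));
    }
}
```

Flipping the default trades safety for convenience: a counter update or list append that is left unflagged would then be treated as safe to execute twice.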

Andy Tolbert
April 24, 2015, 4:19 PM

Verified the following:

  • Speculative executions are scheduled and executed if the previous execution(s) have not completed.

  • Ensured that futures tied to an execution are updated when a subsequent execution completes first.

  • Tested with low delays (less than the mean response time, 10-100ms) to stress speculative executions starting at or near the same time as their previous execution. Identified a bug where the stream id generated for the 2nd execution was used for the 1st, causing responses to be set on requests that didn't match; this has since been fixed.

  • Executed a long-running test against a heavily strained Cassandra cluster (CPU-bound). Tested with ConstantSpeculativeExecutionPolicy with an 800ms delay (~98-99th percentile response times) and 3 executions. Observed that speculative executions were submitted for ~2% of requests.
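For readers unfamiliar with the mechanism being verified, the constant-delay behaviour can be sketched roughly as follows. This is not the driver's implementation: `SpeculativeExecutionSketch`, `execute`, and the parameters `delayMillis` and `maxSpeculativeExecutions` are hypothetical names, each "host" is modelled as a plain `Supplier` of a future, and failures are only reported once every started execution has failed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch for illustration only; these are not the driver's classes.
final class SpeculativeExecutionSketch {

    /**
     * Starts the first execution immediately, then schedules one more execution
     * per constant delay, as long as no earlier execution has completed.
     * The first successful response completes the returned future.
     */
    static <T> CompletableFuture<T> execute(
            List<Supplier<CompletableFuture<T>>> hosts,
            long delayMillis,
            int maxSpeculativeExecutions,
            ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        int attempts = Math.min(hosts.size(), 1 + maxSpeculativeExecutions);
        if (attempts <= 0) {
            result.completeExceptionally(new IllegalArgumentException("no host to query"));
            return result;
        }
        AtomicInteger failures = new AtomicInteger();
        for (int i = 0; i < attempts; i++) {
            Supplier<CompletableFuture<T>> host = hosts.get(i);
            Runnable start = () -> {
                if (result.isDone()) {
                    return; // an earlier execution already answered: nothing to speculate
                }
                host.get().whenComplete((value, error) -> {
                    if (error == null) {
                        result.complete(value); // no-op if another execution won the race
                    } else if (failures.incrementAndGet() == attempts) {
                        result.completeExceptionally(error); // every execution failed
                    }
                });
            };
            if (i == 0) {
                start.run();
            } else {
                scheduler.schedule(start, i * delayMillis, TimeUnit.MILLISECONDS);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            // A stalled host that never answers, and a healthy one that answers at once.
            Supplier<CompletableFuture<String>> stalled = () -> new CompletableFuture<String>();
            Supplier<CompletableFuture<String>> healthy =
                    () -> CompletableFuture.completedFuture("row from healthy host");
            String row = execute(Arrays.asList(stalled, healthy), 50, 1, scheduler).join();
            System.out.println(row);
        } finally {
            scheduler.shutdown();
        }
    }
}
```

Choosing a delay near a high percentile of response times, as in the 800ms test above, keeps the extra load small because only the slowest requests ever trigger a second execution.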

Tali Ben-Meir
May 6, 2015, 7:19 AM

When will this feature be merged into 2.1.x driver code?

Olivier Michallat
May 6, 2015, 7:33 AM

This will be in 2.1.6, which will most likely be dedicated to merging the changes from 2.0.10 (with no new 2.1 features). I'll make an announcement later this week on the Google group.

Fixed

Assignee

Olivier Michallat

Reporter

Michaël Figuière

Labels

PM Priority

None

Affects versions

None

Fix versions

Pull Request

None

Doc Impact

None

Size

None

External issue ID

None

Components

Priority

Major