JAVA-727: Allow monotonic timestamp generators to drift in the future + use microsecond precision when possible.

Description

The monotonic timestamp generators introduced in 2.1 provide monotonically increasing timestamps. However, if there are more than 999 writes/deletes in a single millisecond, the same timestamp is returned repeatedly until the clock moves past the current millisecond, and the following statement is logged repeatedly:
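
To make the overflow concrete, here is a minimal, self-contained model of the behavior described above. It is a sketch under stated assumptions, not the driver's actual code: the class name SaturatingGeneratorSketch, the next(long nowMillis) method, and passing the clock reading in as an argument are all hypothetical and chosen only so the clock can be frozen or rewound.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the pre-fix behavior; not the driver's actual code.
final class SaturatingGeneratorSketch {
    // Last timestamp handed out, in microseconds (millis * 1000 + counter).
    private final AtomicLong last = new AtomicLong(0);

    // nowMillis is the clock reading, passed in so a test can freeze or rewind it.
    long next(long nowMillis) {
        while (true) {
            long previous = last.get();
            long candidate = compute(previous, nowMillis);
            if (last.compareAndSet(previous, candidate)) {
                return candidate;
            }
        }
    }

    private static long compute(long previous, long nowMillis) {
        long previousMillis = previous / 1000;
        long counter = previous % 1000;
        if (previousMillis >= nowMillis) {
            // Same (or earlier) millisecond: bump the sub-millisecond counter,
            // but saturate at 999 so the value never spills into the next millisecond.
            if (counter == 999) {
                return previous; // overflow: the same timestamp is handed out again
            }
            return previous + 1;
        }
        // The clock has advanced: restart the counter at 0.
        return nowMillis * 1000;
    }
}
```

With the clock frozen at one millisecond, this model returns distinct values up to a counter of 999 and then repeats the last one. It keeps repeating while the clock reading stays at or behind that millisecond, which is what makes a clock set-back harmful.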

Sub-millisecond counter overflowed, some query timestamps will not be distinct

This is normally OK, as generating 1000 statements in less than a millisecond is not a common use case.

The problem arises when the clock is set back by an arbitrary period of time, such as during a leap second, when the 23:59:59 second is repeated at midnight. 1000 writes in a single second is definitely something you could expect.

The danger is that if multiple write operations for the same column use the same timestamp, the writes are no longer atomic. The problem that arises when using the same timestamp for a write operation on the same column(s) is described in this comment in CASSANDRA-6106.

If there were some way to prevent this situation, such as advancing the millis value and resetting the counter as in the 'else' case in AbstractMonotonicTimestampGenerator, we would be protected from this problem. However, advancing the millisecond may have implications, as our timestamps could drift into the future, which could be bad. Some strategies are discussed in CASSANDRA-6106.
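
The idea of advancing the millis value and resetting the counter can be sketched as follows. This is an illustrative model only: the class DriftingComputeSketch and the method computeNext(long previous, long nowMillis) are hypothetical names, and the timestamp layout (milliseconds times 1000 plus a 0..999 counter) is an assumption taken from the description above.

```java
// Hypothetical sketch of the proposed strategy; not the driver's actual code.
final class DriftingComputeSketch {
    private DriftingComputeSketch() {}

    // previous and the return value are in microseconds; nowMillis is the clock reading.
    static long computeNext(long previous, long nowMillis) {
        long nowMicros = nowMillis * 1000;
        if (previous >= nowMicros) {
            // The clock has not moved past the last timestamp (busy millisecond
            // or clock set back): keep incrementing. When the counter part is 999,
            // the increment rolls over into the next millisecond with a counter
            // of 0, so the timestamp drifts ahead of the clock instead of repeating.
            return previous + 1;
        }
        // The clock is ahead of the last timestamp: follow the clock.
        return nowMicros;
    }
}
```

The trade-off is the one raised above: timestamps stay distinct, but under sustained load or after a clock set-back they run ahead of wall-clock time until the clock catches up.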


Resolution:
Monotonic timestamp generators (AtomicMonotonicTimestampGenerator and ThreadLocalMonotonicTimestampGenerator) now drift into the future when more than 999 values are generated per microsecond. The default implementations log a warning when that happens (use the constructors with arguments to control how often).

It's also possible to implement generators that do something else when the timestamps drift, by extending AbstractMonotonicTimestampGenerator and implementing onDrift.
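
As a rough illustration of a custom reaction to drift, the sketch below records the largest observed drift instead of logging. To stay runnable on its own it does not use the driver: MonotonicGeneratorSketch is a hypothetical stand-in for AbstractMonotonicTimestampGenerator, and the onDrift(long currentTick, long lastTimestamp) parameter list is an assumption that should be checked against the driver's Javadoc before porting this to the real class.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the driver's base class, so the example runs on its own.
abstract class MonotonicGeneratorSketch {
    private final AtomicLong last = new AtomicLong(0);

    // nowMicros is the clock reading in microseconds, passed in for testability.
    long next(long nowMicros) {
        while (true) {
            long previous = last.get();
            long candidate;
            if (previous >= nowMicros) {
                // Timestamps are at or ahead of the clock: notify the hook and drift.
                // Note: under contention the hook may run again if the CAS is retried.
                onDrift(nowMicros, previous);
                candidate = previous + 1;
            } else {
                candidate = nowMicros;
            }
            if (last.compareAndSet(previous, candidate)) {
                return candidate;
            }
        }
    }

    // Assumed hook shape: called whenever the generator has to go past the clock.
    protected abstract void onDrift(long currentTick, long lastTimestamp);
}

// Custom reaction: remember the largest drift seen, in microseconds.
final class DriftTrackingGenerator extends MonotonicGeneratorSketch {
    final AtomicLong maxDriftMicros = new AtomicLong(0);

    @Override
    protected void onDrift(long currentTick, long lastTimestamp) {
        long drift = lastTimestamp - currentTick;
        maxDriftMicros.accumulateAndGet(drift, Math::max);
    }
}
```

A value like maxDriftMicros could be exported as a metric or used to throttle writers. Which reaction is appropriate depends on the application.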

Environment

None

Pull Requests

None

Attachments

1

Activity

Andy Tolbert April 26, 2015 at 3:36 AM
Edited

Attached a test case which demonstrates behavior where overflow of the millisecond counter causes the most recent write for a column not to be returned, because Cassandra breaks ties by returning the highest value for that column.
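
The tie-breaking effect can be modeled in a few lines. This is a simplified sketch, not Cassandra's code: CellSketch and reconcile are hypothetical names, String.compareTo stands in for Cassandra's comparison of raw column values, and deletes are ignored.

```java
// Hypothetical, simplified model of how two versions of a column are reconciled.
final class CellSketch {
    final String value;
    final long timestamp;

    CellSketch(String value, long timestamp) {
        this.value = value;
        this.timestamp = timestamp;
    }

    // The higher timestamp wins; on a tie, the greater value wins,
    // regardless of which write was actually issued last.
    static CellSketch reconcile(CellSketch a, CellSketch b) {
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }
}
```

If a client writes "b" and then "a" with the same timestamp, this model keeps "b", so the most recent write is not the one read back. With distinct timestamps the later write wins as expected.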

Should fail with output:

Andy Tolbert April 24, 2015 at 10:03 PM

I'll attach a test case for this later today. An easy way to overflow programmatically would be to manipulate the clock in AbstractTimestampGenerator.

Fixed

Created April 24, 2015 at 10:02 PM
Updated January 5, 2021 at 10:02 AM
Resolved March 9, 2016 at 9:29 PM