Remove internal conversion of timestmap to DateTimeOffset

Description

We are using Cassandra.CSharp.Driver for about 6 months now and only with legacy cassandra tables (pre-CQL/thrift CF), keeping backward compatability with legacy code. We are constructing custom queries specifying .NET ticks as a record Timestamp

Right now I am studying Linq2Cql API and I was able to adapt thirft-like tables to it successfully using correct POCO mappings.

I also would like to continue to keep backward compatability with existing code and specify .NET ticks as a Timestamp automatically. I use custom written TimestampGenerator:

I wrote some tests and accidentally noticed that timestamp that is written is wrong. For example, instead of expected 636948894002383858, writetime(column) had a value of 1559292820678264 (values are only for reference).

Okay, let's fix TimestampGenerator with this:

Let's run some tests and we will get an exception:

Hmm, okay, let's find out what is happening in QueryProtocolOptions.Timestamp property:

Alright! My first version of code was correct, but it is just misleading, so I kept reading the code.

QueryProtocolOptions.Timestamp property is using to construct ExecuteRequest. Then ExecuteRequest's WriteFrame is called in RequestHandler where QueryOptions is begin written to a stream:

Voila! QueryProtocolOptions writing timestamp value from private field and not from using the Timestamp property:

These things are conflicting and making it impossible to use large .NET ticks for write queries using TimestampGenerator

Also, driver will convert timestamp to unix-like timestamp if I set the timestamp in an insert query (using .SetTimestamp) which is also not an obvious behaviour.

Overall, there are two main problems:

1. Driver is using wrong timestamp values when validating and writing
2. Driver is automatically converting timestamp which absolutely not obvious

I expect to have more control of timestamps

Looking forward to your thoughts. Thank you

  • Lev

Environment

None

Activity

Show:
Joao Reis
May 31, 2019, 11:16 PM

Ok it's debatable whether this is a bug or not (since it's documented that the timestamp should be in microseconds since epoch) but in any case we can do a very small improvement that will most likely fix your issue which is to remove any usage of the legacy Timestamp property internally. We can't change the property itself because it would be a breaking change but as long as you don't use it in your application code it won't try to convert the timestamp to a DateTimeOffset.

I can't say for sure yet if this improvement will fix it because I haven't done extensive testing to see what occurs when a TimestampGenerator returns Ticks.

We just had a release last week so we don't have a date for the next release yet but we will try to include this improvement in the next release.

Лев Димов
May 31, 2019, 11:32 PM
Edited

Ok it's debatable whether this is a bug or not

I agree with this.

remove any usage of the legacy Timestamp property internally

Yes I believe it will help us, because I can just return ticks from TimestampGenerator and QueryProtocolOptions.CreateFromQuery will use it directly. But if I set Timestamp right in a query (using .SetTimestamp() the driver implicitly converts ticks to microseconds since epoch. This silent behaviour doesn't seem good to me though I fully understand that this is now the legacy we cannot fix. I also see this as a problem because of how inconsistent the driver is using provided timestamps

it's documented that the timestamp should be in microseconds since epoch

Could you please provide a link to this? I cannot remember if this is a Cassandra's or driver's statement. I also don't understand why this should impose such a strict restriction on the driver.

We just had a release last week so we don't have a date for the next release yet but we will try to include this improvement in the next release.

Thank you a lot, Joao, I am looking forward to it.

Joao Reis
June 1, 2019, 12:05 AM
Edited

The CQL v3 protocol states:

0x20: With default timestamp. If present, <timestamp> should be present.
<timestamp> is a [long] representing the default timestamp for the query
in microseconds (negative values are discouraged but supported for
backward compatibility reasons except for the smallest negative
value (-2^63) that is forbidden). If provided, this will
replace the server side assigned timestamp as default timestamp.
Note that a timestamp in the query itself will still override
this timestamp. This is entirely optional.

DataStax documentation of CQL v3 also states:

TIMESTAMP epoch_in_microseconds
Marks inserted data (write time) with TIMESTAMP. Enter the time since epoch (January 1, 1970) in microseconds. By default, Cassandra uses the actual time of write.

The official apache cassandra documentation doesn't actually specify that the timestamp should be in microseconds, it just states that the default is microseconds from epoch.

But I guess you're not using CQL v3 protocol so my statement might have been incorrect, I'm not sure of what the documentation stated for the previous protocol versions.

From the driver's perspective, the api docs for ITimestampGenerator states:

Represents client-side, microsecond-precision query timestamps.

The driver's documentation page could be clearer on this subject but it does mention this:

In Apache Cassandra, each mutation has a microsecond-precision timestamp, which is used to order operations relative to each other.

I do agree that we should probably have a way for users to directly specify a long value which isn't converted by the driver into microseconds since epoch. We will evaluate this in the next major version at the same time that we remove this problematic Timestamp property. For now a custom TimestampGenerator should work since the return value is a long and no conversion is done by the driver when sending the request to the server (assuming we add the improvement to stop using the Timestamp property internally).

Лев Димов
June 1, 2019, 1:14 AM

I believe it is CQL v3 because we have a cluster of Cassandra 3.11.3 (I guess driver automatically choose protocol version according to cluster version).

Лев Димов
June 1, 2019, 5:24 AM

So I read some code, analyzed it with an IDE and made a PR https://github.com/datastax/csharp-driver/pull/455

I analyzed every external usings of the QueryProtocolOptions.Timestamp property and made sure it was used just for nullability checks. I also analyzed QueryProtocolOptions and found no usings of class in public API which tells us that such change would not be a "braking" one.

Fixed

Assignee

Unassigned

Reporter

Лев Димов

Labels

None

Reproduced in

None

PM Priority

None

Fix versions

External issue ID

None

Doc Impact

None

Reviewer

None

Pull Request

None

Epic Link

None

Sprint

C# P-NEXT

Pull Requests

None

Size

None

Affects versions

Priority

Major