Fix generated timestamp on retry
Activity
Luc May 12, 2025 at 9:02 PM
Hello,
I agree with what you said, and at first sight this approach does not seem ideal. I added some modifications to the first code proposition, taking into account some constraints related to the Node.js driver, the usage of other drivers you mentioned above, and the need to avoid introducing breaking changes:
Everything related to the timestamp attached to the request can easily be moved to a more appropriate place, since we just call getOrGenerateTimestamp() on execOptions here.
As you mentioned here for the Java driver, a custom retry policy can be used to change the timestamp. The same should be possible with the Node.js driver. At the time of writing, the only way to access and potentially modify the timestamp without modifying the retry policy mechanisms of the driver is through operationInfo.executionOptions, as defined here.
Looking at the Python driver’s QueryMessage, it maintains the state of the timestamp through the lifetime of the query, so we are assured that the timestamp will never be generated twice with a different value. execOptions is generated from the queryOptions passed by the user at request creation time and is shared by all executions of the same query. This is the flow of the timestamp for a query: take the timestamp from queryOptions and use this value in execOptions for future executions, or, if no timestamp is provided in queryOptions, generate one (currently one per execution) in execOptions.
In every case, execOptions is used to access or generate the timestamp. There is no need to mirror its state in another object (not even in queryOptions, as explained below) since only execOptions uses it. execOptions can maintain its own state as required, and from a JS perspective, methods defined on the instance prototype are there to manage the state of the object instance.

queryOptions.timestamp should be used as an initialization value only, not as state. Mutating queryOptions.timestamp through execOptions on a retry, or when generating a timestamp, can have a dramatic impact when queryOptions is an object shared by multiple queries (the same object passed to different execute calls). Given a flow of requests, requests that have not been executed yet could have their timestamp modified and end up not marked with the timestamp originally specified. A sketch of this idea follows below.
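To make the idea concrete, here is a minimal sketch, assuming a hypothetical ExecutionOptions-like class (the names below are illustrative, not the driver’s actual internals): the first call to getOrGenerateTimestamp() fixes the value for the lifetime of the execution options, so every retry reuses it.

```javascript
// Illustrative sketch only: class and generator shapes are hypothetical,
// not the driver's actual internals.
class ExecutionOptions {
  constructor(queryOptions, timestampGenerator) {
    // queryOptions.timestamp is read once, as an initialization value;
    // queryOptions itself is never mutated.
    this._timestamp = queryOptions.timestamp;
    this._generator = timestampGenerator;
  }

  getOrGenerateTimestamp() {
    // Generate at most once: every retry or speculative execution of
    // this request reuses the same value.
    if (this._timestamp === undefined && this._generator) {
      this._timestamp = this._generator.next();
    }
    return this._timestamp;
  }
}
```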
Concerning execOptions.getTimestamp(): is it still relevant to return undefined when a timestamp generator is used, now that the state is managed internally, or is it necessary to keep track of whether a timestamp was user-provided or generated?
I understand your desire to converge towards a structure unified with the other drivers for simplicity, but I think we are not doing anything wrong on this driver: we are using the APIs as they are meant to be used, without introducing complexity or breaking changes.
Bret McGuire May 8, 2025 at 11:38 PM
I think you’re on to something but I think we need to implement a fix at a different spot in the driver.
I’ll take a step back to explain my reasoning. Let’s look at a couple of other drivers by way of comparison. The 4.x Java driver retry policy returns the request to be used on retry, although by default this is just the original request. The Python driver does something very similar. They’re able to do this because the driver (a) implements a “handler” object which governs the flow of a given request through its lifecycle and (b) maintains the message being sent as state within that handler… and this state includes the request timestamp. The Java driver manages request flow (including retries and speculative executions) via CqlRequestHandler, while the message (and its timestamp) is represented as a discrete Java object (see the query message as an example). Python does something similar with different constructs; the ResponseFuture returned by executeX() calls manages the request lifecycle, while request messages (and their timestamps) are encapsulated in Python objects.
The Node.js driver does something similar… with an exception that’s pretty relevant to this discussion. There is a common request flow in RequestExecution. This logic looks very similar to the Java and Python examples, but note that the stateful request contained here comes from the parent object, a RequestHandler. That request is created within Client, but the crucial thing here is that it doesn’t track the request timestamp. The serialization logic for that request simply lets the timestamp be computed by the specified compute function… which gets into the logic you cite above.
It seems to me that the right way to fix this logic is to do something similar to what’s done in the other drivers and persist the timestamp as part of the underlying messages; a sketch of what that could look like follows below. I’d much rather make that change than modify how default timestamps are generated. It shouldn’t have much of an impact on other driver functionality, but we should definitely make sure we’re covered test-wise for any such change.
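By way of illustration, a minimal sketch of persisting the timestamp as message state, assuming a hypothetical request class (these names are stand-ins, not the driver’s actual Request/RequestExecution implementation): the timestamp is resolved once, stored on the message, and every serialization attempt reads the stored value.

```javascript
// Illustrative sketch only: names are hypothetical stand-ins, not the
// driver's actual request classes.
class QueryRequest {
  constructor(query, params, execOptions) {
    this.query = query;
    this.params = params;
    this.execOptions = execOptions;
    this.timestamp = undefined; // resolved once, then part of message state
  }

  write(encoder) {
    if (this.timestamp === undefined) {
      // First attempt: fix the timestamp as part of the message.
      this.timestamp = this.execOptions.getOrGenerateTimestamp();
    }
    // Retries and speculative executions serialize the same value, so
    // Cassandra treats them as the same write rather than a newer one.
    return encoder.encodeQuery(this.query, this.params, this.timestamp);
  }
}
```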
I’ll see if I can make this happen shortly… I completely agree that the current behaviour is pretty clearly wrong on its face.
Bret McGuire April 27, 2025 at 8:55 PM
Thanks for the ticket and PR! I’ll get these materials linked together, and one of us will try to get a review for you shortly!
Hello,
This is a duplicate of GitHub #438, for reference.
There is an issue with timestamps generated by timestamp generators in DefaultExecutionOptions. Whether or not the request is prepared, any retry, including a speculative execution, will generate a new timestamp for the query when starting a new execution for the same request object.

As interpreted by Cassandra, the latest timestamp takes precedence. Consequently, queries that executed and were marked as successful can be overwritten by past queries when the execution is retried. The user believes it is a retry but, in fact, it is a new query.

This happens in the WriteQueue: calling request.write(...) calls writeQueryParameters(...) and then getOrGenerateTimestamp(), which always generates a new timestamp for each new execution. It should generate one timestamp and reuse it for each consecutive call, so that the operation is a retry from Cassandra’s point of view.

I did not write any tests for this change since I don’t have a proper setup and configuration on my side, which causes tests to fail. I think testing on the client side only, like the sketch below, is enough:
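For reference, here is a minimal sketch of the kind of client-side check meant here (all names are hypothetical stand-ins, not the driver’s API): a deterministic generator makes any accidental regeneration visible, and the assertion checks that a retry reuses the first timestamp.

```javascript
// Illustrative client-side check: names are hypothetical stand-ins for
// the driver's internals, not its actual API.
const assert = require('assert');

// A deterministic generator: each call to next() returns a new value,
// so any accidental regeneration is immediately visible.
let counter = 0;
const generator = { next: () => ++counter };

// Stand-in for the execution options of a single query: the timestamp
// is generated at most once and then reused.
const execOptions = {
  _timestamp: undefined,
  getOrGenerateTimestamp() {
    if (this._timestamp === undefined) {
      this._timestamp = generator.next();
    }
    return this._timestamp;
  }
};

// The first execution generates the timestamp...
const first = execOptions.getOrGenerateTimestamp();
// ...and a retry / speculative execution of the same request reuses it.
const second = execOptions.getOrGenerateTimestamp();
assert.strictEqual(first, second, 'a retry must reuse the first timestamp');
```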
Thank you very much,
Have a good day,
Luc