Delay seen while querying a table with prepare set to true for first time when one node is down

Description

I was running tests that query data from a Cassandra cluster and noticed a small delay when one node is down. I tested with driver versions 3.5 and 4.3.1. The delay appears only when the prepare flag is set to true in the query options and a table containing data is queried for the first time. Is this delay expected, or is it a bug?

This is how I tested.

  1. Set up a Cassandra cluster with 3 nodes, each node running in its own VM.

  2. Ran a script to create a keyspace (SimpleStrategy, replication factor of 3) and 10 tables, and to insert a few records into each table.

  3. Ran a script to query data from each table (with the prepare=true query option). Everything worked without any delay, as all nodes were up.

  4. Brought one Cassandra node down by shutting down its VM.

  5. Ran the query script again and saw a delay of around 3 seconds when a table was queried for the first time. The delay is seen only when prepare=true is set.
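
For readers without access to the linked scripts, steps 2, 3 and 5 can be sketched roughly as follows. This is not the reporter's code: the keyspace, table and column names are hypothetical, and `client` is assumed to be an already connected cassandra-driver `Client`.

```javascript
// Sketch of steps 2, 3 and 5. Keyspace, table and column names are hypothetical.
const KEYSPACE = 'delay_test';
const TABLES = Array.from({ length: 10 }, (_, i) => `table_${i + 1}`);

// Step 2: CQL for a SimpleStrategy keyspace with replication factor 3 and 10 tables.
function schemaStatements() {
  const createKeyspace =
    `CREATE KEYSPACE IF NOT EXISTS ${KEYSPACE} WITH replication = ` +
    `{'class': 'SimpleStrategy', 'replication_factor': 3}`;
  const createTables = TABLES.map(
    (t) => `CREATE TABLE IF NOT EXISTS ${KEYSPACE}.${t} (id int PRIMARY KEY, value text)`
  );
  return [createKeyspace, ...createTables];
}

// Steps 3 and 5: query each table once with prepare: true and record the elapsed time.
// `client` is expected to be a connected cassandra-driver Client.
async function timeFirstQueries(client) {
  const timings = [];
  for (const table of TABLES) {
    const query = `SELECT * FROM ${KEYSPACE}.${table} WHERE id = ?`;
    const start = process.hrtime.bigint();
    await client.execute(query, [1], { prepare: true });
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    timings.push({ table, elapsedMs });
  }
  return timings;
}
```

Running `timeFirstQueries` once with all nodes up and once after stopping a node gives the two sets of timings compared in steps 3 and 5.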

Below is the link to my scripts and the output of my tests: https://drive.google.com/drive/folders/1_JLnyFzGtNE8DtvDhIJ9jEb4lhgCspbC?usp=sharing

Thanks, Sarath

Environment

None

Pull Requests

None

Activity

Jorge Bay Gondra
November 27, 2019, 8:03 PM

Can you identify whether the delay occurs when calling the connect() method or when calling execute()?

Jorge Bay Gondra
November 27, 2019, 8:05 PM

It can also be helpful to start logging before running connect() (move it one line above).
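
One way to follow both suggestions is sketched below, assuming the driver's `'log'` event on the client and a client that has not connected yet. The helper names `timed` and `locateDelay` are hypothetical, and `query` and `params` stand in for one of the test queries.

```javascript
// Times an async step and returns its duration in milliseconds.
async function timed(label, fn) {
  const start = Date.now();
  await fn();
  const ms = Date.now() - start;
  console.log(`${label} took ${ms} ms`);
  return ms;
}

// `client` is expected to be a cassandra-driver Client that has not connected yet.
// `query` and `params` are placeholders for one of the test queries.
async function locateDelay(client, query, params) {
  // Register the listener before connect() so connection-time messages are captured.
  client.on('log', (level, loggerName, message) => {
    if (level !== 'verbose') {
      console.log(`[${level}] ${loggerName}: ${message}`);
    }
  });
  const connectMs = await timed('connect()', () => client.connect());
  const executeMs = await timed('execute()', () =>
    client.execute(query, params, { prepare: true }));
  return { connectMs, executeMs };
}
```

Comparing `connectMs` and `executeMs` with one node down shows which of the two calls absorbs the delay.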

sarath ambadas
December 5, 2019, 5:33 AM

The delay happens in both places. I added new logs for both the 4.3.1 and 3.5 drivers. I am just calling execute in a loop.

Link to logs and test code:

Jorge Bay Gondra
December 5, 2019, 8:17 PM

Thanks to all the info you provided, I was able to reproduce it.

I’ve summarized the underlying cause in NODEJS-584.

This is unlikely to occur in production as query preparation occurs once in the application lifetime. Once query preparation ends, the latency spike will no longer happen and reconnection attempts will be made in the background.

If you use fewer distinct queries and/or a higher number of executions, you should see latency stabilize. For example, with 6 different queries/tables and 50 executions, the response time should be predictable after the first 6 queries.
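
A rough way to check this is to run the same set of queries for several rounds and compare the slowest execution in each round. The sketch below assumes a connected cassandra-driver `Client`, reads "50 executions" as 50 rounds over the query set, and assumes the queries take no bind parameters. The function name `maxLatencyPerRound` is hypothetical.

```javascript
// Runs every query once per round and records the slowest execution of each round.
// `client` is expected to be a connected cassandra-driver Client;
// `queries` is an array of CQL strings (for example, one SELECT per table).
async function maxLatencyPerRound(client, queries, rounds) {
  const maxPerRound = [];
  for (let round = 0; round < rounds; round++) {
    let maxMs = 0;
    for (const query of queries) {
      const start = Date.now();
      await client.execute(query, [], { prepare: true });
      maxMs = Math.max(maxMs, Date.now() - start);
    }
    maxPerRound.push(maxMs);
  }
  return maxPerRound;
}
```

With 6 queries and 50 rounds, only the first entry of the result includes query preparation, so the remaining entries should be close to each other if latency has stabilized.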

Thanks for your dedication in providing such detailed info!

I’ll close this ticket; you can follow the development for more info.

Jorge Bay Gondra
December 5, 2019, 8:20 PM

Additionally, if you want to avoid this issue altogether, you can set prepareOnAllHosts to false in your client options:
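
A minimal sketch of that configuration, with hypothetical contact points, data center name and keyspace (only `prepareOnAllHosts: false` comes from the comment above):

```javascript
// Client options with preparation on all hosts disabled.
// Contact points, data center name and keyspace are hypothetical placeholders.
function buildClientOptions() {
  return {
    contactPoints: ['10.0.0.1', '10.0.0.2', '10.0.0.3'],
    localDataCenter: 'datacenter1',
    keyspace: 'delay_test',
    prepareOnAllHosts: false
  };
}

// The driver is loaded lazily so the options can be inspected without it installed.
function createClient() {
  const cassandra = require('cassandra-driver');
  return new cassandra.Client(buildClientOptions());
}
```

With this option disabled, a query is prepared on the node that handles the request instead of on every node up front, so other nodes are expected to prepare it the first time they receive it.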

Duplicate

Assignee

Unassigned

Reporter

sarath ambadas

Reviewer

None

Fix versions

None

Labels

None

Components

PM Priority

None

Reproduced in

3.5.0
4.3.1

Pull Request

None

Affects versions

Priority

Major