Error including stack trace in client log when Cassandra node is brought down gracefully, even though request is successful

Description

Running a Cassandra 3.11 cluster and a client is using version 3.7.1 of the java-driver.

When 1 node in the cluster is brought down (gracefully) during traffic, the following error message (from the driver) is printed in the client log:

14:18:57.022 [cluster2-worker-32] ERROR c.d.driver.core.RequestHandler - Unexpected error while querying /10.61.204.11
com.datastax.driver.core.exceptions.ConnectionException: /10.61.204.11:12742 Pool is shutdown
at com.datastax.driver.core.HostConnectionPool.closeAsync(HostConnectionPool.java:662)
at com.datastax.driver.core.SessionManager.removePool(SessionManager.java:437)
at com.datastax.driver.core.SessionManager.onDown(SessionManager.java:525)
at com.datastax.driver.core.Cluster$Manager.onDown(Cluster.java:2004)
at com.datastax.driver.core.Cluster$Manager.access$1200(Cluster.java:1390)
at com.datastax.driver.core.Cluster$Manager$5.runMayThrow(Cluster.java:1959)
at com.datastax.driver.core.ExceptionCatchingRunnable.run(ExceptionCatchingRunnable.java:32)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)

But the request (from the client) is still successful (thanks to RF=3)!

Looking at the code in the RequestHandler (c.d.d.c.RequestHandler | stHandler$SpeculativeExecution$1 442)
is shows that the request is passed to the next host (findNextHostAndQuery)

Is it really necessary to log this at ERROR log-level with the full exception stack trace included, when we handle the failure by querying the next host?

Looking at the code in RequestHandler:

{{
public void onFailure(Throwable t) {
if (t instanceof BusyPoolException) {
logError(host.getSocketAddress(), t);
} else {
logger.error("Unexpected error while querying " + host.getAddress(), t); // <-- Logging with ERROR level WITH stacktrace
logError(host.getSocketAddress(), t);
}
findNextHostAndQuery();
}
...

private void logError(InetSocketAddress address, Throwable exception) {
logger.debug("[{}] Error querying {} : {}", id, address, exception.toString()); // <-- Logging with debug level WITHOUT stacktrace
...
}
}}

It feels like the log message in the onFailure-method should not be on error-level (maybe a warning?) and the stacktrace is not needed in that case.

In the logError-method we could include the stack trace since we are logging on debug-level.

Environment

Reproduce:able on a CCM cluster (3.11) using the java-driver example classes:
Modified CreateAndPopulateKeyspace to perform multiple inserts (10000),
and then stopping 1 node in the cluster during the inserting of data...

The error message was not logged every time, but quite often...

Pull Requests

None

Assignee

Unassigned

Reporter

Ted Petersson

Labels

None

PM Priority

None

Reproduced in

None

Affects versions

Fix versions

None

Pull Request

None

Doc Impact

None

Size

None

External issue ID

None

External issue ID

None

Components

Priority

Minor
Configure