NPE using fetchMoreResults

Description

While chasing another problem using fetchMoreResults, I uncovered a NullPointerException:
Caused by: java.lang.NullPointerException
at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
at com.datastax.driver.core.ArrayBackedResultSet.fetchMoreResultsBlocking(ArrayBackedResultSet.java:195)
at com.datastax.driver.core.ArrayBackedResultSet.one(ArrayBackedResultSet.java:148)

This appeared on a single partition (no IN constraint on the composite primary key), reading rows from the partition in reverse order, from right to left.

It appears that with enough testing I was able to hit exactly the window where the outstanding prefetched results finished just as the primary thread was reading to the end of the current set of results. fetchMoreResultsBlocking() invokes fetchMoreResults(), where one sees the following lines:

if (fetchState.inProgress != null)
return fetchState.inProgress;

My hypothesis is that this thread sees that fetchState.inProgress is non-null, the asynchronous thread stores the new completion response, in which nextStart is non-null and inProgress is null, and this thread then returned null as the future.

The fix is to establish a single local reference to fetchState for the duration of the method, guaranteeing a consistent state, or by changing this decision to fetch the inProgress future only once:

ListenableFuture<Void> inProgress = fetchState.inProgress;
if (null != inProgress)
return inProgress;

Environment

DataStax Cassandra 2.0.7
Single node cluster
Windows dual core laptop

Pull Requests

None

Activity

Show:
Sylvain Lebresne
May 2, 2014, 5:41 PM

I think your analysis is spot on, and the suggested fix looks good to me so committed that. Thanks.

William Mitchell
May 4, 2014, 2:00 PM

Looking at the code again, now that I understand fetchState a little better, I think there is still a race condition at the end of the result set.

If I understand the protocol correctly, on the last result set, the protocol returns a paging state of null. Imagine the consuming reader of the result set has already issued a fetchMoreResults once, and the response for this is in the middle of being processed, and represents the final response of the query result. Now suppose the consuming thread issues fetchMoreResults again, either directly or by calling one() after the final row and entering by way of fetchMoreResultsBlocking().

This primary thread may test isFullyFetched() and find fetchState is non-null, meanwhile the response thread updates the fetchState to null to reflect that no more rows are available, then the primary thread tries to access fetchState.inProgress and receive an NPE.

It seems to me we need the slightly larger change, to grab the state at the beginning:

Sylvain Lebresne
May 5, 2014, 8:02 AM

You're right, isFullyFetched() also check the fetchState. Fixed that too then, thanks.

Assignee

Sylvain Lebresne

Reporter

William Mitchell

Labels

None

PM Priority

None

Reproduced in

None

Affects versions

Fix versions

Pull Request

None

Doc Impact

None

Size

None

External issue ID

None

External issue ID

None

Components

Priority

Major
Configure