Add AsyncEnumerable support to RowSet

Description

While looking at a trace of my application, I saw many threads spending a large amount of time in this stack trace:

I was surprised to see a ManualResetEventSlim.Wait when only the async interface of the Cassandra driver is used. It turns out that most mapper methods execute the statement asynchronously but read the row set synchronously. For example, FirstAsync calls the Linq method First on the RowSet, and RowSet implements IEnumerable by synchronously waiting on the result fetch task.

This problem leads to thread starvation and is currently the leading suspect for latency spikes in one of our business-critical applications.

I would be interested in providing a fix, but I’m not sure yet what the correct fix would be. One future-proof option I can think of is to make RowSet implement IAsyncEnumerable, so that all Linq methods used on the RowSet could be replaced with their async equivalents. However, that would be a breaking change, and IAsyncEnumerable is only available on .NET Core 3+, so it requires some difficult decisions.
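
For illustration, here is a minimal sketch of how rows could be consumed without blocking, written as an extension method rather than a change to RowSet itself. The class and method names (RowSetAsyncExtensions, ToAsyncEnumerableSketch) are hypothetical and not part of the driver. The sketch assumes RowSet exposes GetAvailableWithoutFetching(), IsFullyFetched and FetchMoreResultsAsync(), that reading already-buffered rows does not trigger a fetch, and that the target framework supports IAsyncEnumerable (C# 8, with .NET Core 3+ or the Microsoft.Bcl.AsyncInterfaces package). It also assumes a single consumer per RowSet and omits cancellation.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Cassandra;

// Hypothetical helper for illustration only; not part of the driver.
public static class RowSetAsyncExtensions
{
    public static async IAsyncEnumerable<Row> ToAsyncEnumerableSketch(this RowSet rowSet)
    {
        using (IEnumerator<Row> rows = rowSet.GetEnumerator())
        {
            while (true)
            {
                // Only read as many rows as are already buffered, so that
                // MoveNext() is never asked to fetch the next page synchronously.
                int buffered = rowSet.GetAvailableWithoutFetching();
                for (int i = 0; i < buffered && rows.MoveNext(); i++)
                {
                    yield return rows.Current;
                }

                // No further pages to fetch: the enumeration is complete.
                if (rowSet.IsFullyFetched)
                {
                    yield break;
                }

                // Fetch the next page without blocking the calling thread.
                await rowSet.FetchMoreResultsAsync().ConfigureAwait(false);
            }
        }
    }
}
```

A caller would then write `await foreach (var row in rowSet.ToAsyncEnumerableSketch()) { ... }` in place of a plain foreach over the RowSet.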

Environment

None

Activity

Grégoire Verdier 
November 1, 2023 at 3:14 PM
(edited)

It seems like this issue is also impacting the internals of the driver. For example, in TopologyRefresh.UpdatePeersInfo, a row set is enumerated, blocking the thread to fetch a new page.

EDIT: the row set probably fits in a single page for this query so we should be fine here.

Joao Reis 
August 16, 2023 at 5:11 PM

Agreed! In the snippet below, could you confirm that it’s just a matter of removing the call to SetPageSize?

I can’t say for sure, but I think removing that call will not address the issue, because automatic paging still happens if you enumerate the RowSet.

Good to know. So if I call this method several times, will it load everything in memory or will it clear the previous page?

Yes, but you need to update the pagingState in the Cql object (specifically in CqlQueryOptions). There are three overloads: one that takes a CqlQueryOptions object, one that takes a Cql object, and one that receives the pagingState directly. All three allow you to provide the paging state. You can retrieve it from the Page object that is returned by the method.

So something like:
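
The snippet from the original comment is not reproduced here. As a stand-in, the following is a minimal sketch of the manual paging loop described above, not the commenter's own code. It assumes the mapper overload FetchPageAsync<T>(int pageSize, byte[] pagingState, string query, object[] args) and the PagingState property on the returned IPage<T>. The helper name ForEachPagedAsync, its parameters and the onItem callback are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using Cassandra.Mapping;

// Hypothetical helper for illustration only; not part of the driver.
public static class ManualPagingSketch
{
    public static async Task ForEachPagedAsync<T>(
        IMapper mapper, string query, int pageSize, Action<T> onItem, params object[] args)
    {
        // A null paging state requests the first page.
        byte[] pagingState = null;
        do
        {
            // Each call asynchronously fetches exactly one page.
            IPage<T> page = await mapper
                .FetchPageAsync<T>(pageSize, pagingState, query, args)
                .ConfigureAwait(false);

            // Process the page's items; this sketch keeps no reference to
            // earlier pages once they have been processed.
            foreach (T item in page)
            {
                onItem(item);
            }

            // Carry the paging state over to the next call;
            // null means there are no more pages.
            pagingState = page.PagingState;
        } while (pagingState != null);
    }
}
```

For an aggregation over all rows, onItem could update a running total, so only one page of mapped items needs to be referenced at a time.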

Grégoire Verdier 
August 16, 2023 at 4:45 PM

What we can fix right now is thread blocking on result sets without any actual paging (which is the case for queries that return a “null” paging state); that can be tracked in this ticket.

Agreed! In the snippet below, could you confirm that it’s just a matter of removing the call to SetPageSize?

You can use the FetchPageAsync() method of the mapper to perform manual paging

Good to know. So if I call this method several times, will it load everything in memory or will it clear the previous page?

Joao Reis 
August 16, 2023 at 2:29 PM

I don’t think the Cassandra API should ever block a thread when querying data.

Yes, you’re right, but IEnumerable doesn’t support async paging and IAsyncEnumerable requires new .NET targets to be added, so for now we have to ask users to perform manual paging if they want to avoid thread blocking completely.

What we can fix right now is thread blocking on result sets without any actual paging (which is the case for queries that return a “null” paging state); that can be tracked in this ticket.

I could manually fetch the page, but I’m not sure if I can still use the mapper class then.

You can use the FetchPageAsync() method of the mapper to perform manual paging; you don’t have to stop using the mapper for that.

Grégoire Verdier 
August 16, 2023 at 1:35 PM

That query can return a large number of rows. If they are not returned in a single page, iterating over the row set will block the thread to fetch a new page. I could manually fetch the page, but I’m not sure if I can still use the mapper class then.

After performing that query, all the rows are needed to perform an aggregation on them. But whatever my use case is, I don’t think the Cassandra API should ever block a thread when querying data.

Details

Created August 8, 2023 at 6:55 PM
Updated April 16, 2025 at 1:51 PM