Revisit channel selection when pool size > 1

Description

DriverChannel.getAvailableIds() is currently based on a volatile variable. Picking a channel from the pool is a check-then-act, and not completely thread-safe in the sense that the in-flight count observed when the channel is picked might have changed by the time a stream id is assigned in InFlightHandler.

This appears to be a problem for high-traffic scenarios where the number of simultaneous requests is higher than max-requests-per-connection. If a large number of requests start at the same time, they might all select the first channel because its observed count is still 0. But by the time they request stream ids, some of them will get rejected and throw BusyConnectionException, while there was still room on another channel.

This can be easily reproduced in a constrained test scenario: max-requests-per-connection = 20, connection.pool.local.size = 2, then launch 40 client threads that synchronously execute the same request in a loop. Errors even persist when the number of connections is increased.

To mitigate this, we can go back to a model closer to 3.x: expose a fully thread-safe counter (atomic integer), that gets incremented as soon as the connection gets acquired. This guarantees that a stream id will be available if the acquisition succeeded.

Environment

None

Pull Requests

None

Activity

Show:

Resize issue view side panel

Fixed

Details

Assignee

Olivier Michallat

Reporter

Olivier Michallat

Fix versions

4.5.0

Priority

Major

Created February 9, 2020 at 11:41 PM

Updated February 14, 2020 at 7:07 PM

Resolved February 14, 2020 at 7:07 PM