Revisit channel selection when pool size > 1

Description

DriverChannel.getAvailableIds() is currently based on a volatile variable. Picking a channel from the pool is a check-then-act, and not completely thread-safe in the sense that the in-flight count observed when the channel is picked might have changed by the time a stream id is assigned in InFlightHandler.

This appears to be a problem for high-traffic scenarios where the number of simultaneous requests is higher than max-requests-per-connection. If a large number of requests start at the same time, they might all select the first channel because its observed count is still 0. But by the time they request stream ids, some of them will get rejected and throw BusyConnectionException, while there was still room on another channel.

This can be easily reproduced in a constrained test scenario: max-requests-per-connection = 20, connection.pool.local.size = 2, then launch 40 client threads that synchronously execute the same request in a loop. Errors even persist when the number of connections is increased.

To mitigate this, we can go back to a model closer to 3.x: expose a fully thread-safe counter (atomic integer), that gets incremented as soon as the connection gets acquired. This guarantees that a stream id will be available if the acquisition succeeded.

Environment

None

Pull Requests

None

Assignee

Olivier Michallat

Reporter

Olivier Michallat

Labels

None

PM Priority

None

Reproduced in

None

Affects versions

None

Fix versions

Pull Request

None

Doc Impact

None

Size

None

External issue ID

None

External issue ID

None

Priority

Major
Configure