Fix behaviour of `ExecutionInfo#getQueriedHost` with Speculative retries

Description

ExecutionInfo#getQueriedHost() returns presumably the coordinator of a request by doing: triedHosts.get(triedHosts.size() - 1), but doing this would not be correct in a specific scenario with Spec retries.

For instance with a request that triggers a speculative retry but for which the result would come back first from a previous execution. The result getQueriedHost() would only return the last host tried but not the actual one with the result.

Simple scenario: request to node1, taking long and -> triggers spec Exec to node2 -> node1 responds just after. The list of tried hosts is [node1, node2] but ExecutionInfo#getQueriedHost() returns node2 instead of node1.

Environment

None

Pull Requests

None

Activity

Show:
Olivier Michallat
May 16, 2017, 4:30 PM

After looking into it some more, I don't think this can happen: each speculative execution has a current field that represents the node for the in-flight request. Upon retrying, current is added to triedHosts. When the request eventually succeeds (setFinalResult), we first cancel all other executions, and then take the current of the successful execution and add it at the end of triedHosts. Since other executions are cancelled at that point, I don't think there's any way they can add another node to the list.

One consequence of this algorithm however, is that in-flight nodes for other executions will not appear in triedHosts. For example:

  • initial execution start on node1

  • speculative execution starts on node2

  • speculative execution succeeds on node2

triedHosts will only contain node2, not node1, because the request to node1 had not completed yet. I think this is acceptable if we amend the javadoc to explain it.

Olivier Michallat
May 30, 2017, 6:35 PM

Addressed as part of JAVA-1460.

Kevin Gallardo
May 31, 2017, 2:47 PM

Hmm I think in this case the name triedHosts is quite deceiving if it doesn't contain all the tried hosts. Thanks for the explanations.

Kevin Gallardo
June 1, 2017, 2:23 PM

Looking again at your explanation and the code I don't understand how couldn't triedHosts contain the other hosts tried in the speculative executions.

I understand that when one of the speculative execution completes, the other ones get cancelled, but triedHosts still gets an new current host each time a new Speculative execution is triggered, and RequestHandler.setFinalResults() uses the triedHosts of the RequestHandler, which should normally contain all the hosts for all the Speculative executions triggered (since adding to this list happens in SpeculativeExecution.query() and it adds to RequestHandler.triedHosts).

In the code when we cancel a speculative execution I don't see a place where we remove the current host of the triedHosts list?

Fixed

Assignee

Olivier Michallat

Reporter

Kevin Gallardo

Labels

None

PM Priority

None

Reproduced in

None

Affects versions

Fix versions

Pull Request

None

Doc Impact

None

Size

None

External issue ID

None

External issue ID

None

Components

Priority

Major
Configure