Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects versions: 2.7.3
    • Fix versions: None
    • Components: None
    • Labels:
      None
    • Environment:

      C* 2.1.13.

      • 11 nodes in two DCs
      • 30 keyspaces all with same RFs (NetworkTopologyStrategy)

      csharp-driver 2.7.3

    • Sprint:

      Description

      We recently added 5 nodes to an existing 6 node cluster in a new datacenter. From this time, the initial connection to the cluster takes upwards of 2 minutes. Once initialized, everything is fine. As we dug into this delay we have narrowed it down to the initialization of the TokenMap in `TokenMap.Build`

      We see 3 loops nested:

      • for each keyspace (30) - ComputeTokenToReplicaNetwork:
        • for 0..numTokens (256*11 = 2816)
          • for 0..numTokens (2816)

      This set of nested loops causes 30*2816*2816 = 237,895,680 iterations at the inner loop in `ComputeTokenToReplicaNetwork` (minus any early continues or break conditions)

      Profiling shows this is essentially a tight processing loop consuming a single core the entire time. We are concerned with this because while 2 minutes may be acceptable, we will very likely be adding more datacenters and more nodes and more keyspaces. With the exponential complexity of this method, I'm not sure how this will work.

      From looking at the Java driver, it appears they have an optimization where they compute this once for each distinct value of Replication Strategy (combination of DC:<rf>) Here is the code. In our case, this reduces the complexity by a factor of 30 (all our keyspaces use the same RF):

      Map<ReplicationStrategy, Map<Token, Set<Host>>> replStrategyToHosts = new ...();
      Map<String, Map<Host, Set<TokenRange>>> hostsToRanges = new ...();
      for (KeyspaceMetadata keyspace : keyspaces) {
         ReplicationStrategy strategy = keyspace.replicationStrategy();
         Map<Token, Set<Host>> ksTokens = replStrategyToHosts.get(strategy);
         // only if hasn't been calculated for this unique replication strategy //
         if (ksTokens == null) { 
            ksTokens = (strategy == null)
               ? makeNonReplicatedMap(tokenToPrimary)
               : strategy.computeTokenToReplicaMap(keyspace.getName(), tokenToPrimary, ring);
           replStrategyToHosts.put(strategy, ksTokens);
         }
      

      Is this optimization something we could add to the csharp driver? It would help a lot in our case, though there is still the exponential hit of adding more nodes. Is this not an issue for others? I find it hard to believe my cluster is among the most complicated out there...

      Thanks in advance!

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              thunderstumpges Thunder Stumpges
            • Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: