Provide a "true" synchronous API that doesn't wrap around any async methods

Description

We have been running a performance test with 200 concurrent users against our ASP.NET Core application, which uses synchronous controller actions to read data from Cassandra through version 3.13.0 of the DataStax C# driver. Calling the driver's synchronous operations results in client timeouts caused by thread starvation.

Within the source code, we see that the synchronous operation actually starts an async operation and then blocks the calling thread until it completes. This spawns a huge number of unnecessary threads, whereas we would expect the function to execute synchronously on the thread on which it is called:

/// <inheritdoc />
public RowSet Execute(IStatement statement, string executionProfileName)
{
    var task = ExecuteAsync(statement, executionProfileName);
    TaskHelper.WaitToCompleteWithMetrics(_metricsManager, task, Configuration.DefaultRequestOptions.QueryAbortTimeout);
    return task.Result;
}
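
To show the blocking pattern in isolation, here is a minimal sketch that does not use the driver. FakeQueryAsync and FakeQueryBlocking are hypothetical stand-ins, simulated with Task.Delay; they are not driver APIs, and the 200 calls and 100 ms delay are assumed values chosen only to mirror the test described here. The sketch compares running the blocking wrapper on thread-pool threads with awaiting the async call directly.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class SyncOverAsyncDemo
{
    // Hypothetical stand-in for an async driver call:
    // completes after delayMs without holding a thread while it waits.
    public static async Task<int> FakeQueryAsync(int id, int delayMs)
    {
        await Task.Delay(delayMs).ConfigureAwait(false);
        return id;
    }

    // Sync-over-async: starts the async call, then blocks the calling
    // thread until it finishes (the same shape as the Execute wrapper).
    public static int FakeQueryBlocking(int id, int delayMs)
    {
        return FakeQueryAsync(id, delayMs).GetAwaiter().GetResult();
    }

    // Runs `count` blocking calls on thread-pool threads, similar to
    // synchronous controller actions serving concurrent users.
    public static (int completed, long elapsedMs) RunBlocking(int count, int delayMs)
    {
        var watch = Stopwatch.StartNew();
        Task<int>[] tasks = Enumerable.Range(0, count)
            .Select(id => Task.Run(() => FakeQueryBlocking(id, delayMs)))
            .ToArray();
        Task.WaitAll(tasks);
        return (tasks.Length, watch.ElapsedMilliseconds);
    }

    // Runs `count` calls by awaiting them; no thread is blocked per call.
    public static async Task<(int completed, long elapsedMs)> RunAsync(int count, int delayMs)
    {
        var watch = Stopwatch.StartNew();
        int[] results = await Task.WhenAll(
            Enumerable.Range(0, count).Select(id => FakeQueryAsync(id, delayMs)));
        return (results.Length, watch.ElapsedMilliseconds);
    }

    public static async Task Main()
    {
        const int count = 200;   // assumed: mirrors the 200-user test
        const int delayMs = 100; // assumed: simulated query latency

        var blocking = RunBlocking(count, delayMs);
        Console.WriteLine($"blocking: {blocking.completed} calls in {blocking.elapsedMs} ms");

        var awaited = await RunAsync(count, delayMs);
        Console.WriteLine($"awaited:  {awaited.completed} calls in {awaited.elapsedMs} ms");
    }
}
```

The exact timings depend on the runtime version and the number of cores, so no figures are claimed here. The point of the comparison is that each blocking call occupies a thread-pool thread for its full duration, while the awaited calls do not.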

 

Here is a minimal console application that reproduces the error at the lowest level. The Task.Run call plays the role of the IIS/.NET thread pool running synchronous controller actions, with each task standing in for a different user calling our web API.

static void PerformDataStaxDriverTest3(string serverNodes, int port, string username, string password, string keyspace, int start, int end)
{
    string[] cassandraNodesAsArray = serverNodes.Replace(" ", "").Split(",");
    var metricsRoot = new MetricsBuilder().Report.ToConsole().Build();
    ICluster cluster = Cluster.Builder()
        .AddContactPoints(cassandraNodesAsArray)
        .WithPort(port)
        .WithCredentials(username, password)
        .WithMetrics(metricsRoot.CreateDriverMetricsProvider())
        .Build();
    ISession session = cluster.Connect(keyspace);

    // Start one thread-pool task per simulated user; each issues a blocking query.
    IList<Task> tasks = new List<Task>();
    for (int i = start; i < end; i++)
    {
        int userId = i; // copy so each lambda captures its own value, not the shared loop variable
        var task = Task.Run(() =>
        {
            var watch = Stopwatch.StartNew();
            RowSet rows = session.Execute(new SimpleStatement("select * from user where user_id = " + userId));
            watch.Stop();
            return new Tuple<RowSet, long>(rows, watch.ElapsedMilliseconds);
        });
        tasks.Add(task);
    }

    try
    {
        Task.WaitAll(tasks.ToArray());
    }
    catch (Exception e)
    {
        Console.WriteLine("Error: " + e.Message + ", " + e.StackTrace);
    }

    // Tally the outcome of every task.
    int success = 0;
    int successWith0Results = 0;
    int error = 0;
    foreach (Task<Tuple<RowSet, long>> t in tasks)
    {
        if (t.Status == TaskStatus.RanToCompletion)
        {
            var count = t.Result.Item1.ToList().Count;
            if (count > 0)
            {
                success++;
            }
            else if (count == 0)
            {
                successWith0Results++;
            }
            Console.WriteLine("Time in task: " + t.Result.Item2);
        }
        else if (t.Exception != null)
        {
            error++;
        }
    }

    Console.WriteLine($"results > 0: {success}, results = 0: {successWith0Results}, error: {error}");
    Task.WhenAll(metricsRoot.ReportRunner.RunAllAsync()).Wait();
}

While we could use the workaround given in https://datastax-oss.atlassian.net/browse/CSHARP-553 and raise the minimum number of threads, we feel this solution is not optimal because the newly spawned threads add overhead that should not be required.

ThreadPool.SetMinThreads(500, 500);

 

We do not see the same issue with other frameworks such as EntityFrameworkCore. Their synchronous functions do not call the async version and block; instead, they call further synchronous functions to perform the operation. For example, compare the Find and FindAsync operations:

https://github.com/dotnet/efcore/blob/82a0c46de7c38e17f97a5923c63ea35030210c19/src/EFCore/Internal/EntityFinder.cs#L48

 

/// <summary>
///     This is an internal API that supports the Entity Framework Core infrastructure and not subject to
///     the same compatibility standards as public APIs. It may be changed or removed without notice in
///     any release. You should only use it directly in your code with extreme caution and knowing that
///     doing so can result in application failures when updating to a new Entity Framework Core release.
/// </summary>
public virtual TEntity? Find(object?[]? keyValues)
    => keyValues == null || keyValues.Any(v => v == null)
        ? null
        : (FindTracked(keyValues!, out var keyProperties)
            ?? _queryRoot.FirstOrDefault(BuildLambda(keyProperties, new ValueBuffer(keyValues))));

/// <summary>
///     This is an internal API that supports the Entity Framework Core infrastructure and not subject to
///     the same compatibility standards as public APIs. It may be changed or removed without notice in
///     any release. You should only use it directly in your code with extreme caution and knowing that
///     doing so can result in application failures when updating to a new Entity Framework Core release.
/// </summary>
public virtual ValueTask<TEntity?> FindAsync(object?[]? keyValues, CancellationToken cancellationToken = default)
{
    if (keyValues == null || keyValues.Any(v => v == null))
    {
        return default;
    }

    var tracked = FindTracked(keyValues!, out var keyProperties);
    return tracked != null
        ? new ValueTask<TEntity?>(tracked)
        : new ValueTask<TEntity?>(
            _queryRoot.FirstOrDefaultAsync(BuildLambda(keyProperties, new ValueBuffer(keyValues)), cancellationToken));
}

Sample EFCore console application invoking the Find method:

static void PerformEFCoreTest(int start, int end)
{
    // Start one thread-pool task per simulated user; each performs a synchronous Find.
    IList<Task> tasks = new List<Task>();
    for (int i = start; i < end; i++)
    {
        int personId = i; // copy so each lambda captures its own value, not the shared loop variable
        var task = Task.Run(() =>
        {
            var watch = Stopwatch.StartNew();
            Person p = null;
            using (var context = new SampleContext())
            {
                p = context.Person.Find(personId);
            }
            watch.Stop();
            return new Tuple<Person, long>(p, watch.ElapsedMilliseconds);
        });
        tasks.Add(task);
    }

    try
    {
        Task.WaitAll(tasks.ToArray());
    }
    catch (Exception e)
    {
        Console.WriteLine("Error: " + e.Message + ", " + e.StackTrace);
    }

    // Tally the outcome of every task.
    int success = 0;
    int successWith0Results = 0;
    int error = 0;
    foreach (Task<Tuple<Person, long>> t in tasks)
    {
        if (t.Status == TaskStatus.RanToCompletion)
        {
            var person = t.Result.Item1; // null when Find matched no row
            if (person != null && person.PersonPK > 0)
            {
                success++;
            }
            else if (person == null)
            {
                successWith0Results++;
            }
            Console.WriteLine("Time in task: " + t.Result.Item2);
        }
        else if (t.Exception != null)
        {
            error++;
        }
    }

    Console.WriteLine($"results > 0: {success}, results = 0: {successWith0Results}, error: {error}");
}

Please advise whether the driver offers true synchronous operation, or whether it can only be used for asynchronous operations.

Environment

ASP.NET Core, Windows 10

IIS Web Application

Activity

Joao Reis
January 17, 2022 at 2:46 PM

The synchronous operation of the driver is not suitable for applications with high throughput/concurrency requirements. As you say, it is just an inefficient wrapper around the async API of the driver, because the driver is designed to be completely async from the ground up.

Created January 16, 2022 at 8:04 PM
Updated January 17, 2022 at 2:48 PM