-
Notifications
You must be signed in to change notification settings - Fork 240
CSHARP-994 Add AsyncEnumerable support to RowSet #617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
I agree that making |
Irt .NET targets it would make more sense to move from .NET Standard 2.0 to .NET Standard 2.1 as that would allow us to keep supporting .NET platforms that support .NET standard while at the same time being able to leverage features like |
Also the mapper not relying on |
This is tricky because In my company, I have applied some work arounds where the performance impact was alarming. The first one was to greatly increase the page size so all the rows would fit on a single page. That greatly impacted the GC of the client but it also impacted the server and I had to find another work around when the team operating cassandra came to us. The second one was this var rows = await _session.ExecuteAsync(stmt, profile);
while (!rows.IsFullyFetched)
{
await rows.FetchMoreResultsAsync();
}
return rows.Select(r =>
{
var x = r.GetValue<int>(0);
var y = r.GetValue<string>(1);
var z = r.GetValue<byte[]>(2);
return new MyObject(x, y, z);
}).ToArray(); This seems to be fix the thread starvation too and is more friendly to the GC. Though, all the rows must be in memory at some point. But anyway it's just a work-around that most developers won't know about.
One good thing about that
I believe a serious discussion should be had around which .NET version to target. I'm sure you have many clients still using .NET Framework but double targeting the LTS could allow developers using more recent features of .NET while still support old ones. |
Have you tried manual paging as described in the manual? That should fix the issue but it's a bit more code than relying on automatic paging. This is generally the best approach if you're concerned about thread starvation and GC pressure. I think we can move from |
Thx, I'll give it a try!
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this change 👍 , I've left a few comments that need to be addressed.
Justfyi I'm not sure when we will be able to get a new release out so in the mean time it might be better for you to use manual paging on your app to fix your current issues with thread starvation
src/Cassandra.IntegrationTests/Cassandra.IntegrationTests.csproj
Outdated
Show resolved
Hide resolved
Assert.AreEqual(pageSize, ids.Count); | ||
//Retrieve the next page | ||
var rs2 = Session.Execute(ps.Bind().SetAutoPage(false).SetPagingState(rs.PagingState)); | ||
Assert.Null(rs2.PagingState); | ||
var ids2 = rs2.Select(r => r.GetValue<Guid>("id")).ToList(); | ||
var ids2 = Enumerable.Select(rs2, r => r.GetValue<Guid>("id")).ToList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RowSet now implements both IEnumerable and IAsyncEnumerable which both have a select method. I double checked if it was an issue with my AsyncEnumerable implementation but if you create a new .NET 10 project (that has AsyncEnumerable built-in) and have a class that implements both interface, you'll have this error too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the driver project only targets netstandard 2.1 so it will never have the built in AsyncEnumerable
right? Or is this an issue if a client application targeting net10 installs the driver which uses netstandard2.1 and then tries to call Select
on the RowSet
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the driver project only targets netstandard 2.1 so it will never have the built in AsyncEnumerable right?
Correct, but I provided my own implementation of AsyncEnumerable. It's internal but internals are visible to the test assemblies.
Or is this an issue if a client application targeting net10 installs the driver which uses netstandard2.1 and then tries to call Select on the RowSet?
Yes this could be an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is this an issue if a client application targeting net10 installs the driver which uses netstandard2.1 and then tries to call Select on the RowSet?
Yes this could be an issue.
The custom AsyncEnumerable implementation is internal, how would it affect applications installing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the driver project only targets netstandard 2.1 so it will never have the built in AsyncEnumerable right?
Correct, but I provided my own implementation of AsyncEnumerable. It's internal but internals are visible to the test assemblies.
But the test assemblies don't target net10 currently so it shouldn't be an issue now right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will indeed not impact projects referencing Cassandra because it's internal. It will impact Cassandra + test projects because they all have access to the this internal class.
.Select(r => new object[] { r.GetValue<int>(0), r.GetValue<string>(1) }) | ||
.OrderBy(arr => (int)arr[0]) | ||
.ToArray(); | ||
var result = Enumerable.Select( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -128,7 +128,7 @@ public void RandomValuesTest() | |||
|
|||
var selectQuery = $"SELECT id, timeuuid_sample, {GetToDateFunction()}(timeuuid_sample) FROM {AllTypesTableName} LIMIT 10000"; | |||
Assert.DoesNotThrow(() => | |||
Session.Execute(selectQuery).Select(r => r.GetValue<TimeUuid>("timeuuid_sample")).ToArray()); | |||
Enumerable.Select(Session.Execute(selectQuery), r => r.GetValue<TimeUuid>("timeuuid_sample")).ToArray()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/Cassandra/Mapping/AppliedInfo.cs
Outdated
using System.Threading.Tasks; | ||
|
||
#if !NETSTANDARD2_1_OR_GREATER | ||
#pragma warning disable CS1998 // Async method lacks 'await' operators and will run synchronously |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this because FromRowSetAsync
is marked as async but never awaits if it's NETFRAMEWORK?
I'd prefer to have FromRowSet
and FromRowSetAsync
separate so someone doesn't get tricked in the future into using this assuming it's always async when it's actually not in the case of NETFRAMEWORK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will make us have more conditions in the places where this method is being called but hopefully they aren't that many
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this because FromRowSetAsync is marked as async but never awaits if it's NETFRAMEWORK?
Exactly.
I'd prefer to have FromRowSet and FromRowSetAsync separate
Done
src/Cassandra/Mapping/Mapper.cs
Outdated
@@ -24,6 +24,10 @@ | |||
using Cassandra.SessionManagement; | |||
using Cassandra.Tasks; | |||
|
|||
#if !NETSTANDARD2_1_OR_GREATER | |||
#pragma warning disable CS1998 // Async method lacks 'await' operators and will run synchronously |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same scenario as the previous one, I prefer to have methods separate so it's more clear when things are async and when they aren't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this case, since they are public methods, I didn't add new methods but made their implementations differ depending on the target framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove this #pragma
if you make the async
word as part of the conditional block in the methods where this applies.
e.g.
return ExecuteAsyncAndAdapt(cql,
#if !NETFRAMEWORK
async (s, rs) =>
{
var row = await rs.FirstOrDefaultAsync().ConfigureAwait(false);
#else
(s, rs) =>
{
var row = rs.FirstOrDefault();
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that's what I went for. It's a little more tricky because the return statement should also be changed (return
VS return Task.FromResult
)
src/Extensions/Cassandra.AppMetrics/Cassandra.AppMetrics.csproj
Outdated
Show resolved
Hide resolved
src/Cassandra.Tests/DataStax/Insights/MessageFactories/InsightsMessageFactoryTests.cs
Show resolved
Hide resolved
Reading through this page I see there is a community maintained implementation of these methods, shouldn't we use this package instead? |
They even have the workaround listed there for net10 users |
I think it's important to keep the dependencies to a minimum. Here we are talking about ~100 lines of very easy code so I would greatly prefer that over an extra dependency. |
@joao-r-reis is that PR still on your radar? |
Any async method in
Mapper
can block the thread if there are more than one page inRowSet
. This can easily starve the thread pool and obliterate the performance of your app.The proper fix it to make
RowSet
implementIAsyncEnumerable
. The latter is only available in .NET Core 3 though. I could make the driver target this version but it reached end of life 7 years ago. According to versionsof.net/core, the oldest active .NET version is 8 so that's what I picked.The PR does not compile right now because the .NET 8 target generates a bunch of obsolete warnings (upgraded to errors) about
System.Runtime.ConstrainedExecution
and TLS1.X. Also, tests are missing.I would like to get some first feedbacks before resuming the work, especially around adding this new target framework. I believe it's time to add new targets as .NET Standard is now 8 years old.