I'm debugging existing code that occasionally causes primary key violations when adding an entity. I don't think it should be allowing dirty reads, but it appears to be.
This is the pseudocode:
Controller method -> await _service.AddOrUpdateAsync(entity)

using var transaction = await _repository.BeginTransactionAsync()
    -> await Context.Database.BeginTransactionAsync(IsolationLevel.ReadCommitted)

if (await _repository.ExistsAsync(entity primary key values))
    -> await DbSet.AnyAsync(e => /* key fields match */)

    await _repository.UpdateAsync(entity)
        -> DbSet.Update(entity) // Update is synchronous; there is no DbSet.UpdateAsync
        -> await Context.SaveChangesAsync()
else
    await _repository.AddAsync(entity)
        -> await DbSet.AddAsync(entity) // see notes below
        -> await Context.SaveChangesAsync()

await transaction.CommitAsync()
One thing I've noticed is this note in the Microsoft docs for the DbSet.AddAsync() method:
This method is async only to allow special value generators, such as the one used by 'Microsoft.EntityFrameworkCore.Metadata.SqlServerValueGenerationStrategy.SequenceHiLo', to access the database asynchronously. For all other cases the non async method should be used.
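If I follow that note, my add path would look something like this (a sketch; MyEntity stands in for my real entity type):

// Add just starts tracking the entity; the non-async overload is correct here
// because no HiLo-style value generator is in play.
// The actual INSERT is sent to the database by SaveChangesAsync.
public Task AddAsync(MyEntity entity)
{
    DbSet.Add(entity);
    return Context.SaveChangesAsync();
}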
Another potential issue I see is in the dependency injection setup.
services.AddDbContext<MyContext>(...); // defaults to a scoped lifetime
services.AddTransient<MyService>();
services.AddTransient<MyRepository>();

They register the DbContext with its default scoped lifetime, but the service and repository get transient lifetimes. Could this mismatch be the issue?
This API is called by two different applications. One is another API and one is a console application. I see occasional deadlocks but that can be expected with the read committed isolation level.
Should I try raising the isolation level?
I cannot reproduce this locally, so it's hard to test.
These aren't dirty reads. You're using the read committed isolation level, which means that if two processes check for the existence of the same row at the same time, they're both going to get the same answer: either the row exists when they both check, or it doesn't. There is nothing dirty or inconsistent about that, whether it's wrapped in a read committed transaction or not. The insert hasn't happened yet, so there is nothing to wait on to commit; each process got its answer - either there is no committed row with that key, or there is exactly one committed row with that key.
Your code then says:

- if the row exists, update it;
- if the row doesn't exist, insert it.

This all sounds great on the surface. But neither process has to wait for the other's transaction to clear before proceeding, since it's not a serializable transaction. In this case, two things to keep in mind:

- If neither process finds the row, both fall into the insert branch; one insert succeeds and the other fails with a primary key violation.
- If both processes find the row, both run the update; the writes serialize, and one set of changes silently overwrites the other.
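To make that first failure mode concrete, here is a minimal sketch that reproduces the race with plain ADO.NET (the table, column, and connection string are placeholders): both tasks pass the existence check before either inserts, and the losing insert fails with SQL Server error 2627.

using System.Data;
using Microsoft.Data.SqlClient;

const string connectionString = "...";

async Task CheckThenInsertAsync(string val)
{
    await using var conn = new SqlConnection(connectionString);
    await conn.OpenAsync();
    await using var tx = (SqlTransaction)
        await conn.BeginTransactionAsync(IsolationLevel.ReadCommitted);

    // Existence check: under read committed, both sessions can see "no row" here.
    var check = new SqlCommand(
        "SELECT COUNT(*) FROM dbo.table_name WHERE [key] = 42;", conn, tx);
    var exists = (int)await check.ExecuteScalarAsync() > 0;

    await Task.Delay(100); // widen the window so the two checks overlap

    if (!exists)
    {
        var insert = new SqlCommand(
            "INSERT dbo.table_name ([key], col1) VALUES (42, @val);", conn, tx);
        insert.Parameters.AddWithValue("@val", val);
        // The loser blocks on the winner's key lock, then throws
        // error 2627 once the winner commits.
        await insert.ExecuteNonQueryAsync();
    }

    await tx.CommitAsync();
}

await Task.WhenAll(CheckThenInsertAsync("A"), CheckThenInsertAsync("B"));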
What you need to do is make sure that the check for the existence of the row takes a lock and holds it until the transaction is committed, so that the other process can't even check for the row until the first process is done. This affects concurrency, but avoids the potential for primary key violations or deadlocks.
EF Core, AFAIK, does not provide a truly safe way to do this. I suggested off the cuff just changing the transaction from read committed to serializable, but David Browne correctly pointed out that this will potentially just trade PK violations for deadlocks.
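For reference, that suggestion amounts to a one-line change in EF Core (a sketch only, and again: this potentially just trades PK violations for deadlocks rather than fixing the problem):

using System.Data;

// Serializable instead of the read committed default:
await using var transaction =
    await context.Database.BeginTransactionAsync(IsolationLevel.Serializable);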
I think what you should do is give up on trying to have EF Core manage your transaction, use raw SQL (or a stored procedure!), and change the logic slightly. (And maybe even stop using async/await; this doesn't seem like the right use case to me.) Instead of:

- check whether the row exists;
- if it does, update it; otherwise, insert it.

Simplify that from exactly two touches every time to at most two touches:

- try the update first;
- only if no rows were affected, insert.
This will make the transaction faster in some cases, which will also help reduce contention. This will be especially valuable if, most of the time, two processes aren't trying to upsert the same key. In practice, it should look like this (there is a lot more detail in this post about upsert patterns, and in some of the links that post presents):
BEGIN TRANSACTION;

UPDATE dbo.table_name WITH (UPDLOCK, SERIALIZABLE)
    SET col1 = @val1 /* , ... */
    WHERE [key] = @key;

IF @@ROWCOUNT = 0
BEGIN
    INSERT dbo.table_name
    (
        [key], col1 /* , ... */
    )
    VALUES
    (
        @key, @val1 /* , ... */
    );
END

COMMIT TRANSACTION;

/* error handling omitted for brevity */
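If the rest of your data access stays in EF Core, you can still send that statement as a single raw, parameterized batch; a minimal sketch, assuming a context instance and the placeholder table and columns above (entity.Key and entity.Col1 stand in for your real values):

using Microsoft.Data.SqlClient;
using Microsoft.EntityFrameworkCore;

// The whole upsert runs as one batch, so the lock taken by the UPDATE
// is held until COMMIT and EF Core never has to manage the transaction.
context.Database.ExecuteSqlRaw(@"
BEGIN TRANSACTION;

UPDATE dbo.table_name WITH (UPDLOCK, SERIALIZABLE)
    SET col1 = @val1
    WHERE [key] = @key;

IF @@ROWCOUNT = 0
BEGIN
    INSERT dbo.table_name ([key], col1)
        VALUES (@key, @val1);
END

COMMIT TRANSACTION;",
    new SqlParameter("@key", entity.Key),
    new SqlParameter("@val1", entity.Col1));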
I'm often asked why both hints are needed on the update:

- UPDLOCK protects against conversion deadlocks at the statement level. Essentially, this makes other processes wait instead of try/retry.
- SERIALIZABLE protects against changes to the key range for the duration of the transaction. In other words, it ensures that a row that didn't exist at the beginning of the transaction can't be added before the end of the transaction.

An added benefit of this approach (as opposed to just making the whole transaction serializable) is that, in normal scenarios, a process that is inserting one key shouldn't block a process that is updating a different key, and vice-versa.
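You can see the "wait instead of try/retry" behavior with two connections (a sketch, reusing the placeholder table and connection string from earlier): the second session's hinted UPDATE blocks on the first session's key-range lock even though the row doesn't exist yet.

using System.Data;
using Microsoft.Data.SqlClient;

const string hintedUpdate =
    "UPDATE dbo.table_name WITH (UPDLOCK, SERIALIZABLE) " +
    "SET col1 = @val WHERE [key] = 42;";

await using var conn1 = new SqlConnection(connectionString);
await using var conn2 = new SqlConnection(connectionString);
await conn1.OpenAsync();
await conn2.OpenAsync();

// Session 1: 0 rows updated, but the key-range lock is now held.
var tx1 = (SqlTransaction)await conn1.BeginTransactionAsync();
var cmd1 = new SqlCommand(hintedUpdate, conn1, tx1);
cmd1.Parameters.AddWithValue("@val", "x");
await cmd1.ExecuteNonQueryAsync();

// Session 2: the same statement now waits instead of racing ahead
// toward a doomed insert.
var tx2 = (SqlTransaction)await conn2.BeginTransactionAsync();
var cmd2 = new SqlCommand(hintedUpdate, conn2, tx2);
cmd2.Parameters.AddWithValue("@val", "y");
var waiting = cmd2.ExecuteNonQueryAsync(); // blocks here...

await tx1.CommitAsync(); // ...until session 1 commits
await waiting;
await tx2.CommitAsync();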
This approach doesn't (and can't!) resolve the situation where two processes try to update the same row "at the same time" with different data. Even if attempted at the same nanosecond, the writes will serialize: one will win, one will lose, and the loser's changes will be lost (written, but then immediately overwritten). This comes down to business logic problems:

- Should the last writer always win?
- Should the second writer be rejected because the row changed underneath it?
- Should the two sets of changes be merged somehow?

Locking can't answer those questions; your application has to.
You can read up on the various isolation levels, what they promise, and what they don't, in this series from Paul White: https://sql.kiwi/2014/07/isolation-levels
And once again, just to be 100% clear: you are not experiencing "dirty reads." You are simply expecting the read part of your transaction to stay consistent after the read has finished, and that just doesn't happen under read committed.