In RavenDB 4 (v4.0.3-patch-40031) I have two document types: Apple and Orange. Both have similar, but also distinct, properties. I ran into a bug in my code at runtime where sometimes the ID of an Apple is provided, but an Orange is returned. Scary!
Diving into it, it somewhat makes sense. But I'm struggling with an appropriate solution.
Here goes. In RavenDB, I have stored a single Apple as a document:
id: "078ff39b-da50-4405-9615-86b0d185ba17"
{
    "Name": "Elstar",
    "@metadata": {
        "@collection": "Apples",
        "Raven-Clr-Type": "FruitTest.Apple, FruitTest"
    }
}
Assume for the sake of this example that I have no Orange documents stored in the database. I would expect this test to succeed:
// arrange - use the ID of an apple, which does not exist in Orange collection
var id_of_apple = "078ff39b-da50-4405-9615-86b0d185ba17";
// act - load an Orange
var target = await _session.LoadAsync<Orange>("078ff39b-da50-4405-9615-86b0d185ba17");
// assert - should be null, because there is no Orange with that Id
target.Should().BeNull(because: "provided ID is not of an Orange but of an Apple");
... but it fails. What happens is that the document ID exists, so RavenDB loads the document, not caring what type it is, and attempts to map the properties automatically. I expected, or assumed incorrectly, that the Load type specifier would limit the lookup to that particular document collection. Instead, it looks up the ID across the entire database and maps whatever it finds, without constraining it to type <T>. So the behaviour is different from .Query<T>, which does constrain to the collection.
Important to note is that I'm using GUIDs as the identity strategy, by setting the Id to string.Empty (as per the docs). I assume the default ID strategy, which is like entityname/1001, would not have this issue.
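To illustrate the assumption about entityname/1001-style IDs, here is a minimal sketch of keeping GUIDs but prefixing them with the collection name, so an Apple ID can be told apart from an Orange ID before any load is attempted. FruitIds, NewId and BelongsTo are hypothetical names invented for this example; they are not part of the RavenDB client, and the sketch assumes you are free to choose your own ID format.

```csharp
using System;

// Hypothetical helper, not part of RavenDB: builds and checks collection-prefixed IDs.
public static class FruitIds
{
    // Builds an ID shaped like "<collection in lower case>/<new guid>".
    public static string NewId(string collectionName)
    {
        if (string.IsNullOrWhiteSpace(collectionName))
            throw new ArgumentException("Collection name is required.", nameof(collectionName));

        return collectionName.ToLowerInvariant() + "/" + Guid.NewGuid();
    }

    // True when the ID starts with the given collection name followed by "/" (case-insensitive).
    public static bool BelongsTo(string id, string collectionName)
    {
        if (id == null || string.IsNullOrWhiteSpace(collectionName)) return false;

        return id.StartsWith(collectionName + "/", StringComparison.OrdinalIgnoreCase);
    }
}
```

With this, an ID created via FruitIds.NewId("Apples") makes FruitIds.BelongsTo(id, "Oranges") return false, so the caller can return null without touching the database. Whether Load itself behaves differently for prefixed IDs is not something this sketch relies on.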
The docs on Loading Entities don't really mention whether this is intentional or not. They only say: "download documents from a database and convert them to entities."
However, for reasons, I do want to constrain the Load operation to a single collection. Or, better put, as efficiently as possible load a document by ID, from a specific collection. And if it does not exist, return null.
AFAIK, there are two options to achieve this:
1. .Query<T>().Where(x => x.Id == id), instead of .Load<T>(id)
2. .Load<T>(id) first and then check (~somehow, see bottom) if it is part of collection T
My problem can be summarized in two questions:
Especially for the second question, it is very hard to measure this properly. As for stability, e.g. not having side effects, that is something that I guess someone with more in-depth knowledge or experience of the RavenDB internals might shed some light on.
N.B. The question assumes that the explained behaviour is intentional and not a RavenDB bug.
~Somehow would be:
public async Task<T> Get(string id)
{
    var instance = await _session.LoadAsync<T>(id);
    if (instance == null) return null;

    // the "somehow" check for collection
    var expectedTypeName = string.Concat(typeof(T).Name, "s");
    var actualTypeName = _session.Advanced.GetMetadataFor(instance)[Constants.Documents.Metadata.Collection].ToString();
    if (actualTypeName != expectedTypeName)
    {
        // Edge case: Apple != Orange
        return null;
    }

    return instance;
}
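The check above derives the expected collection name by appending "s" to the type name, which breaks for irregular plurals or customized conventions. A possible variation, sketched here under the assumption that the documents were stored using the same store conventions, asks the conventions for the collection name instead. CollectionScopedLoader is a hypothetical wrapper name; the session and convention calls are from the RavenDB 4 client as far as I know, but treat the exact shape as untested.

```csharp
using System;
using System.Threading.Tasks;
using Raven.Client;                    // Constants
using Raven.Client.Documents.Session;  // IAsyncDocumentSession

// Hypothetical wrapper class; only the calls on _session come from the RavenDB client.
public class CollectionScopedLoader<T> where T : class
{
    private readonly IAsyncDocumentSession _session;

    public CollectionScopedLoader(IAsyncDocumentSession session)
    {
        _session = session;
    }

    public async Task<T> Get(string id)
    {
        var instance = await _session.LoadAsync<T>(id);
        if (instance == null) return null;

        // Collection name the store's conventions assign to T
        // (assumption: the same conventions were in effect when the document was stored).
        var expected = _session.Advanced.DocumentStore.Conventions.GetCollectionName(typeof(T));

        // Read "@collection" from the metadata without throwing if the key is missing.
        var metadata = _session.Advanced.GetMetadataFor(instance);
        if (!metadata.TryGetValue(Constants.Documents.Metadata.Collection, out var actual))
            return null;

        // Edge case: Apple != Orange
        return string.Equals(actual?.ToString(), expected, StringComparison.Ordinal)
            ? instance
            : null;
    }
}
```

Note that this only covers the case where the document was not already loaded or stored in the same session; in that case the LoadAsync call itself may throw before the collection check is reached. The mismatched entity also stays tracked by the session after Get returns null, which may or may not matter for the side-effect concern raised above.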
UPDATE 2018/04/19 - Added this reproducible sample after helpful comments (thanks for that).
Models
public interface IFruit
{
    string Id { get; set; }
    string Name { get; set; }
}

public class Apple : IFruit
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class Orange : IFruit
{
    public string Id { get; set; }
    public string Name { get; set; }
}
Tests
E.g. loading within the same session throws InvalidCastException (as expected), but loading in a second session does not.
public class UnitTest1
{
    [Fact]
    public async Task SameSession_Works_And_Throws_InvalidCastException()
    {
        var store = new DocumentStore()
        {
            Urls = new[] {"http://192.168.99.100:32772"},
            Database = "fruit"
        }.Initialize();

        using (var session = store.OpenAsyncSession())
        {
            var apple = new Apple
            {
                Id = Guid.NewGuid().ToString(),
                Name = "Elstar"
            };
            await session.StoreAsync(apple);
            await session.SaveChangesAsync();

            await Assert.ThrowsAsync<InvalidCastException>(() => session.LoadAsync<Orange>(apple.Id));
        }
    }

    [Fact]
    public async Task Different_Session_Fails()
    {
        var store = new DocumentStore()
        {
            Urls = new[] {"http://192.168.99.100:32772"},
            Database = "fruit"
        }.Initialize();

        using (var session = store.OpenAsyncSession())
        {
            var appleId = "ca5d9fd0-475b-41de-a1ab-57bb1e3ce018";

            // this *should* break, because... it's an apple
            // ... but it doesn't - it returns an ORANGE
            var orange = await session.LoadAsync<Orange>(appleId);

            await Assert.ThrowsAsync<InvalidCastException>(() => session.LoadAsync<Orange>(appleId));
        }
    }
}
Well, I found what seems to be the problem, but I don't understand why.
You said: "by setting the Id to string.Empty"
but in the example you wrote Id = Guid.NewGuid().ToString();
In my tests I explicitly assign string.Empty and I get the cast exception. When I assigned the generated Guid to the entity (like you did), I reproduced your situation. RavenDB probably handles these two cases differently, which produces this behavior; I don't know if it could be considered a bug.
Then use string.Empty