While indexing my data, I found that some nested documents are not stored correctly. I run Solr 8.3 and make use of the labelled relations as described in the docs.
Whenever a root entity, Parent, has any number of Child entities, I generate the following PHP array:
[
    'id' => 'b14ac9a0-e255-468b-a673-e125fd73d6f2',
    'entity_type' => 'parent',
    'title_t' => 'Andrea Cook',
    'children' => [
        0 => [
            'id' => 'ce10380c-8006-4945-9078-296116ad5ab7',
            'entity_type' => 'child',
            'title_t' => 'Jordan Gibson',
        ],
        1 => [
            'id' => '0c191119-fae9-452e-aca2-b724a381f939',
            'entity_type' => 'child',
            'title_t' => 'Jane Gordon',
        ],
    ],
]
This is then encoded to the following JSON object:
{
    "id": "b14ac9a0-e255-468b-a673-e125fd73d6f2",
    "entity_type": "parent",
    "title_t": "Andrea Cook",
    "children": [
        {
            "id": "ce10380c-8006-4945-9078-296116ad5ab7",
            "entity_type": "child",
            "title_t": "Jordan Gibson"
        },
        {
            "id": "0c191119-fae9-452e-aca2-b724a381f939",
            "entity_type": "child",
            "title_t": "Jane Gordon"
        }
    ]
}
This is in turn exactly what Solr queries return (plus the auto-generated values _root_, _nest_path_, etc.).
However, whenever a Parent has only one Child, the child ends up as a single object in Solr instead of an array containing a single object. I send the first document below, but Solr returns the second:
{
    "id": "b14ac9a0-e255-468b-a673-e125fd73d6f2",
    "entity_type": "parent",
    "title_t": "Andrea Cook",
    "children": [
        {
            "id": "ce10380c-8006-4945-9078-296116ad5ab7",
            "entity_type": "child",
            "title_t": "Jordan Gibson"
        }
    ]
}
{
    "id": "b14ac9a0-e255-468b-a673-e125fd73d6f2",
    "entity_type": "parent",
    "title_t": "Andrea Cook",
    "children": {
        "id": "ce10380c-8006-4945-9078-296116ad5ab7",
        "entity_type": "child",
        "title_t": "Jordan Gibson"
    }
}
I have made sure that the PHP array and the JSON object are correctly formed right up to the moment they are passed to the HTTP client.
I have made sure that the array keys of the children arrays are numeric and start at 0.
Is that the expected behaviour?
Do I have to create a <fieldType/> for the labelled relations (i.e. create a multivalued children field)? If yes, how would I do that? I haven't found any explanation yet.
What can I do to always get an array for children in my search results, so that I don't have to check the data before iterating over it?
It finally occurred to me that the array manipulation I had programmed in PHP (array_map, array_merge, ...) was creating arrays with unexpected indices, i.e. keys that were not a clean 0, 1, 2, ... sequence.
To fix that, I had to call array_values, which returns a pristine, zero-indexed array.
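To illustrate the effect of the indices, here is a minimal sketch. It assumes the gap in the keys comes from array_filter (which preserves keys); the helper name withListChildren, the filter condition, and the shortened ids are made up for the example and are not the original code. json_encode only emits a JSON array when the keys are exactly 0..n-1, otherwise it emits an object.

```php
<?php
// Hypothetical helper (not from the original code): reindex the 'children'
// list so that json_encode emits a JSON array instead of an object.
function withListChildren(array $doc): array
{
    if (isset($doc['children']) && is_array($doc['children'])) {
        $doc['children'] = array_values($doc['children']);
    }
    return $doc;
}

$children = [
    0 => ['id' => 'child-a', 'title_t' => 'Jordan Gibson'],
    1 => ['id' => 'child-b', 'title_t' => 'Jane Gordon'],
];

// array_filter keeps the original keys, so the remaining child sits at key 1.
$remaining = array_filter($children, function (array $child): bool {
    return $child['title_t'] !== 'Jordan Gibson';
});

$doc = ['id' => 'parent-1', 'children' => $remaining];

// Keys are not 0..n-1, so "children" is encoded as an object keyed by "1".
echo json_encode($doc), "\n";

// After array_values the keys are 0..n-1, so "children" is encoded as an array.
echo json_encode(withListChildren($doc)), "\n";
```

Calling array_values right before encoding (rather than after every individual array operation) keeps the fix in one place.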