solr dih

Solr's Data Import Handler tracks but ignores a nested entity's changes


I have two tables, and I'm trying to make the Data Import Handler update a document's index entry when its sub-entity changes. When I fire the "delta-import" command, I get the following:

{
  "responseHeader":{
    "status":0,
    "QTime":3},
  "initArgs":[
    "defaults",[
      "config","db-data-config.xml"]],
  "command":"delta-import",
  "status":"idle",
  "importResponse":"",
  "statusMessages":{
    "Total Requests made to DataSource":"5",
    "Total Rows Fetched":"3",
    "Total Documents Processed":"0",
    "Total Documents Skipped":"0",
    "Delta Dump started":"2021-08-16 11:05:47",
    "Identifying Delta":"2021-08-16 11:05:47",
    "Deltas Obtained":"2021-08-16 11:05:47",
    "Building documents":"2021-08-16 11:05:47",
    "Total Changed Documents":"0",
    "Time taken":"0:0:0.12"}}

My data config is this:

<dataConfig>
    <dataSource driver="org.mariadb.jdbc.Driver" url="jdbc:mysql://localhost:3306/eepyakm?user=root" user="root" password="root"/>
    <document>
        <entity name="supplier" query="select * from suppliers_tmp_view"
                deltaQuery="select id from suppliers_tmp_view where last_modified > '${dataimporter.last_index_time}'"
                deltaImportQuery="select * from suppliers_tmp_view where id='${dataimporter.delta.id}'">
                
                
            <entity name="attachment"  
                    query="select * from suppliers_tmp_files_view where supplier_tmp_id='${supplier.id}'"
                    deltaQuery="select id from suppliers_tmp_files_view where last_modified > '${dataimporter.last_index_time}'"
                    parentDeltaQuery="select id from suppliers_tmp_view where id='${attachment.supplier_tmp_id}'">
                <field name="path" column="path" />
            </entity>
            
        </entity>
    </document>
</dataConfig>

In my understanding, "Total Rows Fetched" shows that 3 entries in the sub-entity table have changed. So, why doesn't it index the changed field?

If I do a "full-import", it picks up the changes fine.


Solution

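  • Neither of your deltaQuery statements selects supplier_tmp_id, but your parentDeltaQuery still references it as ${attachment.supplier_tmp_id}. Because the column is missing from the delta rows, that variable resolves to nothing and no parent document is matched.

    Select this column as well in the attachment entity's deltaQuery.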
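
As a minimal sketch of that change, the attachment entity could look like the fragment below. It assumes that suppliers_tmp_files_view exposes a supplier_tmp_id column (the question's own query attribute already filters on it); everything else is copied unchanged from the config in the question, and the fragment has not been run against the asker's database.

```xml
<!-- Sketch only: the deltaQuery now returns supplier_tmp_id alongside id,
     so ${attachment.supplier_tmp_id} can be resolved in parentDeltaQuery. -->
<entity name="attachment"
        query="select * from suppliers_tmp_files_view where supplier_tmp_id='${supplier.id}'"
        deltaQuery="select id, supplier_tmp_id from suppliers_tmp_files_view where last_modified > '${dataimporter.last_index_time}'"
        parentDeltaQuery="select id from suppliers_tmp_view where id='${attachment.supplier_tmp_id}'">
    <field name="path" column="path" />
</entity>
```

After reloading the config and running "delta-import" again, "Total Changed Documents" should become non-zero if the parent rows are now being matched.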