csv solr import-from-csv solr8

Solr Index Handlers with Multivalue Field


I want to import a CSV into Solr via Index Handlers, as described in the docs: https://solr.apache.org/guide/7_1/uploading-data-with-index-handlers.html#csv-update-parameters

I have a CSV with the following structure:

ID    |    Name    |    Property    |
1     |    Tee     |     Sweet      |
1     |    Tee     |     Fluid      |
1     |    Tee     |      Hot       |
2     |   Bread    |     Salty      |
3     |    Milk    |     Fluid      |

Rows with the same ID always have the same name; only the property varies. Now I want to import the property as a multivalued field into Solr.

Is there any way to achieve this with an Index Handler? If not, how else?


Solution

  • I would write a small program that scans the CSV data and produces JSON documents that you can ingest into Solr. The program has to read every row of the CSV file so that it can aggregate the properties of rows that share the same ID. You would end up with JSON like this:

    [
    {"id": 1, "name": "Tee", "properties": ["Sweet", "Fluid", "Hot"]},
    {"id": 2, "name": "Bread", "properties": ["Salty"]},
    {"id": 3, "name": "Milk", "properties": ["Fluid"]}
    ]
    

    You will want to use field names that match your schema or your dynamic field definitions so that they are indexed properly too.
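A minimal sketch of such a program in Python, in case it helps. It assumes the file is pipe-delimited exactly as shown in the question (including the trailing `|`), that the first non-empty line is the header, and that the target fields are called `id`, `name`, and `properties`; those field names and the `aggregate_rows` helper are illustrative, not something Solr prescribes. IDs are kept as strings here; convert them if your schema expects a numeric type.

```python
import csv
import io
import json

# Sample data copied from the question; in practice you would read the file instead.
SAMPLE = """ID    |    Name    |    Property    |
1     |    Tee     |     Sweet      |
1     |    Tee     |     Fluid      |
1     |    Tee     |      Hot       |
2     |   Bread    |     Salty      |
3     |    Milk    |     Fluid      |
"""


def aggregate_rows(csv_text, delimiter="|"):
    """Group rows by ID and collect each row's property into one list per ID."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    docs = {}  # dicts keep insertion order, so documents come out in first-seen order
    header_seen = False
    for row in reader:
        cells = [cell.strip() for cell in row]
        # The trailing "|" produces an empty last cell; drop it.
        while cells and cells[-1] == "":
            cells.pop()
        if len(cells) < 3:
            continue  # skip blank or incomplete lines
        if not header_seen:
            header_seen = True  # the first complete line is the header
            continue
        doc_id, name, prop = cells[:3]
        # Field names below are assumptions; rename them to match your schema.
        doc = docs.setdefault(doc_id, {"id": doc_id, "name": name, "properties": []})
        if prop not in doc["properties"]:
            doc["properties"].append(prop)
    return list(docs.values())


if __name__ == "__main__":
    print(json.dumps(aggregate_rows(SAMPLE), indent=2))
```

The printed JSON array can be saved to a file and sent to the collection's `/update` handler with `Content-Type: application/json`. The field that receives the list has to be defined as multivalued in the schema.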