csv solr lucene solrcloud solr6

Solr indexing using the CSV update UI: unique key is treated as a multi-valued field and the load errors out


I am trying to load a CSV file into a Solr 6.5 collection using the Solr Admin UI. Here are the steps I followed and the error I got:

  1. Created a data-driven managed-schema config set in ZooKeeper and changed the unique key from the default id to "MyId" (a string field):

<uniqueKey>MyId</uniqueKey>
        ...
<field name="MyId" type="string" indexed="true" stored="true" required="true" multiValued="false" />
  2. Created a collection and associated it with the config set mentioned above (using the new Admin UI).

  3. Loaded the CSV file using the Admin UI (Collections --> collection name drop-down --> Documents). I added &rowid=MyId as a request handler parameter. My CSV file has a MyId field in it. During the load I get this error:

    Document contains multiple values for uniqueKey field: MyId=[82552329, 1] at org.apache.solr.update.AddUpdateCommand.getHashableId(AddUpdateCommand.java:168)

  4. Without changing the unique key, just using the default id field (with an auto-generated UUID), the CSV loads fine. But I need the unique key to be MyId.
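
For reference, the load described above can probably be reproduced outside the Admin UI. Below is a minimal Python sketch that only builds the equivalent HTTP request and does not send it. The host, the collection name, and the sample CSV body (including the `name` column) are placeholders I made up for illustration, not values from my setup; only the MyId value and the rowid parameter come from the steps above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholders (assumptions, not taken from the original setup).
SOLR_URL = "http://localhost:8983/solr"
COLLECTION = "mycollection"


def build_csv_update_request(csv_bytes, extra_params):
    """Build (but do not send) a CSV update request for the collection."""
    params = {"commit": "true"}
    params.update(extra_params)
    url = f"{SOLR_URL}/{COLLECTION}/update?{urlencode(params)}"
    return Request(
        url,
        data=csv_bytes,
        headers={"Content-Type": "application/csv"},
        method="POST",
    )


# Made-up one-row CSV; rowid=MyId is the parameter used during the failing load.
sample_csv = b"MyId,name\n82552329,example\n"
req = build_csv_update_request(sample_csv, {"rowid": "MyId"})
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually submit it, which makes it easy to try different parameter combinations without clicking through the UI.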

I would like to know why my key field is reported as multi-valued. My CSV does not contain any multi-valued data; it has only simple comma-separated numeric and string fields. Please suggest what could have gone wrong.

Note: I have also applied the change described in "Solr Schemaless Mode creating fields as MultiValued in the schema". It does not help, as the problem is in the input data.

EDIT: Adding full exception trace

https://pastebin.com/raw/juRj7ZUi


Solution

  • I found a clue in the documentation for the CSV update parameters: the problem is caused by the parameter I was passing (&rowid=MyId). According to the documentation, this parameter adds the line number as a value of the named field, so that it can serve as the id. That explains why my key (MyId) became multi-valued ([my actual key, line number]). When I removed the parameter, the load failed with an error saying that id was not populated, because my schema still contains a required id field. So I added &literal.id=1, and now everything works fine. Thanks for helping out.
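
To make the mechanism concrete, here is a toy Python model of what seems to happen to each row. This is not Solr's actual loader code; the function name `load_csv_rows` and its arguments are hypothetical, and numbering rows from 1 is an assumption chosen to match the error message. It only illustrates that pointing rowid at a column that already exists in the CSV gives that field two values, while a literal for id leaves MyId single-valued.

```python
def load_csv_rows(rows, rowid_field=None, literals=None):
    """Toy model of a CSV loader: one dict of field -> list of values per row."""
    docs = []
    for line_no, row in enumerate(rows, start=1):  # start=1 is an assumption
        doc = {name: [value] for name, value in row.items()}
        if rowid_field is not None:
            # The row number is added as one more value of the named field.
            doc.setdefault(rowid_field, []).append(str(line_no))
        for name, value in (literals or {}).items():
            # A literal adds the same fixed value to every document.
            doc.setdefault(name, []).append(value)
        docs.append(doc)
    return docs


rows = [{"MyId": "82552329"}]

# rowid=MyId: the key field ends up with [actual key, line number].
with_rowid = load_csv_rows(rows, rowid_field="MyId")
print(with_rowid[0]["MyId"])

# literal.id=1 without rowid: MyId keeps a single value, id is filled in.
with_literal = load_csv_rows(rows, literals={"id": "1"})
print(with_literal[0])
```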