I am trying to use Solr with DIH to index CSV files. I've patched my DIH library with the SOLR-2549 patch mentioned on the Solr wiki (see http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1) in order to import CSV files with LineEntityProcessor without using Transformers.
Unfortunately, I could not get my import to work, and I get the following stack trace:
INFO: [csv] webapp=/solr path=/dataimport params={command=full-import&optimize=false&clean=true&commit=true&verbose=true} status=0 QTime=33 {deleteByQuery=*:*} 0 33
7 nov. 2012 14:16:03 org.apache.solr.common.SolrException log
GRAVE: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:413)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:326)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:234)
... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:542)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:411)
... 5 more
Caused by: java.lang.NullPointerException
at org.apache.solr.handler.dataimport.LineEntityProcessor.initDelimitedOrFixedWidth(LineEntityProcessor.java:142)
at org.apache.solr.handler.dataimport.LineEntityProcessor.init(LineEntityProcessor.java:115)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:74)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:430)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:498)
... 6 more
I think it's related to my data configuration. This is my data-config.xml file:
<dataConfig>
  <dataSource name="dfs" type="FileDataSource"/>
  <document>
    <entity name="sourcefile"
            processor="FileListEntityProcessor"
            fileName="rocinter.csv"
            rootEntity="false"
            baseDir="/user/xxx/work/solr/example/example-DIH/solr/csv/inputfolder"
    >
      <entity name="entryline"
              processor="LineEntityProcessor"
              url="${sourcefile.fileAbsolutePath}"
              rootEntity="true"
              dataSource="fds"
              separator=","
      >
      </entity>
    </entity>
  </document>
</dataConfig>
Could anybody help me understand this issue, or provide a clear config file that uses the patched LineEntityProcessor to import CSV files?
I finally got an answer from the user mailing list. It was actually a bug in the patch.
A newer version of the patch is attached to the JIRA issue.
See: SOLR-2549
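
As a side note, the data-config.xml in the question declares its data source with name="dfs", while the inner entity refers to dataSource="fds". I have not verified whether this mismatch plays any role here (the NullPointerException itself came from the patch), but it is the kind of typo that is easy to miss. Below is a minimal sketch in Python that flags entities whose dataSource attribute does not match any declared dataSource name. It assumes the config is well-formed XML; the function name undeclared_data_sources is my own and is not part of Solr.

```python
import xml.etree.ElementTree as ET

# The data-config.xml from the question, copied as-is.
CONFIG = """
<dataConfig>
  <dataSource name="dfs" type="FileDataSource"/>
  <document>
    <entity name="sourcefile"
            processor="FileListEntityProcessor"
            fileName="rocinter.csv"
            rootEntity="false"
            baseDir="/user/xxx/work/solr/example/example-DIH/solr/csv/inputfolder">
      <entity name="entryline"
              processor="LineEntityProcessor"
              url="${sourcefile.fileAbsolutePath}"
              rootEntity="true"
              dataSource="fds"
              separator=",">
      </entity>
    </entity>
  </document>
</dataConfig>
"""


def undeclared_data_sources(config_xml):
    """Return (entity name, dataSource) pairs that reference an undeclared dataSource.

    Hypothetical helper for illustration only; it just compares attribute values
    and knows nothing about how DIH resolves data sources at runtime.
    """
    root = ET.fromstring(config_xml)
    # Names declared by <dataSource name="..."/> elements.
    declared = {ds.get("name") for ds in root.iter("dataSource")}
    return [
        (entity.get("name"), entity.get("dataSource"))
        for entity in root.iter("entity")
        if entity.get("dataSource") is not None
        and entity.get("dataSource") not in declared
    ]


if __name__ == "__main__":
    print(undeclared_data_sources(CONFIG))  # [('entryline', 'fds')]
```

Entities without a dataSource attribute (such as the FileListEntityProcessor entity above) are skipped, since they do not name a data source at all.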